2023-09-28 10:51:51,219 INFO [train.py:1107] (3/4) Training started 2023-09-28 10:51:51,219 INFO [train.py:1117] (3/4) Device: cuda:3 2023-09-28 10:51:51,227 INFO [train.py:1129] (3/4) {'best_train_loss': inf, 'best_valid_loss': inf, 'best_train_epoch': -1, 'best_valid_epoch': -1, 'batch_idx_train': 0, 'log_interval': 50, 'reset_interval': 200, 'valid_interval': 3000, 'feature_dim': 80, 'subsampling_factor': 4, 'warm_step': 2000, 'env_info': {'k2-version': '1.24.3', 'k2-build-type': 'Release', 'k2-with-cuda': True, 'k2-git-sha1': '821ebc378e7fb99b8adc81950227963332821e01', 'k2-git-date': 'Wed Jul 19 15:38:25 2023', 'lhotse-version': '1.16.0.dev+git.1db4d97a.clean', 'torch-version': '1.11.0+cu102', 'torch-cuda-available': True, 'torch-cuda-version': '10.2', 'python-version': '3.9', 'icefall-git-branch': 'dev/bilingual', 'icefall-git-sha1': '09ada8fb-dirty', 'icefall-git-date': 'Thu Sep 28 10:47:39 2023', 'icefall-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/icefall-1.0-py3.9.egg', 'k2-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/k2-1.24.3.dev20230721+cuda10.2.torch1.11.0-py3.9-linux-x86_64.egg/k2/__init__.py', 'lhotse-path': '/star-home/jinzengrui/lib/miniconda3/envs/dev39/lib/python3.9/site-packages/lhotse-1.16.0.dev0+git.1db4d97a.clean-py3.9.egg/lhotse/__init__.py', 'hostname': 'de-74279-k2-train-6-0423201309-7c68fd68fb-6cszs', 'IP address': '10.177.28.83'}, 'world_size': 4, 'master_port': 12354, 'tensorboard': True, 'num_epochs': 30, 'start_epoch': 1, 'start_batch': 0, 'exp_dir': PosixPath('zipformer/exp-w-tal-csasr'), 'bpe_model': 'data/lang_bbpe_2000/bbpe.model', 'base_lr': 0.045, 'lr_batches': 7500, 'lr_epochs': 3.5, 'ref_duration': 600, 'context_size': 2, 'prune_range': 5, 'lm_scale': 0.25, 'am_scale': 0.0, 'simple_loss_scale': 0.5, 'ctc_loss_scale': 0.2, 'seed': 42, 'print_diagnostics': False, 'inf_check': False, 'save_every_n': 4000, 'keep_last_k': 30, 'average_period': 200, 'use_fp16': True, 'use_tal_csasr': True, 'num_encoder_layers': '2,2,3,4,3,2', 'downsampling_factor': '1,2,4,8,4,2', 'feedforward_dim': '512,768,1024,1536,1024,768', 'num_heads': '4,4,4,8,4,4', 'encoder_dim': '192,256,384,512,384,256', 'query_head_dim': '32', 'value_head_dim': '12', 'pos_head_dim': '4', 'pos_dim': 48, 'encoder_unmasked_dim': '192,192,256,256,256,192', 'cnn_module_kernel': '31,31,15,15,15,31', 'decoder_dim': 512, 'joiner_dim': 512, 'causal': False, 'chunk_size': '16,32,64,-1', 'left_context_frames': '64,128,256,-1', 'use_transducer': True, 'use_ctc': False, 'manifest_dir': PosixPath('data/fbank'), 'max_duration': 1000, 'bucketing_sampler': True, 'num_buckets': 30, 'concatenate_cuts': False, 'duration_factor': 1.0, 'gap': 1.0, 'on_the_fly_feats': False, 'shuffle': True, 'drop_last': True, 'return_cuts': True, 'num_workers': 2, 'enable_spec_aug': True, 'spec_aug_time_warp_factor': 80, 'enable_musan': True, 'input_strategy': 'PrecomputedFeatures', 'blank_id': 0, 'vocab_size': 2000} 2023-09-28 10:51:51,228 INFO [train.py:1131] (3/4) About to create model 2023-09-28 10:51:52,066 INFO [train.py:1135] (3/4) Number of model parameters: 68625511 2023-09-28 10:51:58,833 INFO [train.py:1150] (3/4) Using DDP 2023-09-28 10:51:59,173 INFO [multi_dataset.py:39] (3/4) About to get multidataset train cuts 2023-09-28 10:51:59,173 INFO [multi_dataset.py:42] (3/4) Loading Aishell-2 in lazy mode 2023-09-28 10:51:59,249 INFO [multi_dataset.py:49] (3/4) Loading TAL-CSASR in lazy mode 2023-09-28 10:51:59,268 INFO [multi_dataset.py:142] (3/4) About to get train-clean-100 cuts 2023-09-28 10:51:59,294 INFO [multi_dataset.py:149] (3/4) About to get train-clean-360 cuts 2023-09-28 10:51:59,299 INFO [multi_dataset.py:156] (3/4) About to get train-other-500 cuts 2023-09-28 10:52:14,751 INFO [asr_datamodule.py:218] (3/4) Enable MUSAN 2023-09-28 10:52:14,751 INFO [asr_datamodule.py:219] (3/4) About to get Musan cuts 2023-09-28 10:52:17,909 INFO [asr_datamodule.py:243] (3/4) Enable SpecAugment 2023-09-28 10:52:17,910 INFO [asr_datamodule.py:244] (3/4) Time warp factor: 80 2023-09-28 10:52:17,910 INFO [asr_datamodule.py:254] (3/4) Num frame mask: 10 2023-09-28 10:52:17,910 INFO [asr_datamodule.py:267] (3/4) About to create train dataset 2023-09-28 10:52:17,910 INFO [asr_datamodule.py:294] (3/4) Using DynamicBucketingSampler. 2023-09-28 10:52:17,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:18,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:18,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:18,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:18,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:18,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:18,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:18,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:19,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:19,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:19,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:19,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:19,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:19,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:20,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:20,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:20,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:20,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:20,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:21,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:21,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:22,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:22,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:22,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:22,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:22,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:22,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:23,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:23,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:23,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:23,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:23,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:23,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:23,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:23,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:24,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:24,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:24,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:25,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:25,266 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:25,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:25,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:25,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:25,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:25,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:25,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:26,090 INFO [asr_datamodule.py:309] (3/4) About to create train dataloader 2023-09-28 10:52:26,091 INFO [multi_dataset.py:88] (3/4) About to get multidataset dev cuts 2023-09-28 10:52:26,091 INFO [multi_dataset.py:91] (3/4) Loading Aishell-2 DEV set in lazy mode 2023-09-28 10:52:26,109 INFO [multi_dataset.py:163] (3/4) About to get dev-clean cuts 2023-09-28 10:52:26,129 INFO [multi_dataset.py:170] (3/4) About to get dev-other cuts 2023-09-28 10:52:26,172 INFO [asr_datamodule.py:340] (3/4) About to create dev dataset 2023-09-28 10:52:26,942 INFO [asr_datamodule.py:357] (3/4) About to create dev dataloader 2023-09-28 10:52:26,942 INFO [train.py:1351] (3/4) Sanity check -- see if any of the batches in epoch 1 would cause OOM. 2023-09-28 10:52:26,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:26,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:27,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:27,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:27,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:27,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:27,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:27,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:27,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:27,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:28,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:28,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:28,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:28,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:29,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:29,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:29,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:29,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:29,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:30,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:30,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:30,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:30,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:30,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:30,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:30,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:31,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:31,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:31,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:31,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:32,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:32,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:32,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:32,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:32,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:33,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:33,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:33,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:33,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:34,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:34,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:34,489 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:34,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:34,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:34,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:34,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:34,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:35,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:35,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:52:35,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 10:52:35,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 10:52:35,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:35,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:36,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:36,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:36,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:36,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:36,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:36,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:52:37,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:37,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 10:52:37,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 10:52:37,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 10:52:37,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:52:37,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 10:52:37,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 10:52:37,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:38,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:38,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:38,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:38,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:39,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:39,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:39,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:39,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:39,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:52:39,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:52:39,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:39,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:52:40,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 10:52:40,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:52:41,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:52:41,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 10:52:41,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 10:52:41,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:52:41,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:41,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 10:52:41,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 10:52:41,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:42,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:52:42,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:52:42,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 10:52:42,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 10:52:42,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:52:43,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:43,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 10:52:43,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 10:52:44,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 10:52:44,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:52:45,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 10:52:45,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:45,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:52:46,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:46,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:52:46,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:46,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 10:52:46,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 10:52:47,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:47,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:52:47,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:48,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 10:52:48,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:49,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:52:49,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:49,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 10:52:49,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:52:49,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:52:50,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:51,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:52:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:52,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 10:52:52,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 10:52:53,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:53,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:53,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:52:53,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:53,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 10:52:53,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:53,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:52:54,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:52:54,728 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 10:52:54,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:52:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:55,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:52:55,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 10:52:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:52:55,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:52:56,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:56,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:52:56,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:52:57,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 10:52:57,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:58,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:52:58,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 10:52:58,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 10:52:59,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:52:59,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:52:59,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:52:59,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:52:59,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:52:59,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:52:59,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:00,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:00,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:00,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:53:00,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 10:53:01,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:53:01,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 10:53:01,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:01,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 10:53:02,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:02,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:02,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:02,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:02,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:03,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 10:53:03,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 10:53:03,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:03,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:53:03,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:03,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:04,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 10:53:04,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 10:53:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 10:53:04,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:04,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:05,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 10:53:05,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 10:53:05,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:05,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:05,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:05,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:05,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:06,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:06,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:06,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 10:53:06,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:07,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:53:07,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:07,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:53:07,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:07,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:07,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 10:53:07,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:53:07,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:07,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:08,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:08,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 10:53:08,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:08,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:53:09,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:53:09,630 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 10:53:09,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 10:53:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:10,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:53:10,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:11,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:12,190 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 10:53:12,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:53:12,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:13,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:13,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:13,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:14,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:14,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:14,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:14,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:14,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:14,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:15,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 10:53:15,062 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 10:53:15,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:15,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:15,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:15,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:15,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 10:53:15,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:15,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:53:15,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:15,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:15,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:15,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:16,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:16,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:16,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:17,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:17,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:17,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:53:17,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:18,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 10:53:18,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 10:53:18,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 10:53:19,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:19,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:53:19,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:19,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:19,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:19,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:19,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:19,782 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 10:53:20,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:20,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:21,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:53:21,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 10:53:21,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:21,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:21,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:21,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:22,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:22,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:22,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:22,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 10:53:23,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:23,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:23,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:53:23,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:53:23,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:23,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 10:53:24,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:53:25,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:25,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:25,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:25,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 10:53:25,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:25,640 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 10:53:26,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:26,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:26,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:26,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 10:53:27,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:27,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:27,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 10:53:27,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:53:27,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:27,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:28,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:28,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:28,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:30,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:53:30,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:30,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:53:30,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:30,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:53:30,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:31,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:31,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:32,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:32,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:32,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 10:53:32,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 10:53:32,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:32,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:33,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:34,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:53:35,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:35,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 10:53:35,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:35,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:35,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:35,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:53:36,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 10:53:36,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:36,251 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 10:53:36,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:53:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:36,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:37,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:37,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:37,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:38,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:39,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:40,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:40,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:53:41,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:53:41,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:53:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:53:41,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:41,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:53:41,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:53:41,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:41,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:53:42,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 10:53:42,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:42,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:53:42,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:53:42,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:53:42,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:42,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:53:42,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:53:43,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:43,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:53:43,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:43,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:53:44,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:45,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:53:45,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:45,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:53:46,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 10:53:46,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:46,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:53:46,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 10:53:46,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:53:46,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:53:46,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 10:53:47,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:47,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:48,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:53:48,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 10:53:48,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:53:48,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:53:48,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 10:53:48,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:49,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:53:49,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:53:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 10:53:50,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 10:53:50,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:50,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:50,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:50,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 10:53:50,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:53:51,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:53:51,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:53:51,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:52,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:53:52,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 10:53:52,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:53:52,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:52,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 10:53:53,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:53,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:53:54,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:53:54,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 10:53:54,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:54,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:53:54,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:55,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:53:55,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 10:53:55,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:53:55,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:53:55,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 10:53:55,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:53:55,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:55,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:56,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:56,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:56,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:56,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 10:53:57,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:58,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:53:58,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:53:58,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:59,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 10:53:59,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:53:59,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 10:53:59,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:53:59,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 10:54:00,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:00,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 10:54:00,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:00,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:01,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:01,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:01,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:01,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:01,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:01,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:01,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:01,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:02,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:02,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:54:02,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:02,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 10:54:03,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:04,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:04,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:04,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 10:54:04,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:05,303 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 10:54:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 10:54:05,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:05,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:54:05,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 10:54:06,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:06,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:54:06,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:06,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:06,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:07,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:07,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:07,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:54:07,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 10:54:07,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:08,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:08,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:08,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:08,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:08,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:09,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 10:54:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 10:54:09,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:09,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 10:54:09,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:10,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:10,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 10:54:10,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:10,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:10,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:10,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:10,631 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 10:54:10,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 10:54:11,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:12,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:12,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 10:54:12,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 10:54:12,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:13,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:13,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 10:54:14,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:54:14,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 10:54:14,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:14,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:15,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 10:54:15,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:15,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:54:15,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:16,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:16,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 10:54:16,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:54:16,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 10:54:17,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:54:17,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:17,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 10:54:17,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:17,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:17,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:17,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 10:54:18,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:18,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:18,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 10:54:18,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:54:19,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:54:19,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:54:20,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:20,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 10:54:21,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 10:54:21,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:21,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:22,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:22,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:22,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:22,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 10:54:23,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 10:54:23,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 10:54:23,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:23,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:23,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:54:23,820 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 10:54:23,850 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 10:54:23,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:24,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:24,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:54:24,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:54:24,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:54:24,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 10:54:25,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:26,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:54:26,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:54:26,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 10:54:26,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:54:26,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 10:54:27,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 10:54:27,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:54:27,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:28,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:28,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:54:28,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 10:54:28,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:28,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:54:28,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:29,028 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 10:54:29,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 10:54:29,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:29,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:54:30,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 10:54:30,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:54:30,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:30,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:30,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:32,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:32,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:54:32,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:54:32,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:32,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 10:54:32,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:54:33,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:33,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:54:33,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:54:33,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:33,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 10:54:34,001 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 10:54:34,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:34,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:34,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:34,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:34,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:54:35,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 10:54:35,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:54:35,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:36,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:36,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:37,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:37,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 10:54:37,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:37,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:37,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 10:54:38,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:54:38,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:39,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 10:54:39,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 10:54:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:39,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 10:54:39,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:54:39,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:39,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:40,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:40,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:54:40,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:40,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:40,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 10:54:40,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:41,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:41,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:41,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:41,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:42,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 10:54:42,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 10:54:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:54:43,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:43,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:43,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:54:43,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:43,938 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 10:54:44,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:44,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:54:44,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:54:44,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:54:44,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:54:44,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:45,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 10:54:45,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 10:54:45,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:45,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:54:46,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:46,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:46,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:54:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:54:46,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:47,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 10:54:47,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:54:47,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:47,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:54:47,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:47,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:54:47,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 10:54:48,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 10:54:49,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 10:54:49,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:54:49,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:54:49,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:50,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:50,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:54:50,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 10:54:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:54:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:51,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:51,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 10:54:51,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:54:53,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 10:54:53,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:54:53,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:53,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:54:54,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:54:54,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:54:54,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:55,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:54:55,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:54:55,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:54:55,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:54:56,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 10:54:57,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:54:57,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:54:57,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 10:54:57,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:54:58,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 10:54:58,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:54:58,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:54:59,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:54:59,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:55:00,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:00,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:00,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:01,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 10:55:01,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:01,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:55:01,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:02,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 10:55:02,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:02,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:03,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:03,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:55:03,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:03,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:03,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:04,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:04,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:55:04,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:55:04,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 10:55:04,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:04,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:05,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:05,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:05,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:06,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:55:06,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 10:55:06,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:06,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:55:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:06,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:06,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:55:06,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 10:55:06,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 10:55:07,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:07,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:07,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:07,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:08,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:08,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:08,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:08,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:08,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:55:08,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:09,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:09,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:09,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:09,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:10,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 10:55:10,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 10:55:10,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 10:55:11,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:11,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:11,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 10:55:12,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:13,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:13,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:13,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:55:13,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:13,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:14,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:55:14,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:14,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 10:55:14,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 10:55:15,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:55:15,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:15,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:55:16,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:16,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 10:55:16,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:16,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:16,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 10:55:17,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:17,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:17,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:55:18,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 10:55:19,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 10:55:19,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 10:55:19,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:20,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:20,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:20,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:20,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 10:55:21,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 10:55:21,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 10:55:21,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 10:55:21,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 10:55:21,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 10:55:21,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:21,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 10:55:21,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:21,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:22,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:22,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:22,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:55:22,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:22,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:22,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:55:23,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:23,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:23,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:23,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 10:55:23,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:55:23,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:24,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:24,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:55:24,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 10:55:24,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:24,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 10:55:24,878 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 10:55:24,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 10:55:24,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:25,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 10:55:25,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:55:26,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:26,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:26,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:27,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:27,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:27,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 10:55:27,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:27,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 10:55:27,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:28,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:28,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 10:55:28,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:28,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:29,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:55:29,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:29,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:29,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 10:55:29,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:30,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:30,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:30,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:30,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:30,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:55:31,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:31,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:32,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:32,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:32,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:32,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:33,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:33,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:33,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 10:55:34,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:34,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:34,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:34,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 10:55:34,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:34,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 10:55:35,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:35,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:35,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:35,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:36,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:36,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:36,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:36,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:36,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 10:55:37,078 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 10:55:37,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 10:55:37,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:55:37,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:37,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:37,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:38,008 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 10:55:38,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 10:55:38,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:55:38,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:55:38,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:55:40,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:40,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 10:55:40,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:55:40,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 10:55:41,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:41,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:41,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 10:55:41,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:55:41,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:42,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 10:55:42,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:42,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:55:42,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:42,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:42,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 10:55:43,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 10:55:43,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 10:55:43,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:55:43,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:43,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:43,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:43,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:55:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:44,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:44,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 10:55:44,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 10:55:45,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:55:45,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 10:55:45,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 10:55:46,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 10:55:46,994 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 10:55:47,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:47,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:47,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:55:47,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:47,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:47,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 10:55:47,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:55:47,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:48,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:55:48,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:55:48,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:55:48,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 10:55:48,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 10:55:49,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:55:49,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:49,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:55:49,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:49,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:49,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:55:50,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:55:50,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:55:50,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:50,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:55:51,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:55:51,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:55:51,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 10:55:51,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:51,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:55:51,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 10:55:52,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:55:53,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:53,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 10:55:54,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:55:54,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 10:55:54,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 10:55:54,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:55:54,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:54,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:55:54,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:55:55,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:55,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:55:56,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:55:56,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:55:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 10:55:57,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:55:57,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:55:57,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:55:57,844 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 10:55:57,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 10:55:58,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:55:58,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:55:58,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:55:59,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:55:59,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:00,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 10:56:00,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:00,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 10:56:00,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:00,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:01,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:01,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:01,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 10:56:01,643 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 10:56:01,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:56:01,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 10:56:02,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:02,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 10:56:03,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:03,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:03,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:03,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:56:04,058 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 10:56:04,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:04,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:04,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:04,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:04,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 10:56:04,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:56:05,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:05,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 10:56:05,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:05,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:05,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:05,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:06,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 10:56:06,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 10:56:07,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:07,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:08,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:08,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:08,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 10:56:08,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:56:08,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:56:08,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:08,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:08,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:56:09,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 10:56:09,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:09,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:10,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:10,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 10:56:10,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:10,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:10,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 10:56:10,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:11,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:11,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:11,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 10:56:11,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 10:56:12,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:12,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 10:56:13,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:13,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:14,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 10:56:14,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 10:56:14,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:14,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:14,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:15,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 10:56:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 10:56:15,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 10:56:15,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:16,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 10:56:16,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 10:56:16,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 10:56:16,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:16,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:17,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:17,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:18,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:18,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 10:56:18,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:18,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:56:18,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:18,402 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 10:56:18,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 10:56:18,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 10:56:19,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 10:56:20,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:20,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:20,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:56:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:21,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:21,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 10:56:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:21,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 10:56:21,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 10:56:21,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:21,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:22,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:22,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:22,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:23,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:23,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:23,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:56:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:23,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:24,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:56:24,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:56:24,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:24,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:24,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:56:24,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:56:25,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 10:56:25,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:25,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 10:56:25,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:25,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 10:56:25,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:56:27,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:27,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:27,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:27,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 10:56:27,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 10:56:27,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:56:27,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 10:56:28,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 10:56:28,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:28,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 10:56:29,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:56:29,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:29,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:29,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:56:30,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 10:56:30,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 10:56:30,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 10:56:30,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:30,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:56:30,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 10:56:31,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:31,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:56:31,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:31,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:31,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:32,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:32,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 10:56:32,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:56:32,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 10:56:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 10:56:33,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:33,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:34,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:34,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:56:35,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:56:35,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:35,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 10:56:35,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:35,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 10:56:36,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:36,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:56:36,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 10:56:36,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:56:36,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:36,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:56:37,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:56:37,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 10:56:37,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:38,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 10:56:38,408 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 10:56:38,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:38,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:38,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:38,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:39,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 10:56:39,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:39,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:56:39,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:39,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:39,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 10:56:40,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:56:40,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 10:56:41,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:56:41,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:56:42,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 10:56:42,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 10:56:42,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:42,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:42,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 10:56:42,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:56:43,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:43,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 10:56:43,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:56:43,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 10:56:43,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:43,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:43,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:44,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:44,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:56:44,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:44,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:44,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 10:56:45,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:45,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 10:56:45,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:45,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:46,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 10:56:47,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:47,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:47,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:47,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 10:56:47,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:56:47,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:48,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 10:56:48,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:48,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:56:49,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:50,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:56:50,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 10:56:50,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:50,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:51,723 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 10:56:51,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:52,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 10:56:53,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:54,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:56:54,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:56:54,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:56:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:55,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:56:55,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:56:55,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:55,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:55,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:56:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:56:56,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:56:56,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:56:56,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:56:56,484 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 10:56:56,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 10:56:57,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:56:57,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:56:57,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:56:58,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:56:58,073 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 10:56:58,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:56:59,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 10:56:59,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:56:59,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 10:56:59,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:56:59,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 10:57:00,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 10:57:00,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:01,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:01,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:01,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:57:01,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:01,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:01,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:01,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 10:57:01,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:02,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:02,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:57:02,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:02,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:02,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:03,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:57:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 10:57:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 10:57:04,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:04,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:04,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:05,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 10:57:05,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:05,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:05,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 10:57:06,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:06,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:07,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 10:57:07,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:07,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:07,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:57:08,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 10:57:08,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:08,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:08,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:09,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:09,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:09,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 10:57:10,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:57:10,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:10,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 10:57:10,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:10,521 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 10:57:10,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:10,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:11,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:11,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:11,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:11,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 10:57:11,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 10:57:11,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 10:57:12,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:12,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 10:57:12,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:12,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:12,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:13,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 10:57:13,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 10:57:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:57:13,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 10:57:13,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:14,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 10:57:14,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:14,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:14,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:57:15,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:57:15,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:15,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 10:57:16,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:16,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 10:57:16,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:16,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:17,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:57:17,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 10:57:17,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:17,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:18,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 10:57:18,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:18,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:19,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:19,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:19,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:19,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:57:19,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 10:57:19,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 10:57:21,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:21,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:57:21,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 10:57:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 10:57:21,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:21,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:21,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 10:57:22,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:22,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 10:57:22,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:22,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:22,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:23,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 10:57:23,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 10:57:23,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 10:57:23,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 10:57:24,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:25,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 10:57:25,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:26,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:26,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:26,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:27,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:27,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:57:28,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 10:57:28,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:28,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:57:28,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 10:57:28,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:29,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:29,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 10:57:29,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 10:57:29,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 10:57:29,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:29,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 10:57:31,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:32,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:32,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:32,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 10:57:32,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:32,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 10:57:32,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 10:57:33,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:34,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:34,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 10:57:34,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:57:35,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 10:57:35,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 10:57:36,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 10:57:36,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:36,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:36,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:37,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 10:57:37,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 10:57:38,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:57:39,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:39,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:39,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 10:57:39,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:57:40,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 10:57:41,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:41,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:42,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 10:57:42,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:42,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:57:42,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:57:42,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:43,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:57:43,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:57:43,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 10:57:43,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:43,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:45,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 10:57:45,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 10:57:45,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:57:45,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 10:57:46,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:46,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:57:46,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:57:47,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:47,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:47,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 10:57:48,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:48,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:57:48,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:48,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 10:57:48,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:57:48,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 10:57:48,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:49,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:49,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 10:57:49,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:49,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:57:49,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 10:57:49,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:49,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:57:49,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:50,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:50,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:57:50,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:57:50,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:57:50,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:51,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:57:51,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:57:51,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:51,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:51,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 10:57:52,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:57:52,697 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 10:57:52,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:53,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:57:53,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:53,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 10:57:54,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:54,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 10:57:54,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 10:57:55,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:55,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:55,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:57:55,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 10:57:56,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 10:57:56,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 10:57:56,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:57:56,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:57:57,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 10:57:57,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:57:57,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:57:57,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:57:58,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:58,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:57:58,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 10:57:58,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:57:58,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:57:58,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:57:58,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:57:58,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:57:59,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:59,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:57:59,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 10:57:59,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:57:59,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:00,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:00,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 10:58:01,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 10:58:01,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:01,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 10:58:02,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:58:02,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:02,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:02,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 10:58:02,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:02,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:03,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 10:58:03,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:03,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:58:03,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 10:58:04,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 10:58:04,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:58:05,165 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 10:58:05,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 10:58:05,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:05,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:05,739 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 10:58:05,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:58:06,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 10:58:06,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:06,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:06,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:06,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:07,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:07,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:58:07,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 10:58:08,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 10:58:08,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:08,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 10:58:08,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 10:58:08,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:08,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:08,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:08,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:09,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:09,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:09,435 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 10:58:09,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:09,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:58:09,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 10:58:09,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:58:09,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 10:58:10,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:10,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 10:58:10,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 10:58:10,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 10:58:10,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:10,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:11,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 10:58:11,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 10:58:12,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:12,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:12,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 10:58:12,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:13,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 10:58:13,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 10:58:14,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:15,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:15,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:15,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:15,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 10:58:15,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:15,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:15,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 10:58:16,047 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 10:58:16,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:17,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 10:58:17,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:17,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:18,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 10:58:18,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:18,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:18,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:18,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:18,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:19,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 10:58:19,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 10:58:19,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 10:58:19,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:20,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 10:58:20,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:20,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:20,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:22,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 10:58:22,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:22,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 10:58:22,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:22,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 10:58:23,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 10:58:24,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:24,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 10:58:24,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:24,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:24,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:24,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 10:58:25,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 10:58:25,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:25,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:25,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:26,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 10:58:26,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:58:26,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:58:26,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:58:27,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:27,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:27,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 10:58:27,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:27,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 10:58:29,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:29,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:29,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:29,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 10:58:29,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 10:58:29,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 10:58:29,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 10:58:30,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:30,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:30,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:30,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:58:30,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:30,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 10:58:31,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:31,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:31,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:31,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:58:31,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 10:58:31,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 10:58:32,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:58:32,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 10:58:33,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 10:58:33,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:33,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 10:58:34,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:34,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:34,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:34,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:34,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:58:34,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:35,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:35,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:36,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:36,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:36,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:36,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:58:36,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:58:36,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 10:58:37,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:37,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 10:58:37,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 10:58:37,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 10:58:37,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:37,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:58:37,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:37,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:37,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 10:58:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:38,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:38,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:38,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 10:58:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:39,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:58:39,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 10:58:39,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:39,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:39,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:39,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:58:39,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:58:39,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 10:58:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 10:58:41,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:41,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:58:43,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:58:43,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:43,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:58:43,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:43,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 10:58:43,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:58:43,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:44,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 10:58:44,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 10:58:44,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 10:58:44,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 10:58:44,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:45,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 10:58:45,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:58:46,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:46,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:46,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:58:46,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 10:58:46,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 10:58:46,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:58:47,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:47,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 10:58:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:47,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:47,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:47,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:47,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:47,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:58:47,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:47,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:58:47,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:48,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:48,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 10:58:49,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:49,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:49,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 10:58:50,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:50,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:51,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 10:58:51,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 10:58:51,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:58:51,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:58:51,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:52,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 10:58:52,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:52,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 10:58:52,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:52,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:58:53,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:58:53,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 10:58:53,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:58:53,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 10:58:54,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:58:55,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:58:55,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:56,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:56,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:58:56,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:58:56,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:57,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:57,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:58:57,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 10:58:57,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:58:57,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 10:58:57,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:58,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:58,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:58:58,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:58:58,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 10:58:58,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:58:59,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:58:59,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:58:59,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:00,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:00,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 10:59:00,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 10:59:00,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:00,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 10:59:00,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:00,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:00,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 10:59:01,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:01,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 10:59:01,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:02,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 10:59:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:03,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:03,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:03,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:03,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:04,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 10:59:04,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 10:59:04,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:04,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:05,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 10:59:05,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 10:59:05,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 10:59:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:05,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:05,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:05,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:06,609 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 10:59:06,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:06,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:07,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 10:59:07,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 10:59:07,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 10:59:07,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:07,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 10:59:08,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 10:59:09,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:09,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 10:59:09,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:09,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:09,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:10,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 10:59:10,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:10,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:10,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 10:59:10,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:11,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:11,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:11,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:11,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:11,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:11,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:11,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:12,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:13,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:13,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 10:59:13,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 10:59:13,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 10:59:14,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:14,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 10:59:14,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 10:59:15,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 10:59:15,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 10:59:16,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:17,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 10:59:17,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:17,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:17,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:18,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 10:59:18,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:18,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:18,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:18,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 10:59:18,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:19,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:19,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 10:59:19,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 10:59:19,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:20,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:20,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 10:59:20,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 10:59:20,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 10:59:20,504 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 10:59:20,628 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 10:59:20,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:20,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:20,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:20,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:21,089 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 10:59:21,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:21,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:21,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:21,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:21,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:21,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 10:59:21,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:21,954 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 10:59:21,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 10:59:22,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:22,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:23,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 10:59:23,795 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 10:59:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 10:59:24,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 10:59:24,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:24,226 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 10:59:24,296 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 10:59:24,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 10:59:25,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:25,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 10:59:25,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 10:59:26,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 10:59:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 10:59:27,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:27,418 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 10:59:27,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 10:59:27,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 10:59:27,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 10:59:27,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 10:59:28,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 10:59:28,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:28,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 10:59:29,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:29,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 10:59:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:31,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 10:59:31,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:31,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:31,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:31,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 10:59:31,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 10:59:31,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 10:59:31,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:31,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:32,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:32,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:32,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 10:59:32,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:32,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:33,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 10:59:33,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:33,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:33,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 10:59:33,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 10:59:33,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:33,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:34,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:34,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:34,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:34,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:34,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 10:59:34,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 10:59:34,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 10:59:34,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:35,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:35,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 10:59:35,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:35,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 10:59:35,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 10:59:35,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:35,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:37,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:37,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:37,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 10:59:38,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:38,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:38,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 10:59:38,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:39,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:39,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 10:59:39,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:40,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:40,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:40,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:41,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 10:59:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:41,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 10:59:41,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 10:59:41,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:41,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 10:59:41,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 10:59:42,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 10:59:42,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:42,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 10:59:43,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:44,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 10:59:44,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 10:59:44,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:44,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 10:59:44,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 10:59:44,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 10:59:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:45,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 10:59:45,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:45,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:45,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:45,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 10:59:45,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 10:59:46,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 10:59:46,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:46,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:46,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 10:59:46,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:46,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:47,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:47,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 10:59:47,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 10:59:47,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:47,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:48,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:48,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 10:59:48,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:48,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:48,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:48,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:48,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 10:59:48,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:49,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:50,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 10:59:50,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 10:59:50,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 10:59:50,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:51,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:51,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 10:59:51,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:52,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:52,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:52,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 10:59:52,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 10:59:52,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:52,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:52,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 10:59:53,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:53,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 10:59:53,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 10:59:54,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 10:59:54,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 10:59:54,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 10:59:54,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 10:59:54,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 10:59:54,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 10:59:55,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:55,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:56,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 10:59:56,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 10:59:57,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 10:59:57,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 10:59:57,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:57,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:57,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 10:59:58,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 10:59:58,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 10:59:58,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 10:59:58,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 10:59:58,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 10:59:59,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 10:59:59,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 10:59:59,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:00,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:00,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:00,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 11:00:01,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:01,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:01,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:02,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 11:00:03,278 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 11:00:03,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:03,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:03,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:03,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:03,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 11:00:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 11:00:03,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:00:04,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:04,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:04,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:04,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:04,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 11:00:05,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:00:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 11:00:05,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 11:00:05,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:05,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:05,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 11:00:05,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:00:06,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 11:00:06,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:00:06,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:06,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:07,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 11:00:07,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:07,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:00:07,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 11:00:07,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:07,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 11:00:07,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 11:00:07,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 11:00:08,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:08,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:08,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:08,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:00:09,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:09,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 11:00:09,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:09,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:09,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:10,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 11:00:10,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 11:00:10,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 11:00:10,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:00:11,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:11,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 11:00:11,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:12,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:00:12,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:12,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:12,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:00:12,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:12,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:12,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:12,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:12,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:13,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 11:00:13,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 11:00:13,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:13,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:13,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:13,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:14,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:00:14,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:14,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:15,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:15,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:15,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:15,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:15,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:15,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:00:16,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:16,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 11:00:17,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:17,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:17,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:17,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:17,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:17,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:18,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:18,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:18,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:18,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 11:00:18,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:00:18,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:18,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:18,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:19,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:19,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:19,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:19,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:19,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 11:00:19,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:00:20,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:20,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:20,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:20,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:00:20,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:20,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:20,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 11:00:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 11:00:20,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:00:21,025 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 11:00:21,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:21,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:21,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 11:00:21,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:21,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 11:00:21,392 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 11:00:21,392 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 11:00:21,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 11:00:21,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:21,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:21,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:21,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:22,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:00:22,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:23,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:23,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 11:00:24,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:25,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:25,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:25,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:25,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:00:25,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:25,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:25,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 11:00:26,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 11:00:26,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:00:27,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 11:00:27,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:28,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:28,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:00:28,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:00:28,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 11:00:29,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:00:29,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:29,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:00:30,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:31,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:31,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:31,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:32,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 11:00:32,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:32,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 11:00:32,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:32,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:00:32,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:33,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:00:33,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:33,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:33,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:33,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:00:33,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:34,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:00:34,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:00:34,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:35,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:35,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 11:00:35,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:00:35,797 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 11:00:35,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:00:36,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 11:00:36,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:36,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:00:36,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:36,659 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 11:00:36,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:37,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:37,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:00:38,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:00:39,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:39,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:39,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:00:39,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 11:00:39,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:39,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:39,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 11:00:40,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:40,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:40,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:00:40,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:41,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:00:41,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:00:41,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 11:00:41,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:41,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:00:42,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:42,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:42,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:42,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:43,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:43,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:43,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:44,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:00:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:00:44,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:00:44,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:00:45,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:46,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:00:46,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 11:00:46,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:46,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:00:46,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 11:00:47,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:00:47,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:47,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:47,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:00:48,245 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 11:00:48,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:49,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:00:49,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:00:49,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:49,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 11:00:49,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:00:50,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:50,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:00:50,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:00:50,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:00:51,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:52,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:00:52,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:00:52,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:00:53,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:00:53,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:00:53,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:00:53,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:00:53,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 11:00:54,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:00:54,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:54,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:54,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:00:54,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:54,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:00:54,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:00:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 11:00:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:00:54,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:54,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 11:00:55,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:00:56,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:00:56,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:00:56,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:00:56,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:00:56,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:00:56,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:00:57,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:00:57,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 11:00:57,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:00:57,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 11:00:59,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 11:00:59,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:00:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:59,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:00:59,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:00,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:00,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 11:01:00,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:01,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 11:01:01,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:01,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:01,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:02,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:01:02,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 11:01:02,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:01:02,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:02,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:03,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:01:03,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 11:01:03,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:03,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:03,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:04,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 11:01:04,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:04,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 11:01:04,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:01:05,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 11:01:06,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 11:01:06,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:06,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:01:06,343 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 11:01:06,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 11:01:06,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 11:01:07,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:07,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:08,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:08,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:08,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 11:01:08,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 11:01:09,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:01:09,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:09,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 11:01:09,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:09,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:09,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 11:01:10,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:10,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 11:01:11,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:12,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 11:01:12,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:13,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:13,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 11:01:13,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:01:14,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:14,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:15,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:15,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:01:15,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:01:15,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:15,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:15,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:15,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:01:15,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:16,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:01:16,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 11:01:16,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 11:01:16,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:16,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 11:01:16,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 11:01:16,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 11:01:16,799 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 11:01:16,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 11:01:17,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:17,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:17,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:17,431 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 11:01:17,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:17,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:01:18,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:01:18,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:19,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:19,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:19,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 11:01:19,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:20,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:20,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:20,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:20,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:01:20,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 11:01:21,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:21,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:01:21,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:21,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:01:21,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:22,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:22,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:22,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 11:01:22,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:01:23,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:23,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:23,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:23,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:01:23,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:24,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:24,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 11:01:24,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:01:24,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:25,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:25,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:26,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:01:26,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 11:01:26,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:26,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:26,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 11:01:26,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:26,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:01:27,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:01:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:27,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:01:28,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 11:01:28,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:01:29,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:30,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:01:30,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:30,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:30,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 11:01:31,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:01:31,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:31,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:01:31,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:01:31,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 11:01:31,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:32,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 11:01:32,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:32,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 11:01:32,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:33,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:33,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:33,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:01:33,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 11:01:33,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:34,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:34,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:35,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:35,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:36,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:01:36,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 11:01:36,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:36,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:36,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:01:36,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:01:36,978 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 11:01:36,979 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 11:01:36,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 11:01:37,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:37,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 11:01:37,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 11:01:37,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:01:37,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 11:01:38,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 11:01:39,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:39,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:39,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:39,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:39,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 11:01:40,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:40,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 11:01:40,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:01:40,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:41,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:41,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:01:41,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:41,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:41,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:41,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:01:41,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 11:01:41,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:01:41,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:41,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 11:01:43,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:01:44,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:44,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:44,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:01:44,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:01:45,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:01:46,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:01:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:01:46,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:01:46,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:01:46,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:01:46,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:46,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:47,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:47,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 11:01:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:01:47,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:47,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:01:47,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:01:47,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:48,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:48,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:49,333 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 11:01:49,708 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 11:01:49,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:01:49,788 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 11:01:49,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 11:01:49,953 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 11:01:50,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:50,339 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 11:01:50,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 11:01:50,713 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 11:01:50,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:01:51,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 11:01:51,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 11:01:51,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:01:51,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 11:01:51,913 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 11:01:51,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 11:01:53,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:53,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:53,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:53,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 11:01:53,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:01:54,510 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 11:01:55,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:55,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:55,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 11:01:55,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:55,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:01:55,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 11:01:56,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:01:56,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:01:56,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:56,808 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 11:01:56,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:01:56,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:01:57,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:01:57,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:01:57,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 11:01:57,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:01:58,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:01:58,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:01:58,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 11:01:58,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:01:59,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 11:02:00,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:00,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:02:00,656 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 11:02:00,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:00,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:01,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:02:01,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:01,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:01,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 11:02:01,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:01,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:02,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 11:02:02,481 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 11:02:02,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:03,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 11:02:03,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:03,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 11:02:03,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:02:03,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:04,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 11:02:04,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 11:02:04,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:05,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 11:02:05,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:05,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:05,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:05,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:05,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:06,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:06,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:06,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:07,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:07,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:07,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:07,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:07,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:07,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:07,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:02:08,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:08,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:08,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:08,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 11:02:08,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:09,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:09,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:09,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:09,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:02:09,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:09,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:09,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 11:02:10,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:10,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:02:10,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:10,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:10,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:11,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:11,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:11,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:02:11,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:02:11,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 11:02:11,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:11,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:11,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:11,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:12,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:02:12,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 11:02:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:13,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:02:13,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:14,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:14,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:14,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:14,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:15,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:15,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:15,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:15,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:15,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:02:16,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:17,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:02:17,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:17,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:18,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:18,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:18,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:18,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:18,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:18,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:02:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:19,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:20,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 11:02:20,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:20,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:20,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 11:02:20,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 11:02:20,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:21,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:21,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:21,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:21,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:02:21,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:22,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:22,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:02:22,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:22,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:22,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 11:02:22,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:22,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:22,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 11:02:23,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:23,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:23,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:02:23,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:23,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:23,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:23,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:24,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:02:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:02:24,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:02:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:24,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:25,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:26,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:02:26,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:27,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:27,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:27,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:02:28,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:02:28,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:02:28,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 11:02:29,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:29,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 11:02:30,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:02:30,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:30,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 11:02:30,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:31,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:02:31,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 11:02:31,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:31,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:02:31,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:31,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:31,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 11:02:31,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:31,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:32,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:32,280 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 11:02:32,281 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 11:02:32,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:33,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:33,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:33,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:34,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 11:02:35,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:02:35,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 11:02:35,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:35,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:35,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:36,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:36,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:36,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:36,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:37,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:02:37,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:38,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:38,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:38,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:38,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:38,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 11:02:39,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:39,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:39,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:02:39,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:39,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:40,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:40,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:02:40,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:40,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:02:40,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:02:40,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:02:41,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:41,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 11:02:41,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:41,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:42,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:42,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 11:02:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:42,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:02:42,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:02:42,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 11:02:43,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:02:43,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:02:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:43,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:44,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:44,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:44,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:02:44,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:44,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:45,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:45,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 11:02:46,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 11:02:46,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:46,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 11:02:46,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:46,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 11:02:46,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 11:02:46,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:49,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:49,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:49,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:02:49,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:02:49,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:02:49,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:02:49,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:02:49,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 11:02:50,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:02:50,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:50,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:50,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:50,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:51,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:51,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:02:51,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:51,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:51,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:52,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:02:52,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:02:52,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 11:02:52,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 11:02:53,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:02:53,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:53,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 11:02:53,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:02:53,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:53,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:53,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:02:53,586 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 11:02:53,651 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 11:02:53,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:53,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:54,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:54,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:02:54,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:02:54,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 11:02:55,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:55,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 11:02:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 11:02:56,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:02:56,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:02:56,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:56,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:02:57,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:02:57,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:02:57,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:02:57,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 11:02:57,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:02:57,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:58,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 11:02:58,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 11:02:58,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:02:58,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 11:02:58,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:02:58,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:02:58,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:02:59,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:02:59,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:02:59,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:00,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:00,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 11:03:00,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 11:03:00,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:01,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:01,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 11:03:01,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:03:02,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:03,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:03:03,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:03:04,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 11:03:04,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:04,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 11:03:04,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:04,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:05,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:05,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 11:03:05,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:05,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:06,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:06,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 11:03:06,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 11:03:06,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:03:06,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:07,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:07,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:07,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:07,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:08,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:08,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:08,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:08,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:08,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 11:03:10,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 11:03:10,225 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 11:03:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:10,567 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 11:03:10,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 11:03:10,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:10,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:03:10,876 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 11:03:10,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:03:11,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 11:03:11,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:11,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:03:11,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:11,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:03:12,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:12,065 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 11:03:12,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 11:03:12,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:13,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:13,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 11:03:13,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:13,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 11:03:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:13,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:13,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:14,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:14,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:03:14,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:14,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:14,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:14,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:03:14,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:14,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:15,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:15,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 11:03:15,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:15,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:15,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:03:16,727 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 11:03:16,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 11:03:17,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:17,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:17,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 11:03:17,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:18,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:03:19,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:20,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 11:03:20,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:03:20,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:20,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:20,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:20,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:20,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 11:03:21,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 11:03:21,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:21,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:03:21,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:21,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:21,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:21,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:22,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:03:22,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:22,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:03:23,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:23,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 11:03:23,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:03:23,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:23,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:03:24,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:24,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:24,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:03:24,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 11:03:24,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:24,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 11:03:24,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:03:24,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 11:03:25,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:25,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:03:25,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 11:03:25,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 11:03:25,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:03:25,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:25,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:25,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:03:25,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:26,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:26,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 11:03:26,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:26,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:26,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:03:27,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:27,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 11:03:28,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 11:03:28,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 11:03:28,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:29,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:03:30,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:30,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:30,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:30,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:30,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:03:30,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:30,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:30,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:30,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:31,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:31,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 11:03:31,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:31,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:03:32,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:32,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:03:32,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:32,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:32,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:33,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:33,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:33,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:33,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:34,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:03:34,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:03:34,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 11:03:34,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:03:34,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:34,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 11:03:34,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:35,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:36,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:03:36,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:03:37,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 11:03:37,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 11:03:37,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 11:03:37,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:03:38,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:38,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:38,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:03:38,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:39,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 11:03:40,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:40,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:40,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:40,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:03:40,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:40,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 11:03:40,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:03:40,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:41,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 11:03:41,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:41,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:03:41,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 11:03:41,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 11:03:41,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:42,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:42,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:42,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:43,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:43,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:03:43,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:43,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:43,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:43,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:03:44,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:44,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 11:03:44,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:03:44,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 11:03:44,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:44,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:44,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 11:03:46,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 11:03:46,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:46,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:46,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:46,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:03:46,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 11:03:47,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:47,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:03:47,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 11:03:47,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:47,771 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 11:03:48,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 11:03:48,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:48,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 11:03:48,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:03:48,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 11:03:48,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 11:03:48,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 11:03:48,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:48,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:49,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:03:49,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 11:03:50,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:50,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:50,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:50,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:03:50,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 11:03:51,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:03:51,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:03:51,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:03:51,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 11:03:51,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 11:03:51,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:03:52,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:03:52,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:03:52,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:52,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:03:52,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:03:52,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:03:52,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 11:03:52,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:03:52,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:52,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:03:52,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:03:52,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 11:03:53,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:53,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 11:03:53,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:53,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 11:03:53,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 11:03:53,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:03:53,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:54,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 11:03:54,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:03:54,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:03:54,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:03:54,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:54,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:03:55,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:03:55,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:56,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 11:03:57,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:03:57,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:03:57,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:03:57,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:03:57,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 11:03:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:03:58,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:03:59,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:00,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:01,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 11:04:01,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:01,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 11:04:02,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:03,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:03,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:03,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:03,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 11:04:03,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:04:04,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 11:04:04,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 11:04:04,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:05,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:05,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:04:05,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 11:04:05,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:04:06,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:06,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 11:04:06,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 11:04:06,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 11:04:06,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 11:04:07,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:07,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:07,440 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 11:04:07,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:07,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:07,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 11:04:08,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:04:08,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:10,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:10,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 11:04:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:10,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:10,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:10,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:04:10,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:04:10,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:10,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:10,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:11,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:11,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:04:11,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:11,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:12,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:12,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:12,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:12,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 11:04:12,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:12,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:04:13,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:13,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:04:13,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:14,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 11:04:14,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:14,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:04:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:15,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 11:04:15,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 11:04:15,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:15,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:15,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:15,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:04:16,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:16,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:16,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:17,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 11:04:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:17,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:04:17,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 11:04:17,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:17,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 11:04:18,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 11:04:18,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 11:04:18,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:18,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:18,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:18,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:19,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:04:19,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:04:19,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:19,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:20,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 11:04:20,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:20,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:20,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:20,731 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 11:04:20,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:20,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:21,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:04:21,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 11:04:21,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:21,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:04:21,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:21,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 11:04:22,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:04:22,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:22,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:04:22,567 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 11:04:23,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 11:04:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:23,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 11:04:24,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:24,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:04:24,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:24,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:24,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:24,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:24,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:25,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:04:25,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:25,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:25,500 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 11:04:25,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 11:04:25,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:04:25,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:25,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:26,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:26,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:26,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:26,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:26,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:04:26,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:26,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:04:27,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 11:04:27,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:27,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:27,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:04:27,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:28,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:04:28,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:28,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:28,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:28,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:04:29,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:29,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:30,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:30,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:04:30,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 11:04:30,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 11:04:30,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:31,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 11:04:31,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 11:04:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:04:31,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:31,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:31,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 11:04:31,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:31,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:31,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:32,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:32,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:32,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:32,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:33,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:33,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:33,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:33,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:34,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:34,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:34,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:34,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 11:04:34,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:04:34,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 11:04:34,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:04:34,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 11:04:35,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:35,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:36,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:36,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 11:04:36,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:37,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:04:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:37,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:38,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 11:04:38,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:04:38,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:04:38,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:38,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 11:04:38,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:38,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 11:04:38,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:38,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:39,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:04:39,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:04:39,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 11:04:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 11:04:39,791 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 11:04:39,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:40,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:04:40,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:04:40,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:40,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:04:41,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:41,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 11:04:42,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:04:42,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:42,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:42,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:04:43,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:04:44,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 11:04:45,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:45,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:45,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 11:04:45,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:45,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:45,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:45,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:04:45,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:46,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:04:46,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:47,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:47,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 11:04:47,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:04:48,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 11:04:48,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 11:04:48,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:49,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:04:49,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 11:04:49,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:04:49,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:04:51,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:04:51,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:51,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:04:51,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:51,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:52,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 11:04:53,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 11:04:53,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:04:53,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:04:53,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:53,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 11:04:54,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:04:54,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:54,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:04:54,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:04:54,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:04:55,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 11:04:55,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:55,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:04:55,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:04:56,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 11:04:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:04:57,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:04:57,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:58,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:58,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:04:58,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:04:58,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:04:58,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:04:59,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:04:59,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:04:59,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 11:05:00,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:05:00,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:05:00,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:00,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 11:05:01,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:01,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:01,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:01,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:01,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:05:02,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:02,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:02,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 11:05:02,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:02,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:02,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:03,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:03,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 11:05:03,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:04,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:04,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:05:04,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:04,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:04,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:05,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 11:05:05,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 11:05:05,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 11:05:05,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:05,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:05,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:05,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:05:06,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:05:06,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:05:07,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 11:05:07,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 11:05:07,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:07,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:07,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:07,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:08,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 11:05:08,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:08,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:08,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 11:05:08,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 11:05:09,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:09,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:09,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:09,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:10,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:05:11,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:11,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:05:12,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:12,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:05:12,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:12,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:12,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:05:13,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:13,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:13,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:13,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:05:13,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:05:14,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:05:14,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:05:14,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:14,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:14,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:14,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 11:05:14,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:14,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:14,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 11:05:15,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:15,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:15,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:16,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 11:05:16,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:16,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 11:05:16,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:05:17,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:05:17,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:17,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 11:05:17,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:18,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 11:05:18,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:19,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:19,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 11:05:20,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 11:05:20,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:20,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:05:20,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:20,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:21,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:21,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:05:21,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:05:21,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:21,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 11:05:22,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:22,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:22,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:23,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:24,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:05:24,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:24,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 11:05:24,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:05:24,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:24,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:05:24,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:25,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:25,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 11:05:26,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:26,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:26,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 11:05:27,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:27,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:28,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:28,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:05:28,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:05:28,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 11:05:29,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 11:05:29,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 11:05:29,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:29,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:29,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 11:05:29,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:29,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:29,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:29,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 11:05:30,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 11:05:31,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:31,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 11:05:31,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 11:05:32,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:32,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 11:05:32,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 11:05:33,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:33,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:05:33,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:05:33,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:05:33,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:33,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 11:05:33,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:05:34,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:34,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 11:05:34,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:34,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:34,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:34,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:34,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 11:05:35,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 11:05:35,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:35,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 11:05:35,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:35,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:36,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:05:36,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:36,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:05:36,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:05:37,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:05:37,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:38,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:38,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:38,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:38,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:05:38,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:38,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:39,493 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 11:05:39,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:39,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:40,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:05:40,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:40,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:05:40,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:40,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 11:05:40,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:41,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:05:41,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:41,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:05:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:41,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 11:05:41,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:41,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:05:41,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:05:42,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:05:42,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:42,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:42,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:05:43,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:43,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:05:43,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:44,176 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 11:05:45,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:05:45,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:05:45,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:05:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 11:05:45,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:45,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:46,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 11:05:46,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:46,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:05:46,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:46,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:05:46,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:05:47,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:47,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 11:05:47,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:47,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 11:05:48,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:05:48,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:05:48,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:48,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 11:05:48,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:49,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:05:49,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:49,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:49,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:05:49,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 11:05:49,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 11:05:49,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:05:49,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:50,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:05:50,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:05:51,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:05:51,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:05:51,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:51,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 11:05:52,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:05:52,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:05:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 11:05:52,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:05:52,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:52,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:05:53,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:53,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:53,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:05:53,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:05:54,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:05:54,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:54,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 11:05:54,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:54,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:55,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:05:55,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 11:05:55,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 11:05:56,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:05:56,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:05:56,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:05:56,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:57,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:05:58,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 11:05:58,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:05:58,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:05:58,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:05:59,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:05:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:06:00,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:06:00,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:00,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:00,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:06:01,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:02,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:02,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:06:02,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 11:06:02,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:02,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:03,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:03,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:06:03,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:04,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:04,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:04,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:04,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:04,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 11:06:04,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:06:05,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:05,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:05,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:06:05,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:05,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:06:05,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:06,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:06,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:07,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:07,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 11:06:07,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:08,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:08,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:09,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 11:06:09,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 11:06:09,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:09,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:09,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:09,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 11:06:10,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 11:06:10,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 11:06:11,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:11,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:11,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:11,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:06:12,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:06:12,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 11:06:12,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:12,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:13,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:06:13,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:14,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:06:14,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 11:06:14,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:15,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:15,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:15,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:06:15,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:16,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:16,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:16,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:06:16,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:16,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:17,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 11:06:17,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 11:06:17,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:17,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:18,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:18,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:18,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 11:06:18,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 11:06:18,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:19,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 11:06:19,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:06:19,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:20,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:20,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:20,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 11:06:20,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 11:06:21,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:21,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:21,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:06:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:06:21,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:21,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:21,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:21,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 11:06:22,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:22,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 11:06:22,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:22,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:22,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:06:22,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:22,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:22,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:22,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:22,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:22,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 11:06:23,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:23,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:24,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:24,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:06:24,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:25,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:25,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 11:06:25,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:25,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 11:06:25,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:25,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 11:06:25,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 11:06:26,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:26,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:26,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:26,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:27,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:06:27,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:27,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:27,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:27,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:06:28,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:28,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:29,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:06:29,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:31,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:32,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:32,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 11:06:32,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 11:06:32,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:06:32,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 11:06:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:32,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 11:06:33,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:33,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 11:06:33,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:34,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:34,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:35,097 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 11:06:35,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:35,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 11:06:35,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 11:06:35,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:35,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:35,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:06:35,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:36,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 11:06:36,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:36,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:06:36,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:06:36,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:06:36,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:06:38,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:38,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:06:39,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 11:06:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 11:06:40,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 11:06:40,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:40,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:06:41,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:06:41,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:06:41,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:41,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:41,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 11:06:42,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:42,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:06:42,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 11:06:44,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:45,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:46,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:46,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:46,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:06:46,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 11:06:46,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:06:46,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 11:06:46,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:06:46,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 11:06:46,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:47,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:06:47,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:47,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:06:47,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:47,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:06:47,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:06:47,714 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 11:06:47,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:47,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:48,264 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 11:06:48,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:06:48,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:49,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 11:06:49,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:06:49,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:06:49,698 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 11:06:49,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:06:49,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 11:06:49,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:50,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:50,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:06:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:06:50,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:06:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:06:50,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 11:06:50,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:06:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 11:06:52,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:06:52,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 11:06:52,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:06:53,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:53,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:06:53,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:53,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:06:54,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:06:54,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 11:06:54,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:06:54,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:06:54,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:06:54,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:06:54,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:55,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:06:55,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:06:55,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 11:06:56,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:06:56,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:06:56,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:06:56,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:06:57,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:06:57,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 11:06:57,489 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 11:06:58,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:06:59,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 11:06:59,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:06:59,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:00,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:00,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:00,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:01,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 11:07:01,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:01,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:01,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 11:07:02,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:02,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 11:07:03,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:03,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:03,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 11:07:03,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 11:07:03,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:03,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:03,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:03,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:05,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 11:07:05,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 11:07:05,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 11:07:05,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 11:07:05,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:06,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:06,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:06,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:06,306 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 11:07:06,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:06,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:06,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:06,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:07,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:07:07,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:07,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:07,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 11:07:07,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:07,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:07,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:07,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:08,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 11:07:08,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:08,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 11:07:08,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:09,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 11:07:09,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:09,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:07:09,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:07:09,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 11:07:09,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:07:09,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:07:10,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 11:07:10,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:10,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:10,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:12,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:12,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:12,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:13,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:13,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:07:14,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:14,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:07:14,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:14,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:14,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:15,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 11:07:15,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:15,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 11:07:15,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 11:07:15,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 11:07:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:16,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:16,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:16,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:16,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:17,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:07:17,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:07:17,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:07:17,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:07:18,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:19,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:19,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 11:07:19,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 11:07:19,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:07:19,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 11:07:19,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:19,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:20,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:20,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:20,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 11:07:21,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:21,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:21,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 11:07:21,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:21,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 11:07:21,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:22,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:22,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 11:07:22,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:22,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:07:22,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:07:22,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 11:07:22,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:22,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:07:22,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:07:23,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 11:07:23,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:23,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 11:07:23,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:23,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:23,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 11:07:23,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:07:24,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:25,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 11:07:25,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:25,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:25,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:07:25,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:25,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:26,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 11:07:26,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 11:07:26,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:26,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:27,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:27,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:07:27,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:27,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:28,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 11:07:28,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:07:28,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:28,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:28,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:07:28,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:07:28,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 11:07:28,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:29,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:07:29,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:29,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:29,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:30,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:30,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 11:07:30,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:30,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:31,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:07:31,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:32,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:33,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 11:07:33,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:33,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:07:33,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:34,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 11:07:34,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:07:35,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:07:35,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:07:35,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:36,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:07:36,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 11:07:36,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:37,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:37,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:38,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:38,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:07:38,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:39,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:39,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:39,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:39,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:07:39,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:39,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 11:07:40,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 11:07:40,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:40,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:40,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:40,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:07:40,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:40,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:41,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:07:41,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:07:41,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:41,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:42,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 11:07:42,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:07:42,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 11:07:42,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:07:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:07:42,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:42,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:43,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 11:07:43,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:07:43,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:07:44,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:44,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:44,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:07:45,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:45,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:45,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:45,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 11:07:46,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:46,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:46,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:47,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:07:48,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:48,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 11:07:48,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:07:48,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:07:48,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:07:48,646 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 11:07:49,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:07:49,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:49,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 11:07:49,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:07:49,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 11:07:49,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:07:50,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:07:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:07:50,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:07:50,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:07:50,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:51,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:07:51,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 11:07:51,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 11:07:51,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:07:52,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:52,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:07:52,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:52,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:07:52,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 11:07:52,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 11:07:52,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 11:07:52,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 11:07:52,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 11:07:53,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:53,617 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 11:07:53,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:07:53,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:54,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:07:54,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 11:07:54,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:07:54,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:54,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:54,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:07:54,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:07:55,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:55,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:07:55,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:07:56,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 11:07:56,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:07:56,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:07:57,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:57,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:07:57,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:07:58,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:07:59,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:07:59,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:07:59,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:07:59,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:00,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:00,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:00,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 11:08:01,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:01,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:01,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:02,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 11:08:02,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:02,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:03,349 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 11:08:03,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:03,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:08:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 11:08:03,866 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 11:08:03,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:04,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:04,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:04,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:04,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 11:08:04,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:04,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:04,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 11:08:04,925 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 11:08:04,932 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 11:08:04,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 11:08:05,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:05,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:06,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:06,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:06,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 11:08:06,670 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 11:08:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:07,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:07,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:07,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 11:08:07,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 11:08:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 11:08:07,896 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 11:08:08,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:08:08,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:08,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 11:08:08,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:08,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 11:08:09,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:09,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 11:08:09,236 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 11:08:09,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 11:08:09,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 11:08:09,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 11:08:09,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:09,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:10,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:10,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:10,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 11:08:10,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 11:08:10,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:10,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:08:10,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:10,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:11,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:11,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 11:08:11,182 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 11:08:11,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:12,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:13,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 11:08:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:14,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:14,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:08:14,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 11:08:14,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:08:14,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:14,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:08:14,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:15,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 11:08:15,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 11:08:15,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 11:08:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:15,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 11:08:15,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:08:16,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:16,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 11:08:16,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:17,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:17,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:08:17,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:17,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:19,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:19,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:19,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:19,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:19,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 11:08:19,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:19,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:20,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:20,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:08:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:08:21,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:21,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:21,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:21,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:21,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:22,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:08:22,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 11:08:22,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 11:08:22,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:08:22,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:22,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 11:08:23,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:08:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:23,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 11:08:23,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:23,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:23,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:08:23,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:24,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:08:24,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 11:08:24,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:08:24,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:25,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:25,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:25,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:08:26,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:08:26,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 11:08:26,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:27,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:27,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 11:08:27,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 11:08:27,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:28,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:28,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:28,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:08:28,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:28,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:28,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:30,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:30,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:30,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:30,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:08:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:08:32,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:32,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:08:33,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:33,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:08:33,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 11:08:33,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:33,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:34,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:34,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:34,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:34,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 11:08:34,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:08:34,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:35,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:08:35,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:08:35,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:35,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:08:35,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:36,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 11:08:36,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 11:08:36,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 11:08:36,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 11:08:37,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 11:08:37,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:37,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:37,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:38,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:39,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:39,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:39,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:39,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:39,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:39,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:39,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:08:40,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:40,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 11:08:40,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 11:08:40,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:08:41,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 11:08:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 11:08:41,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:42,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 11:08:42,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:08:42,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:42,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:42,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:08:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 11:08:43,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:43,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:43,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:08:43,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:43,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 11:08:44,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 11:08:44,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:08:44,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 11:08:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 11:08:44,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:08:44,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:44,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:44,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:45,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:08:45,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:08:45,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 11:08:46,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:46,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:08:46,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 11:08:46,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:08:46,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 11:08:46,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:08:46,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:47,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:47,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:08:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:47,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:08:47,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:48,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:48,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:08:48,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:08:48,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:48,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 11:08:48,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:08:49,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:08:49,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:49,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:50,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 11:08:50,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:51,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:51,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:08:51,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:52,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 11:08:52,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:08:53,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:53,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:08:53,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:08:54,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 11:08:54,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:08:55,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:55,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:08:55,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:55,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 11:08:55,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:55,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 11:08:55,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:55,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:56,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:08:56,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:08:56,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:56,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 11:08:56,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 11:08:56,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 11:08:56,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:56,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:56,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:56,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:08:58,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:08:58,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:08:59,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:59,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:08:59,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:08:59,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:08:59,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:08:59,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 11:09:00,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:00,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 11:09:00,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:00,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 11:09:00,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:00,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 11:09:00,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 11:09:00,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:01,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:09:01,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:09:01,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:01,708 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 11:09:02,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:02,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 11:09:02,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:02,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:02,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 11:09:02,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:03,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:03,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:04,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:04,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:04,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:04,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:05,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 11:09:05,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 11:09:06,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 11:09:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 11:09:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:07,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:07,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 11:09:07,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:07,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:07,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:07,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:09:07,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:09:07,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:08,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:08,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:08,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:08,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:09:08,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:09:08,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:09:08,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 11:09:09,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:10,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:10,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:10,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:10,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:09:11,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 11:09:11,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 11:09:11,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:11,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:11,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:12,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:13,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:13,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 11:09:13,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:14,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 11:09:14,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:09:15,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:15,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 11:09:15,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:16,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:16,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 11:09:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:16,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:16,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:17,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:17,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 11:09:17,191 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 11:09:17,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:17,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:17,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:17,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 11:09:17,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:18,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 11:09:18,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:18,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:20,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:20,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:20,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 11:09:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:20,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 11:09:21,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:21,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:21,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 11:09:22,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:09:22,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 11:09:22,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:23,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:23,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:23,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:23,230 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 11:09:23,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 11:09:23,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 11:09:24,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:24,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:24,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:24,816 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 11:09:24,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:25,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:09:25,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:09:25,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 11:09:25,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 11:09:25,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:26,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:26,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:09:26,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 11:09:27,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 11:09:27,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:27,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:27,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 11:09:27,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:28,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:28,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:09:28,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:28,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:09:28,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:28,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 11:09:29,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 11:09:29,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 11:09:29,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:09:29,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:29,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 11:09:30,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:30,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:31,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:09:31,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:31,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:31,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 11:09:31,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:31,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:09:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:09:32,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:32,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:32,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:09:33,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:33,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:09:33,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:33,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:33,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:34,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:34,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:09:34,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:34,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:09:34,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:35,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 11:09:35,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 11:09:35,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:35,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:35,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:35,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:35,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:35,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:36,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:36,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:36,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:37,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 11:09:37,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:37,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:37,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:37,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:38,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:38,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:09:38,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:38,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:09:38,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:09:38,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:09:38,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:38,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:38,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:39,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 11:09:39,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 11:09:39,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:39,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:40,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:40,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:41,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 11:09:41,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:42,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:42,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:09:42,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:42,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:09:43,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:09:43,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:09:43,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 11:09:43,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:43,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:09:44,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:09:44,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:09:44,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 11:09:44,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:09:44,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:45,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:45,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 11:09:45,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 11:09:45,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:09:46,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:46,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 11:09:47,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:09:47,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:09:47,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:09:47,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 11:09:47,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:47,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 11:09:47,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:47,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:47,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 11:09:49,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:49,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:09:49,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:50,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 11:09:50,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:09:51,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:51,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:51,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:09:51,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 11:09:52,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 11:09:52,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 11:09:52,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 11:09:52,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:52,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:09:52,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:09:53,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:09:53,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 11:09:54,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 11:09:54,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:09:54,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:09:54,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:09:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:09:54,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:55,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:55,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 11:09:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:09:55,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:09:55,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 11:09:55,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 11:09:56,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 11:09:56,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:09:56,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:09:56,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:09:56,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:09:56,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:09:56,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:09:56,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 11:09:57,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:09:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:09:57,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:09:57,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 11:09:58,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:09:58,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:09:58,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 11:09:59,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:09:59,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:09:59,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 11:09:59,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:09:59,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:09:59,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:00,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:00,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:00,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:00,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 11:10:01,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:01,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:01,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:10:01,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 11:10:01,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 11:10:02,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:02,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 11:10:02,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:02,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:10:02,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:02,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:02,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:03,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:03,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:03,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:03,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:10:04,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:04,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:04,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:04,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:05,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:05,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 11:10:05,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 11:10:05,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:05,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:05,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:06,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:06,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:06,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:10:06,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:07,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:07,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:07,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:07,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 11:10:08,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:08,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:10:08,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:09,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:09,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:09,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:09,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:10:09,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:09,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:09,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:10:09,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:09,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 11:10:10,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:11,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 11:10:11,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:10:11,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:11,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:12,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 11:10:12,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 11:10:12,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:12,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 11:10:12,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:10:12,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:10:12,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 11:10:13,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:13,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 11:10:13,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:13,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:14,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 11:10:14,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 11:10:14,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:14,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 11:10:14,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:14,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:14,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:14,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 11:10:14,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 11:10:14,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 11:10:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:15,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:15,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 11:10:15,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:15,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:15,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:10:15,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 11:10:16,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:16,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:16,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 11:10:17,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:10:17,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:17,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:17,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 11:10:17,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:17,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:17,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:18,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 11:10:18,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:18,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:10:18,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:19,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 11:10:19,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:21,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:21,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:21,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:10:21,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:21,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:10:21,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:10:22,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:22,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:22,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 11:10:22,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:10:23,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:23,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:23,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 11:10:23,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:23,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:23,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:10:24,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:10:24,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:25,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 11:10:25,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:10:25,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:25,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 11:10:25,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:25,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:25,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:25,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:25,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 11:10:26,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 11:10:26,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:26,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:26,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:26,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 11:10:27,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:28,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 11:10:28,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:28,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:10:28,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:28,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:28,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:29,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:29,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:29,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:29,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:29,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 11:10:29,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:10:29,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:10:30,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:30,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 11:10:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:30,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:30,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:30,513 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 11:10:30,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:30,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 11:10:30,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:31,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:31,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:31,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 11:10:31,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 11:10:31,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:31,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:32,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:10:32,392 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 11:10:32,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:10:33,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 11:10:33,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 11:10:33,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:33,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:33,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:33,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 11:10:33,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 11:10:34,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:10:35,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:10:35,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:35,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 11:10:35,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:36,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:36,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 11:10:36,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:36,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:36,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 11:10:36,957 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 11:10:37,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:37,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 11:10:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 11:10:37,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:38,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:38,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 11:10:38,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 11:10:38,904 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 11:10:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 11:10:39,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:39,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 11:10:39,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 11:10:40,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 11:10:40,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:10:41,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 11:10:41,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:10:41,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 11:10:42,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:42,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:42,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:10:42,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:10:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:10:42,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 11:10:42,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 11:10:42,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 11:10:42,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:10:42,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 11:10:43,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:43,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:10:43,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:43,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:44,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:10:44,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 11:10:44,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:44,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:10:44,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:10:44,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:44,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:44,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:10:44,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:10:44,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 11:10:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:10:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:45,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:10:45,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 11:10:45,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:10:46,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:10:46,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 11:10:47,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:48,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:48,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:49,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:49,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:49,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 11:10:50,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:10:50,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:10:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:10:50,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:50,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:50,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 11:10:51,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:51,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:10:52,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:10:52,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:10:52,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:10:52,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:10:52,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:10:52,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:52,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:10:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:10:53,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:53,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 11:10:54,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:10:54,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:54,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:54,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:10:54,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:55,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 11:10:55,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:10:55,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:55,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 11:10:56,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:10:56,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:10:56,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 11:10:56,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 11:10:56,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 11:10:56,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:10:56,920 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 11:10:56,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:10:57,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:57,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:10:57,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 11:10:57,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:10:57,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:58,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 11:10:58,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 11:10:58,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 11:10:58,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 11:10:58,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:10:59,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:10:59,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:10:59,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 11:10:59,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:10:59,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:10:59,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:10:59,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:11:00,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:00,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:11:01,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:11:01,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:01,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:02,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:02,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 11:11:02,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:11:02,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:11:02,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:02,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:11:02,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:11:03,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:11:03,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:03,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:11:03,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:11:04,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:11:04,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:11:04,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:11:05,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:11:05,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 11:11:05,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:11:05,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:11:09,126 INFO [scaling.py:1022] (3/4) Whitening: name=None, num_groups=8, num_channels=256, metric=9.96 vs. limit=3.0 2023-09-28 11:11:28,545 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:11:31,714 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:11:36,035 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:11:39,571 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:11:51,708 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:11:58,977 INFO [train.py:1379] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:12:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:12:16,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 11:12:16,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 11:12:16,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:17,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:17,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:17,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:12:17,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:17,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:12:18,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:12:18,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 11:12:18,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 11:12:18,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 11:12:18,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:12:18,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 11:12:19,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 11:12:19,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:19,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:19,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:19,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:20,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:12:20,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:20,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:12:20,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:20,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:12:20,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:12:20,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:12:20,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:20,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:12:22,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 11:12:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:12:22,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:12:22,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 11:12:22,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 11:12:22,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:12:23,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:23,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 11:12:23,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 11:12:23,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:23,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:12:24,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:12:24,419 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 11:12:24,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 11:12:24,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:12:24,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:12:24,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 11:12:24,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 11:12:25,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 11:12:26,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:12:30,489 INFO [train.py:1039] (3/4) Epoch 1, batch 0, loss[loss=9.359, simple_loss=8.502, pruned_loss=8.56, over 24262.00 frames. ], tot_loss[loss=9.359, simple_loss=8.502, pruned_loss=8.56, over 24262.00 frames. ], batch size: 56, lr: 2.25e-02, grad_scale: 1.0 2023-09-28 11:12:30,490 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 11:12:44,948 INFO [train.py:1071] (3/4) Epoch 1, validation: loss=9.318, simple_loss=8.466, pruned_loss=8.496, over 1125622.00 frames. 2023-09-28 11:12:44,950 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 20251MB 2023-09-28 11:12:47,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.31 vs. limit=7.5 2023-09-28 11:12:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 11:12:50,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:12:52,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:12:55,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=0.0, ans=0.5 2023-09-28 11:12:59,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:12:59,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:13:01,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 11:13:02,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 11:13:05,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=66.66666666666667, ans=5.041666666666667 2023-09-28 11:13:06,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:10,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:13:11,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:11,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:13:11,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:12,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.75 vs. limit=5.016666666666667 2023-09-28 11:13:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 11:13:14,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=20.10 vs. limit=7.55 2023-09-28 11:13:16,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=50.02 vs. limit=4.026666666666666 2023-09-28 11:13:17,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:13:26,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:13:26,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:13:28,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 11:13:33,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:13:33,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:13:36,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:38,731 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=483.51 vs. limit=7.575 2023-09-28 11:13:42,416 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=294.19 vs. limit=7.575 2023-09-28 11:13:43,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:13:43,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=506.28 vs. limit=7.65 2023-09-28 11:13:47,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:13:51,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=284.11 vs. limit=7.575 2023-09-28 11:13:54,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 11:13:57,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 11:13:57,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:13:57,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:13:59,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:14:00,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:02,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 11:14:03,264 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=63.07 vs. limit=7.7 2023-09-28 11:14:04,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:14:04,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=266.6666666666667, ans=0.8906666666666667 2023-09-28 11:14:07,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:14:11,230 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 11:14:15,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:14:16,933 INFO [train.py:1039] (3/4) Epoch 1, batch 50, loss[loss=1.253, simple_loss=1.115, pruned_loss=1.235, over 23364.00 frames. ], tot_loss[loss=3.846, simple_loss=3.537, pruned_loss=3.027, over 1080042.23 frames. ], batch size: 119, lr: 2.48e-02, grad_scale: 0.25 2023-09-28 11:14:19,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:21,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:14:21,841 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=33.48 vs. limit=7.625 2023-09-28 11:14:22,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 11:14:22,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:14:22,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:14:26,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:26,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:14:30,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=44.06 vs. limit=7.75 2023-09-28 11:14:30,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=19.75 vs. limit=7.75 2023-09-28 11:14:30,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=400.61 vs. limit=7.75 2023-09-28 11:14:31,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:14:35,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 11:14:35,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:36,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.min_positive, batch_count=400.0, ans=0.04875 2023-09-28 11:14:36,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=400.0, ans=0.0975 2023-09-28 11:14:44,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:14:44,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 11:14:46,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 11:14:50,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:14:51,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:14:51,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:51,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:14:52,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=466.6666666666667, ans=7.675 2023-09-28 11:14:53,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:14:53,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:14:53,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:14:56,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=207.94 vs. limit=5.233333333333333 2023-09-28 11:15:02,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:04,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:04,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:15:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 11:15:06,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:15:08,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:15:08,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 11:15:09,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:10,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 11:15:14,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=533.3333333333334, ans=0.475 2023-09-28 11:15:19,001 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=217.36 vs. limit=5.266666666666667 2023-09-28 11:15:19,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:15:19,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:15:21,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:23,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:15:23,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:25,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 11:15:25,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 11:15:27,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:15:27,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:15:32,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:15:32,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:15:32,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=600.0, ans=0.471875 2023-09-28 11:15:34,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 11:15:35,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 11:15:35,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=600.0, ans=0.879 2023-09-28 11:15:36,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 11:15:36,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:38,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:15:40,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 11:15:40,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 11:15:40,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:15:42,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:15:42,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:15:44,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:15:46,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=76.65 vs. limit=7.95 2023-09-28 11:15:47,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:15:49,921 INFO [train.py:1039] (3/4) Epoch 1, batch 100, loss[loss=1.17, simple_loss=1.019, pruned_loss=1.216, over 23443.00 frames. ], tot_loss[loss=2.422, simple_loss=2.198, pruned_loss=2.078, over 1891375.30 frames. ], batch size: 285, lr: 2.70e-02, grad_scale: 0.5 2023-09-28 11:15:51,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:15:55,754 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 2.173e+02 3.855e+02 5.319e+03 2.503e+05, threshold=7.710e+02, percent-clipped=0.0 2023-09-28 11:15:55,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:15:57,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 11:15:59,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:16:02,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:16:04,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:04,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:16:04,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:16:04,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:16:06,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 11:16:06,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=733.3333333333334, ans=0.7573333333333333 2023-09-28 11:16:08,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:16:08,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:08,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:08,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:16:14,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 11:16:16,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:18,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:16:18,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:16:20,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:16:21,439 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.38 vs. limit=3.11 2023-09-28 11:16:25,375 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 11:16:25,414 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 11:16:27,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:16:27,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:16:31,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:16:35,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:16:39,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:46,280 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 11:16:48,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.26 vs. limit=8.15 2023-09-28 11:16:49,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:16:54,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:16:54,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:16:56,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:16:57,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=217.46 vs. limit=7.825 2023-09-28 11:17:01,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:05,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:05,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=6.28 vs. limit=4.373333333333333 2023-09-28 11:17:09,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:17:11,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:11,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:12,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.03 vs. limit=8.2 2023-09-28 11:17:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:13,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:17:13,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:14,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 11:17:14,859 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 11:17:17,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:17,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:17:18,126 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.30 vs. limit=8.2 2023-09-28 11:17:19,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:19,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:19,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 11:17:19,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:17:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:17:20,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:20,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:22,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:22,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:17:22,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=1000.0, ans=0.09375 2023-09-28 11:17:24,263 INFO [train.py:1039] (3/4) Epoch 1, batch 150, loss[loss=1.063, simple_loss=0.9001, pruned_loss=1.174, over 24272.00 frames. ], tot_loss[loss=1.852, simple_loss=1.657, pruned_loss=1.698, over 2523603.44 frames. ], batch size: 77, lr: 2.93e-02, grad_scale: 0.5 2023-09-28 11:17:24,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:17:25,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.47 vs. limit=8.25 2023-09-28 11:17:27,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:17:29,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:17:29,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:17:29,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:29,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=1000.0, ans=0.26749999999999996 2023-09-28 11:17:30,118 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 11:17:32,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=19.46 vs. limit=8.25 2023-09-28 11:17:36,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:17:36,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:36,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1000.0, ans=0.29 2023-09-28 11:17:41,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:17:42,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:45,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=23.28 vs. limit=7.9 2023-09-28 11:17:47,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 11:17:47,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 11:17:47,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 11:17:50,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:17:50,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:17:54,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:17:54,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:17:55,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:17:55,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:17:57,922 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 11:18:01,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:07,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:11,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:18:11,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 11:18:15,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:18:15,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:18:15,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:17,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:18:17,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=1133.3333333333333, ans=5.708333333333333 2023-09-28 11:18:18,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:18:22,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:18:22,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:22,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 11:18:31,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:31,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:18:32,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=12.71 vs. limit=5.6 2023-09-28 11:18:33,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:18:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:18:37,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:18:38,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 11:18:39,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=1266.6666666666667, ans=0.07150000000000001 2023-09-28 11:18:40,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:18:43,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.43 vs. limit=5.316666666666666 2023-09-28 11:18:44,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:18:46,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:18:47,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:18:47,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 11:18:49,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=1266.6666666666667, ans=7.975 2023-09-28 11:18:50,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:18:50,204 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 11:18:51,043 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.27 vs. limit=8.45 2023-09-28 11:18:54,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:18:58,377 INFO [train.py:1039] (3/4) Epoch 1, batch 200, loss[loss=0.9731, simple_loss=0.8205, pruned_loss=1.018, over 24346.00 frames. ], tot_loss[loss=1.535, simple_loss=1.359, pruned_loss=1.451, over 3004748.83 frames. ], batch size: 74, lr: 3.15e-02, grad_scale: 1.0 2023-09-28 11:18:59,219 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=16.43 vs. limit=8.0 2023-09-28 11:19:00,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:19:01,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:19:03,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 9.506e+01 1.160e+02 1.347e+02 1.565e+02 3.276e+02, threshold=2.693e+02, percent-clipped=0.0 2023-09-28 11:19:05,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 11:19:05,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:05,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:05,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=1333.3333333333333, ans=0.4375 2023-09-28 11:19:08,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=1333.3333333333333, ans=0.4375 2023-09-28 11:19:09,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 11:19:11,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:19:12,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:13,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:15,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.82 vs. limit=8.55 2023-09-28 11:19:18,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:19:18,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:19:18,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:19,058 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=32.45 vs. limit=8.025 2023-09-28 11:19:19,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=25.47 vs. limit=8.55 2023-09-28 11:19:23,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.out_whiten.whitening_limit, batch_count=1400.0, ans=4.28 2023-09-28 11:19:26,632 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.18 vs. limit=8.55 2023-09-28 11:19:36,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=1466.6666666666667, ans=0.2853333333333333 2023-09-28 11:19:37,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=1466.6666666666667, ans=0.43125 2023-09-28 11:19:40,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.13 vs. limit=8.05 2023-09-28 11:19:46,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:19:46,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:19:48,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:19:48,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:19:50,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:19:50,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:19:52,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:19:52,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:19:52,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:19:52,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:19:53,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=24.99 vs. limit=8.075 2023-09-28 11:19:53,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=28.65 vs. limit=8.075 2023-09-28 11:19:54,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 11:19:55,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 11:19:55,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:19:56,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=51.92 vs. limit=8.075 2023-09-28 11:20:00,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:20:08,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.27 vs. limit=8.075 2023-09-28 11:20:09,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:20:10,209 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=10.43 vs. limit=5.383333333333333 2023-09-28 11:20:11,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.13 vs. limit=8.1 2023-09-28 11:20:13,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=37.07 vs. limit=8.7 2023-09-28 11:20:18,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:18,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:20:22,738 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=33.35 vs. limit=8.1 2023-09-28 11:20:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:29,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 11:20:29,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:29,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:20:29,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:20:29,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:20:30,796 INFO [train.py:1039] (3/4) Epoch 1, batch 250, loss[loss=0.8855, simple_loss=0.7433, pruned_loss=0.8871, over 24313.00 frames. ], tot_loss[loss=1.34, simple_loss=1.174, pruned_loss=1.284, over 3376522.87 frames. ], batch size: 61, lr: 3.38e-02, grad_scale: 1.0 2023-09-28 11:20:31,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 11:20:32,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:20:32,812 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 11:20:33,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:34,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=1666.6666666666667, ans=0.421875 2023-09-28 11:20:36,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:20:38,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:40,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:20:41,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:20:43,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:20:45,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:20:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:20:52,533 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=141.51 vs. limit=8.15 2023-09-28 11:20:59,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=32.01 vs. limit=8.8 2023-09-28 11:21:02,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:06,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:21:07,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:21:13,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.05 vs. limit=8.85 2023-09-28 11:21:15,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:21:15,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:21:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:21:17,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:17,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=1800.0, ans=0.282 2023-09-28 11:21:19,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:21:19,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:21:19,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:21:23,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:21:23,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=1800.0, ans=0.415625 2023-09-28 11:21:25,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=156.49 vs. limit=8.2 2023-09-28 11:21:26,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 11:21:26,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:21:27,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.71 vs. limit=4.746666666666667 2023-09-28 11:21:29,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:21:29,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:21:29,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:21:30,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:21:30,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=17.81 vs. limit=8.2 2023-09-28 11:21:32,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:21:32,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:21:34,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:35,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:21:36,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:36,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=7.61 vs. limit=4.746666666666667 2023-09-28 11:21:42,313 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=26.42 vs. limit=8.2 2023-09-28 11:21:43,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:21:44,303 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.57 vs. limit=5.483333333333333 2023-09-28 11:21:46,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:21:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:21:56,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:21:57,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:22:02,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=2000.0, ans=0.40625 2023-09-28 11:22:03,735 INFO [train.py:1039] (3/4) Epoch 1, batch 300, loss[loss=0.872, simple_loss=0.7295, pruned_loss=0.8381, over 23404.00 frames. ], tot_loss[loss=1.205, simple_loss=1.047, pruned_loss=1.16, over 3662759.06 frames. ], batch size: 105, lr: 3.60e-02, grad_scale: 2.0 2023-09-28 11:22:03,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 11:22:03,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:05,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:22:07,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 11:22:07,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:22:09,483 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 8.573e+01 1.074e+02 1.349e+02 1.820e+02 4.135e+02, threshold=2.699e+02, percent-clipped=10.0 2023-09-28 11:22:09,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:22:09,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 11:22:10,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=2000.0, ans=0.40625 2023-09-28 11:22:13,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:22:13,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:22:17,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:22:17,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 11:22:19,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:22:20,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:22:20,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 11:22:20,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:26,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:22:27,003 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=27.24 vs. limit=8.275 2023-09-28 11:22:32,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:22:32,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 11:22:36,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 11:22:37,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:39,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:22:41,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:41,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 11:22:42,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:22:43,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:22:45,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=2133.3333333333335, ans=0.4 2023-09-28 11:22:47,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:22:48,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=40.56 vs. limit=8.3 2023-09-28 11:22:48,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:22:53,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:22:53,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 11:22:56,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:22:58,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:22:58,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=2200.0, ans=0.27799999999999997 2023-09-28 11:22:59,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 11:23:01,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:02,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.87 vs. limit=4.88 2023-09-28 11:23:08,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:23:10,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.43 vs. limit=8.325 2023-09-28 11:23:11,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:23:11,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 11:23:16,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:23:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:20,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:23:21,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 11:23:21,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:23:21,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:23:25,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 11:23:27,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:23:27,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:30,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:30,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:31,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:35,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=58.06 vs. limit=8.375 2023-09-28 11:23:35,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=14.65 vs. limit=8.375 2023-09-28 11:23:36,558 INFO [train.py:1039] (3/4) Epoch 1, batch 350, loss[loss=0.9127, simple_loss=0.7515, pruned_loss=0.8804, over 24608.00 frames. ], tot_loss[loss=1.113, simple_loss=0.9583, pruned_loss=1.071, over 3887623.72 frames. ], batch size: 68, lr: 3.83e-02, grad_scale: 2.0 2023-09-28 11:23:38,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:38,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 11:23:40,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:41,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=2333.3333333333335, ans=8.375 2023-09-28 11:23:41,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=13.49 vs. limit=5.583333333333333 2023-09-28 11:23:41,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.55 vs. limit=8.375 2023-09-28 11:23:43,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=33.40 vs. limit=8.375 2023-09-28 11:23:45,277 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=4.933333333333334 2023-09-28 11:23:48,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:23:48,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=2333.3333333333335, ans=0.08541666666666667 2023-09-28 11:23:49,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=29.51 vs. limit=8.375 2023-09-28 11:23:52,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:23:52,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:23:54,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=2400.0, ans=0.3875 2023-09-28 11:23:57,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 11:23:57,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:23:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 11:23:59,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=2400.0, ans=0.3875 2023-09-28 11:24:00,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:00,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 11:24:03,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:03,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=2400.0, ans=0.3875 2023-09-28 11:24:04,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 11:24:08,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:24:10,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:24:10,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:24:12,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:12,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:12,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:14,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:24:14,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:24:14,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:22,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=2466.6666666666665, ans=0.384375 2023-09-28 11:24:24,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:24:24,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:24:26,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:24:26,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:31,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=59.66 vs. limit=8.45 2023-09-28 11:24:32,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 11:24:32,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:24:35,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=19.54 vs. limit=8.45 2023-09-28 11:24:40,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:24:40,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:24:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:24:43,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 11:24:46,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:46,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 11:24:48,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 11:24:48,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:24:50,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=23.47 vs. limit=6.3 2023-09-28 11:24:53,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:24:53,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 11:24:53,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=25.91 vs. limit=8.475 2023-09-28 11:24:54,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:24:57,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:24:59,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:00,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:00,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:03,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:25:05,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=24.92 vs. limit=5.65 2023-09-28 11:25:08,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:25:10,565 INFO [train.py:1039] (3/4) Epoch 1, batch 400, loss[loss=0.8122, simple_loss=0.6719, pruned_loss=0.7402, over 23789.00 frames. ], tot_loss[loss=1.048, simple_loss=0.8941, pruned_loss=1.002, over 4065707.49 frames. ], batch size: 179, lr: 4.05e-02, grad_scale: 4.0 2023-09-28 11:25:10,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:25:11,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=19.50 vs. limit=8.5 2023-09-28 11:25:12,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 11:25:12,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:12,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:14,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:25:14,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:15,789 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 9.874e+01 1.367e+02 1.651e+02 2.389e+02 7.473e+02, threshold=3.302e+02, percent-clipped=14.0 2023-09-28 11:25:17,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:19,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:22,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 11:25:23,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 11:25:23,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:25,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 11:25:25,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:26,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.82 vs. limit=6.333333333333333 2023-09-28 11:25:29,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:25:29,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:30,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 11:25:31,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:25:31,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:25:31,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:25:31,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=2733.3333333333335, ans=0.0385 2023-09-28 11:25:33,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:25:35,750 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 11:25:37,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 11:25:42,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:25:44,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:25:44,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=2733.3333333333335, ans=0.5 2023-09-28 11:25:45,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 11:25:46,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 11:25:49,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:25:50,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.23 vs. limit=9.6 2023-09-28 11:25:51,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:25:57,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 11:26:02,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:26:04,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 11:26:08,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:26:09,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:26:09,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 11:26:15,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:26:17,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:26:19,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:26:22,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:22,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 11:26:24,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:26:24,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=2933.3333333333335, ans=0.03399999999999999 2023-09-28 11:26:27,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 11:26:29,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:26:29,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:26:33,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 11:26:35,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:26:37,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:26:37,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:26:40,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 11:26:40,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:26:40,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:26:40,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=3000.0, ans=0.359375 2023-09-28 11:26:42,309 INFO [train.py:1039] (3/4) Epoch 1, batch 450, loss[loss=0.9356, simple_loss=0.7597, pruned_loss=0.8634, over 24644.00 frames. ], tot_loss[loss=1.004, simple_loss=0.8491, pruned_loss=0.9513, over 4208940.24 frames. ], batch size: 68, lr: 4.28e-02, grad_scale: 4.0 2023-09-28 11:26:42,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:26:42,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 11:26:42,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:26:44,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:26:46,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=51.26 vs. limit=8.625 2023-09-28 11:26:48,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:26:52,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=3000.0, ans=0.359375 2023-09-28 11:26:54,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=18.12 vs. limit=8.625 2023-09-28 11:26:56,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=151.40 vs. limit=9.75 2023-09-28 11:26:56,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=13.99 vs. limit=8.625 2023-09-28 11:26:57,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:26:59,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:01,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 11:27:02,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.94 vs. limit=5.226666666666667 2023-09-28 11:27:03,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 11:27:08,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:27:10,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.08 vs. limit=6.533333333333333 2023-09-28 11:27:11,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:13,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:19,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:19,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:27:21,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 11:27:22,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 11:27:24,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 11:27:25,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.12 vs. limit=6.566666666666666 2023-09-28 11:27:26,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:27:28,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:27:29,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:27:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 11:27:31,846 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=19.79 vs. limit=8.675 2023-09-28 11:27:32,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 11:27:32,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:27:34,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:27:35,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:27:38,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.81 vs. limit=5.8 2023-09-28 11:27:38,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.90 vs. limit=3.48 2023-09-28 11:27:39,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:27:39,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:27:41,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:27:41,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 11:27:44,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:27:46,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:27:48,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:27:49,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 11:27:53,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:27:56,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 11:27:57,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 11:27:59,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:28:01,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=3266.6666666666665, ans=0.0775 2023-09-28 11:28:04,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:28:05,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:06,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.37 vs. limit=9.95 2023-09-28 11:28:09,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:28:09,110 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 11:28:12,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:13,986 INFO [train.py:1039] (3/4) Epoch 1, batch 500, loss[loss=0.8726, simple_loss=0.7061, pruned_loss=0.7834, over 24343.00 frames. ], tot_loss[loss=0.9723, simple_loss=0.8159, pruned_loss=0.9117, over 4309360.44 frames. ], batch size: 61, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:28:14,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:28:14,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:15,761 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 11:28:15,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 11:28:15,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:28:16,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=3333.3333333333335, ans=0.34375 2023-09-28 11:28:18,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=19.65 vs. limit=8.75 2023-09-28 11:28:19,322 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 9.903e+01 1.529e+02 1.913e+02 2.430e+02 4.167e+02, threshold=3.825e+02, percent-clipped=6.0 2023-09-28 11:28:19,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:28:26,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 11:28:27,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:28:30,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=3400.0, ans=8.775 2023-09-28 11:28:31,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:28:31,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:28:33,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:28:33,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.03 vs. limit=5.85 2023-09-28 11:28:37,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=3400.0, ans=0.340625 2023-09-28 11:28:41,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=3400.0, ans=6.7 2023-09-28 11:28:47,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:48,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:28:48,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:28:48,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:48,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 11:28:48,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:28:49,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.75 vs. limit=5.866666666666666 2023-09-28 11:28:52,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:28:53,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:28:53,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:28:53,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:28:53,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 11:28:55,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.81 vs. limit=10.1 2023-09-28 11:28:55,915 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 11:28:59,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:28:59,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:01,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:02,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:29:04,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 11:29:04,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=3466.6666666666665, ans=0.07 2023-09-28 11:29:08,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:29:10,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:14,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:19,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:29:20,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.92 vs. limit=6.766666666666667 2023-09-28 11:29:26,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:27,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.25 vs. limit=10.2 2023-09-28 11:29:30,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 11:29:30,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:31,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.73 vs. limit=8.85 2023-09-28 11:29:31,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:29:35,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 11:29:35,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:29:36,184 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.56 vs. limit=6.8 2023-09-28 11:29:36,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:43,571 INFO [train.py:1039] (3/4) Epoch 1, batch 550, loss[loss=0.8334, simple_loss=0.6744, pruned_loss=0.7239, over 23269.00 frames. ], tot_loss[loss=0.9487, simple_loss=0.7902, pruned_loss=0.8773, over 4403254.25 frames. ], batch size: 105, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:29:43,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 11:29:45,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 11:29:45,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:45,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 11:29:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:29:47,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:29:49,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:29:49,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:29:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:29:53,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:29:56,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 11:29:56,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:30:00,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:00,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:01,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.34 vs. limit=10.3 2023-09-28 11:30:04,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:04,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=3733.3333333333335, ans=0.26266666666666666 2023-09-28 11:30:05,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:09,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.28 vs. limit=5.933333333333334 2023-09-28 11:30:11,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.91 vs. limit=8.9 2023-09-28 11:30:12,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 11:30:12,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 11:30:14,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:30:19,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:30:21,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:21,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=28.24 vs. limit=8.925 2023-09-28 11:30:22,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:30:27,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:27,085 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 11:30:27,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:30:28,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:30:31,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=3800.0, ans=0.262 2023-09-28 11:30:32,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:30:35,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:30:35,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:30:37,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:38,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 11:30:40,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 11:30:40,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:40,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:30:42,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:30:42,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:30:45,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:30:45,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:30:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:30:50,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:51,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=16.81 vs. limit=8.95 2023-09-28 11:30:51,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 11:30:51,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:30:53,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:30:54,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.36 vs. limit=6.966666666666667 2023-09-28 11:30:55,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:30:55,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:30:55,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=3933.3333333333335, ans=0.05249999999999999 2023-09-28 11:30:57,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:30:58,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 11:30:59,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=3933.3333333333335, ans=0.315625 2023-09-28 11:31:04,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=3933.3333333333335, ans=0.7623333333333333 2023-09-28 11:31:08,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 11:31:12,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 11:31:14,899 INFO [train.py:1039] (3/4) Epoch 1, batch 600, loss[loss=0.7987, simple_loss=0.6465, pruned_loss=0.6731, over 23533.00 frames. ], tot_loss[loss=0.9288, simple_loss=0.7688, pruned_loss=0.8447, over 4471441.64 frames. ], batch size: 134, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:31:14,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:31:15,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:31:15,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:31:15,747 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.01 vs. limit=9.0 2023-09-28 11:31:21,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=19.18 vs. limit=9.0 2023-09-28 11:31:21,658 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.125e+02 1.678e+02 2.306e+02 3.262e+02 8.742e+02, threshold=4.612e+02, percent-clipped=14.0 2023-09-28 11:31:23,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:31:25,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:31:26,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 11:31:28,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:31:30,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:31:32,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:36,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 11:31:37,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:31:44,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 11:31:46,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=4066.6666666666665, ans=0.20933333333333332 2023-09-28 11:31:46,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=6.69 vs. limit=5.626666666666667 2023-09-28 11:31:46,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.93 vs. limit=10.55 2023-09-28 11:31:47,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=4066.6666666666665, ans=0.7576666666666667 2023-09-28 11:31:49,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:31:49,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:31:49,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:31:56,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:31:56,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:31:57,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:00,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.78 vs. limit=9.05 2023-09-28 11:32:01,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=4133.333333333333, ans=0.30625 2023-09-28 11:32:06,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:32:10,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=18.99 vs. limit=10.65 2023-09-28 11:32:11,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:32:11,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:32:11,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:32:15,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=4200.0, ans=0.792 2023-09-28 11:32:17,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 11:32:23,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.36 vs. limit=9.075 2023-09-28 11:32:24,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:32:24,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:32:29,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 11:32:29,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:32:33,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 11:32:33,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:32:33,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:32:35,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.48 vs. limit=10.7 2023-09-28 11:32:42,895 INFO [train.py:1039] (3/4) Epoch 1, batch 650, loss[loss=0.7963, simple_loss=0.6484, pruned_loss=0.6448, over 23736.00 frames. ], tot_loss[loss=0.9065, simple_loss=0.748, pruned_loss=0.8067, over 4531049.29 frames. ], batch size: 179, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:32:42,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 11:32:45,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 11:32:46,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.43 vs. limit=9.125 2023-09-28 11:32:48,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:32:48,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:32:52,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:32:52,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=15.22 vs. limit=10.75 2023-09-28 11:32:53,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=17.05 vs. limit=9.125 2023-09-28 11:32:55,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 11:32:56,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:33:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:33:03,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:05,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:09,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 11:33:10,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:10,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:15,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:15,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:33:19,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:19,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:19,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:33:21,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:22,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.83 vs. limit=6.116666666666667 2023-09-28 11:33:23,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:33:26,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:33:26,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 11:33:26,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:26,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:26,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=4466.666666666667, ans=0.0 2023-09-28 11:33:31,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:31,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=4466.666666666667, ans=0.290625 2023-09-28 11:33:33,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:33,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:33,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:33:35,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 11:33:37,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:33:37,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:33:39,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:33:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:33:40,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:33:42,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 11:33:42,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 11:33:42,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:42,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:33:42,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:33:44,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:33:46,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:33:50,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=17.13 vs. limit=9.2 2023-09-28 11:33:53,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:33:53,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:33:53,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=4600.0, ans=0.284375 2023-09-28 11:33:54,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:33:58,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:33:59,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:33:59,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:34:05,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:34:05,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:07,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:07,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:34:13,654 INFO [train.py:1039] (3/4) Epoch 1, batch 700, loss[loss=0.8428, simple_loss=0.7012, pruned_loss=0.6366, over 24695.00 frames. ], tot_loss[loss=0.8765, simple_loss=0.7226, pruned_loss=0.7615, over 4555705.31 frames. ], batch size: 73, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:34:15,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 11:34:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 11:34:20,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 11:34:20,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:21,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.160e+02 1.725e+02 2.743e+02 3.715e+02 1.987e+03, threshold=5.486e+02, percent-clipped=15.0 2023-09-28 11:34:22,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:34:25,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 11:34:27,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=9.25 2023-09-28 11:34:30,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:34:32,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.99 vs. limit=6.183333333333334 2023-09-28 11:34:32,929 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.78 vs. limit=11.05 2023-09-28 11:34:33,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:34:35,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:35,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:34:36,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=18.31 vs. limit=9.275 2023-09-28 11:34:37,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:34:40,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:34:41,588 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=24.62 vs. limit=9.275 2023-09-28 11:34:44,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:34:44,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:34:46,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 11:34:51,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 11:34:55,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:34:57,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:34:58,282 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.18 vs. limit=9.3 2023-09-28 11:34:58,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:35:02,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:35:04,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 11:35:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:11,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:35:11,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 11:35:14,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:35:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:17,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.73 vs. limit=11.15 2023-09-28 11:35:19,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=4866.666666666667, ans=0.7296666666666667 2023-09-28 11:35:20,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:35:27,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:35:28,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 11:35:31,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 11:35:31,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 11:35:33,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:34,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:35:39,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:39,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 11:35:41,516 INFO [train.py:1039] (3/4) Epoch 1, batch 750, loss[loss=0.7009, simple_loss=0.5757, pruned_loss=0.5326, over 22814.00 frames. ], tot_loss[loss=0.8451, simple_loss=0.6976, pruned_loss=0.7144, over 4602547.40 frames. ], batch size: 322, lr: 4.49e-02, grad_scale: 4.0 2023-09-28 11:35:44,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 11:35:44,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 11:35:45,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 11:35:46,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 11:35:46,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 11:35:46,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:35:48,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 11:35:49,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:35:49,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:35:51,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:35:53,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:35:53,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:35:54,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:35:56,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:35:58,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:36:04,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:36:06,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:07,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:07,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 11:36:09,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:36:11,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:12,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:36:14,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:36:14,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=5133.333333333333, ans=0.24866666666666665 2023-09-28 11:36:16,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 11:36:16,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:36:19,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 11:36:19,681 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 11:36:19,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 11:36:19,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:36:19,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 11:36:20,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.33 vs. limit=9.425 2023-09-28 11:36:22,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:36:31,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:36:31,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:31,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:36:32,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:36:35,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:36:36,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 11:36:38,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:36:38,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 11:36:39,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.10 vs. limit=6.3 2023-09-28 11:36:40,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:36:42,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:36:42,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 11:36:44,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:36:50,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:36:52,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:36:52,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:36:56,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:37:00,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 11:37:00,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:02,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:04,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:07,770 INFO [train.py:1039] (3/4) Epoch 1, batch 800, loss[loss=0.7194, simple_loss=0.6108, pruned_loss=0.5014, over 24647.00 frames. ], tot_loss[loss=0.8122, simple_loss=0.6729, pruned_loss=0.6664, over 4636392.29 frames. ], batch size: 68, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:37:08,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:10,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:37:16,636 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 4.125e+02 6.476e+02 9.801e+02 2.445e+03, threshold=1.295e+03, percent-clipped=55.0 2023-09-28 11:37:19,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:37:19,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:19,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=5333.333333333333, ans=0.044444444444444446 2023-09-28 11:37:19,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=5333.333333333333, ans=0.7133333333333334 2023-09-28 11:37:21,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:37:21,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:23,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:23,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:26,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:30,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:30,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:37:33,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 11:37:35,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:35,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:37:35,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:37:36,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:36,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 11:37:36,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:38,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 11:37:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:37:44,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:37:47,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:37:47,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:37:47,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=9.55 2023-09-28 11:37:50,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:52,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:37:56,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:37:57,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:37:57,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 11:38:01,324 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 11:38:01,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 11:38:01,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:38:01,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:03,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:03,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:09,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 11:38:09,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 11:38:11,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:38:12,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.85 vs. limit=5.0 2023-09-28 11:38:13,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:38:18,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:38:22,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:38:23,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 11:38:23,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:38:24,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.81 vs. limit=7.8 2023-09-28 11:38:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 11:38:29,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.07 vs. limit=9.6 2023-09-28 11:38:33,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=18.29 vs. limit=11.7 2023-09-28 11:38:34,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:36,585 INFO [train.py:1039] (3/4) Epoch 1, batch 850, loss[loss=0.6041, simple_loss=0.5167, pruned_loss=0.4087, over 23852.00 frames. ], tot_loss[loss=0.7783, simple_loss=0.6483, pruned_loss=0.6184, over 4663061.69 frames. ], batch size: 195, lr: 4.49e-02, grad_scale: 8.0 2023-09-28 11:38:36,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=5666.666666666667, ans=0.234375 2023-09-28 11:38:38,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:38:40,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 11:38:40,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:38:40,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:40,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 11:38:40,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:38:41,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.84 vs. limit=6.416666666666667 2023-09-28 11:38:43,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:38:45,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:45,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:38:46,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:38:48,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 11:38:50,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 11:38:50,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 11:38:51,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:38:51,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:38:53,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:38:53,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:38:54,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:39:01,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:01,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:03,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 11:39:06,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 11:39:10,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:39:10,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 11:39:16,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 11:39:16,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 11:39:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 11:39:20,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:20,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:39:20,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 11:39:23,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:24,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:25,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 11:39:26,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:39:28,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:28,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=5866.666666666667, ans=0.6946666666666667 2023-09-28 11:39:29,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:39:31,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:39:33,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:39:35,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 11:39:35,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 11:39:42,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:39:42,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:39:42,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:39:42,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:39:44,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:39:45,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:39:47,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:39:49,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:39:49,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:39:51,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:39:51,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=5933.333333333333, ans=0.221875 2023-09-28 11:40:00,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 11:40:00,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=5933.333333333333, ans=0.8093333333333333 2023-09-28 11:40:01,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:40:03,137 INFO [train.py:1039] (3/4) Epoch 1, batch 900, loss[loss=0.6541, simple_loss=0.5552, pruned_loss=0.4431, over 23765.00 frames. ], tot_loss[loss=0.7471, simple_loss=0.6264, pruned_loss=0.5747, over 4680813.93 frames. ], batch size: 212, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:40:03,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 11:40:03,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:03,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:40:06,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 11:40:10,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:40:12,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:13,965 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 3.628e+02 6.882e+02 1.109e+03 2.718e+03, threshold=1.376e+03, percent-clipped=19.0 2023-09-28 11:40:14,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 11:40:17,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:40:18,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 11:40:18,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 11:40:19,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:40:19,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:21,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 11:40:21,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:40:33,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=6066.666666666667, ans=0.04138888888888889 2023-09-28 11:40:36,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:40:36,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:40:36,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:40:38,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:40:43,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 11:40:46,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:40:52,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:40:54,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:40:54,198 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 11:40:55,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 11:41:01,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 11:41:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:41:02,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:41:09,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:09,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:41:11,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 11:41:13,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:41:13,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 11:41:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:41:16,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:17,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:41:17,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:22,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 11:41:23,478 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 11:41:23,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=6266.666666666667, ans=0.04055555555555555 2023-09-28 11:41:24,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.00 vs. limit=9.85 2023-09-28 11:41:26,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 11:41:26,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 11:41:29,592 INFO [train.py:1039] (3/4) Epoch 1, batch 950, loss[loss=0.6327, simple_loss=0.5584, pruned_loss=0.3921, over 24303.00 frames. ], tot_loss[loss=0.7159, simple_loss=0.6045, pruned_loss=0.5333, over 4695737.86 frames. ], batch size: 74, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:41:29,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:41:33,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 11:41:37,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=6333.333333333333, ans=12.25 2023-09-28 11:41:38,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:42,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:42,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:43,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:41:46,839 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 11:41:47,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=6400.0, ans=0.009478260869565217 2023-09-28 11:41:51,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:41:53,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:41:53,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:41:53,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:41:53,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 11:41:55,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:41:56,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:41:57,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 11:41:59,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:04,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:42:04,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:42:05,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 11:42:07,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 11:42:08,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=6466.666666666667, ans=9.041666666666668 2023-09-28 11:42:11,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:42:12,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:42:18,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:42:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:42:20,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=6533.333333333333, ans=0.6713333333333333 2023-09-28 11:42:21,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 11:42:23,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:42:23,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 11:42:25,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:25,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:25,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:42:30,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 11:42:32,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:42:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:42:34,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.31 vs. limit=12.4 2023-09-28 11:42:35,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:42:35,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 11:42:35,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:35,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:42:36,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 11:42:41,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=6600.0, ans=0.04949747468305833 2023-09-28 11:42:42,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:42:46,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:42:51,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:42:53,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 11:42:53,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 11:42:56,627 INFO [train.py:1039] (3/4) Epoch 1, batch 1000, loss[loss=0.5299, simple_loss=0.4602, pruned_loss=0.3358, over 23488.00 frames. ], tot_loss[loss=0.6847, simple_loss=0.5821, pruned_loss=0.495, over 4685635.74 frames. ], batch size: 285, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:42:58,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:43:00,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=3.57 vs. limit=6.666666666666667 2023-09-28 11:43:01,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 11:43:03,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:03,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=6666.666666666667, ans=0.1875 2023-09-28 11:43:06,719 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.970e+02 4.014e+02 6.511e+02 1.253e+03 2.271e+03, threshold=1.302e+03, percent-clipped=16.0 2023-09-28 11:43:08,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:43:09,356 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.99 vs. limit=12.5 2023-09-28 11:43:10,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 11:43:10,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 11:43:12,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=6733.333333333333, ans=0.23266666666666666 2023-09-28 11:43:15,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:15,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:43:15,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:19,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 11:43:25,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 11:43:27,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 11:43:27,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:28,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 11:43:29,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.73 vs. limit=12.6 2023-09-28 11:43:32,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 11:43:32,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 11:43:32,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:34,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:43,252 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=12.6 2023-09-28 11:43:43,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:44,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:43:45,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:43:45,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:43:45,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 11:43:47,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:43:47,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:43:49,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:43:49,384 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 11:43:50,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.17 vs. limit=6.716666666666667 2023-09-28 11:43:52,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 11:43:54,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 11:43:56,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 11:43:57,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:44:03,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:04,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:44:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:06,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:44:09,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 11:44:09,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:44:11,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 11:44:11,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 11:44:12,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:12,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:44:15,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:44:15,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=6933.333333333333, ans=0.00936231884057971 2023-09-28 11:44:16,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:44:17,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=6933.333333333333, ans=0.175 2023-09-28 11:44:20,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:44:21,953 INFO [train.py:1039] (3/4) Epoch 1, batch 1050, loss[loss=0.4774, simple_loss=0.402, pruned_loss=0.316, over 19344.00 frames. ], tot_loss[loss=0.6543, simple_loss=0.5602, pruned_loss=0.4597, over 4678465.33 frames. ], batch size: 388, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:44:25,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:44:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:44:28,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=5.89 vs. limit=5.4 2023-09-28 11:44:28,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:44:30,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:31,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.23 vs. limit=12.75 2023-09-28 11:44:33,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:44:35,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 11:44:36,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 11:44:40,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:44:40,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:44:40,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:44:42,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:44:42,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 11:44:43,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:44:43,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 11:44:44,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=7066.666666666667, ans=0.07 2023-09-28 11:44:47,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:44:47,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 11:44:47,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:44:56,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:44:58,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:44:58,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:45:01,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 11:45:01,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 11:45:01,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:45:05,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 11:45:08,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 11:45:09,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:12,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 11:45:14,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:45:14,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:45:16,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:45:16,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=7200.0, ans=0.16249999999999998 2023-09-28 11:45:19,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:45:22,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 11:45:25,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 11:45:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 11:45:26,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:26,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:45:28,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 11:45:33,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:45:35,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:45:35,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:45:36,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:36,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:45:40,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 11:45:43,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 11:45:43,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 11:45:43,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 11:45:45,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:45:46,848 INFO [train.py:1039] (3/4) Epoch 1, batch 1100, loss[loss=0.5331, simple_loss=0.4534, pruned_loss=0.3435, over 19076.00 frames. ], tot_loss[loss=0.6304, simple_loss=0.5439, pruned_loss=0.4302, over 4684479.45 frames. ], batch size: 388, lr: 4.48e-02, grad_scale: 8.0 2023-09-28 11:45:48,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:45:48,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=7333.333333333333, ans=0.035 2023-09-28 11:45:54,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=7333.333333333333, ans=0.22666666666666668 2023-09-28 11:45:55,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:45:58,781 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 4.555e+02 7.978e+02 1.389e+03 3.645e+03, threshold=1.596e+03, percent-clipped=29.0 2023-09-28 11:46:00,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:46:00,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 11:46:00,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:02,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 11:46:04,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:46:07,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 11:46:08,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:46:10,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:46:10,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 11:46:12,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 11:46:13,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:13,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:46:17,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:46:20,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 11:46:23,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:46:24,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.80 vs. limit=10.3 2023-09-28 11:46:27,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 11:46:28,897 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 11:46:28,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:32,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:32,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:46:34,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:46:34,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 11:46:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:46:34,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 11:46:36,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:46:36,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:36,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=7533.333333333333, ans=0.22466666666666668 2023-09-28 11:46:37,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 11:46:42,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 11:46:43,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 11:46:44,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=7533.333333333333, ans=0.14687499999999998 2023-09-28 11:46:45,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:46:46,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.45 vs. limit=10.325 2023-09-28 11:46:51,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:46:51,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.21 vs. limit=13.15 2023-09-28 11:46:54,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 11:46:54,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 11:46:56,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:46:59,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:46:59,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:00,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 11:47:03,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:47:03,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=7600.0, ans=0.634 2023-09-28 11:47:05,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:47:06,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 11:47:06,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:47:08,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 11:47:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:47:09,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:47:10,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:47:12,087 INFO [train.py:1039] (3/4) Epoch 1, batch 1150, loss[loss=0.5577, simple_loss=0.4883, pruned_loss=0.3397, over 23592.00 frames. ], tot_loss[loss=0.6097, simple_loss=0.5298, pruned_loss=0.4049, over 4686538.14 frames. ], batch size: 256, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:47:12,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=7666.666666666667, ans=0.0 2023-09-28 11:47:15,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:18,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:47:20,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:21,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:47:21,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 11:47:21,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 11:47:26,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:26,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:47:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 11:47:35,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:40,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:47:41,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:47:41,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 11:47:43,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:47:43,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:47:44,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=5.20 vs. limit=5.5600000000000005 2023-09-28 11:47:46,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 11:47:46,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=7800.0, ans=0.034166666666666665 2023-09-28 11:47:48,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:47:51,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:47:59,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:05,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=10.45 2023-09-28 11:48:07,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:48:07,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 11:48:09,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:09,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:16,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 11:48:18,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:26,774 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 11:48:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:32,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:48:32,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 11:48:33,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:48:34,812 INFO [train.py:1039] (3/4) Epoch 1, batch 1200, loss[loss=0.4976, simple_loss=0.4472, pruned_loss=0.2884, over 23335.00 frames. ], tot_loss[loss=0.5885, simple_loss=0.516, pruned_loss=0.3799, over 4700304.77 frames. ], batch size: 119, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:48:37,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:41,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 11:48:41,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:48:42,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=8000.0, ans=0.125 2023-09-28 11:48:43,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:48:43,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:48:44,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:48:46,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.850e+02 4.760e+02 7.806e+02 1.164e+03 2.947e+03, threshold=1.561e+03, percent-clipped=14.0 2023-09-28 11:48:46,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:48:48,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:48:50,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:48:50,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:48:51,843 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 11:48:54,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 11:49:00,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:49:01,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:49:03,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:06,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:06,752 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 11:49:08,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:14,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.60 vs. limit=9.066666666666666 2023-09-28 11:49:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 11:49:18,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:49:18,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 11:49:19,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:49:23,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 11:49:24,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 11:49:25,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=8200.0, ans=0.125 2023-09-28 11:49:26,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:49:28,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:49:28,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:30,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.21 vs. limit=9.1 2023-09-28 11:49:30,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 11:49:32,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:49:32,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:49:34,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:49:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 11:49:35,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:49:35,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:36,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 11:49:38,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:49:38,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:49:39,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.50 vs. limit=7.05 2023-09-28 11:49:43,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:49:45,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:49:48,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 11:49:53,495 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 11:49:55,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:49:58,119 INFO [train.py:1039] (3/4) Epoch 1, batch 1250, loss[loss=0.5182, simple_loss=0.4811, pruned_loss=0.283, over 24398.00 frames. ], tot_loss[loss=0.5732, simple_loss=0.5065, pruned_loss=0.3608, over 4716460.34 frames. ], batch size: 77, lr: 4.47e-02, grad_scale: 4.0 2023-09-28 11:49:58,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:49:59,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:50:01,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:50:04,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 11:50:08,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:50:09,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:09,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 11:50:11,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:50:12,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:50:15,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 11:50:16,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.71 vs. limit=4.26 2023-09-28 11:50:18,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:19,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:50:19,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:21,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 11:50:26,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 11:50:26,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:50:26,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:50:27,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:50:29,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:30,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:32,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:50:38,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 11:50:38,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 11:50:41,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:50:41,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 11:50:41,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:50:42,871 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 11:50:42,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:42,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:50:47,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:51,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.60 vs. limit=10.7 2023-09-28 11:50:52,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:50:52,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:50:54,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 11:50:54,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 11:50:55,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 11:50:58,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:50:59,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=8533.333333333334, ans=0.21466666666666667 2023-09-28 11:51:00,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 11:51:00,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:04,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:51:04,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:51:07,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 11:51:07,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 11:51:07,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 11:51:10,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 11:51:10,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:12,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 11:51:14,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:17,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:51:18,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:51:20,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 11:51:21,989 INFO [train.py:1039] (3/4) Epoch 1, batch 1300, loss[loss=0.5329, simple_loss=0.4821, pruned_loss=0.3029, over 24636.00 frames. ], tot_loss[loss=0.558, simple_loss=0.4966, pruned_loss=0.3432, over 4723222.62 frames. ], batch size: 65, lr: 4.47e-02, grad_scale: 8.0 2023-09-28 11:51:23,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:51:23,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 11:51:30,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:31,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 11:51:32,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:51:34,990 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 3.707e+02 6.388e+02 1.142e+03 3.121e+03, threshold=1.278e+03, percent-clipped=13.0 2023-09-28 11:51:35,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:51:38,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 11:51:38,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 11:51:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:51:45,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:51:47,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 11:51:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:51:54,961 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.25 vs. limit=14.1 2023-09-28 11:51:55,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:51:55,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:51:57,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:51:58,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:00,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 11:52:00,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 11:52:00,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 11:52:02,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=8800.0, ans=0.125 2023-09-28 11:52:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:52:08,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 11:52:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 11:52:10,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 11:52:11,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:52:14,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:52:15,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 11:52:16,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:16,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 11:52:20,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:52:22,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:52:22,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:52:26,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 11:52:27,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 11:52:29,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 11:52:32,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:52:36,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 11:52:39,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:44,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=9000.0, ans=0.21000000000000002 2023-09-28 11:52:45,149 INFO [train.py:1039] (3/4) Epoch 1, batch 1350, loss[loss=0.4987, simple_loss=0.4434, pruned_loss=0.2897, over 23597.00 frames. ], tot_loss[loss=0.5428, simple_loss=0.4857, pruned_loss=0.3277, over 4721571.83 frames. ], batch size: 256, lr: 4.46e-02, grad_scale: 4.0 2023-09-28 11:52:46,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 11:52:49,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:51,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:52:55,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:52:55,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:52:57,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:52:59,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:03,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 11:53:05,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 11:53:05,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:53:10,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 11:53:11,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:53:13,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:53:13,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 11:53:16,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 11:53:17,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 11:53:19,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:19,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 11:53:30,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:38,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.29 vs. limit=14.4 2023-09-28 11:53:39,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:53:39,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 11:53:43,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:53:45,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 11:53:45,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 11:53:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:53:49,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:53:50,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 11:53:52,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=9266.666666666666, ans=0.125 2023-09-28 11:53:53,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 11:54:00,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 11:54:00,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=9266.666666666666, ans=0.125 2023-09-28 11:54:02,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 11:54:09,185 INFO [train.py:1039] (3/4) Epoch 1, batch 1400, loss[loss=0.4419, simple_loss=0.4237, pruned_loss=0.2281, over 24323.00 frames. ], tot_loss[loss=0.5277, simple_loss=0.4756, pruned_loss=0.3124, over 4720480.37 frames. ], batch size: 61, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:54:09,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 11:54:11,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:54:16,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:54:16,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:54:22,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 11:54:23,718 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 3.566e+02 5.835e+02 9.354e+02 4.572e+03, threshold=1.167e+03, percent-clipped=13.0 2023-09-28 11:54:23,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 11:54:33,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:54:35,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:54:36,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:54:37,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 11:54:40,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:54:43,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 11:54:51,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=9466.666666666666, ans=0.025 2023-09-28 11:54:52,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:52,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:54:57,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 11:54:58,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:54:58,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 11:55:00,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:55:00,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:02,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 11:55:02,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:55:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:55:05,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 11:55:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:55:10,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:10,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=7.05 vs. limit=7.383333333333334 2023-09-28 11:55:14,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:55:20,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.59 vs. limit=11.1 2023-09-28 11:55:23,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.16 vs. limit=11.1 2023-09-28 11:55:23,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 11:55:25,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 11:55:25,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:55:28,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 11:55:30,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:31,865 INFO [train.py:1039] (3/4) Epoch 1, batch 1450, loss[loss=0.5138, simple_loss=0.4579, pruned_loss=0.2949, over 22833.00 frames. ], tot_loss[loss=0.5136, simple_loss=0.4665, pruned_loss=0.2982, over 4726968.38 frames. ], batch size: 322, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:55:31,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:55:35,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 11:55:36,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:55:36,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:36,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 11:55:42,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:55:44,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 11:55:44,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:55:44,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 11:55:44,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=9666.666666666666, ans=0.125 2023-09-28 11:55:46,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 11:55:48,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 11:55:50,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:55:50,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:50,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 11:55:50,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=9733.333333333334, ans=0.125 2023-09-28 11:55:53,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:55:54,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 11:55:56,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 11:55:56,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:55:56,958 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.01 vs. limit=14.8 2023-09-28 11:55:57,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:55:59,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:00,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:04,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 11:56:04,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:56:07,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:56:07,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:08,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 11:56:08,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 11:56:10,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:56:10,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:13,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 11:56:18,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:56:21,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 11:56:23,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:25,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 11:56:27,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:29,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 11:56:29,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=16.04 vs. limit=14.9 2023-09-28 11:56:33,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:35,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 11:56:36,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 11:56:38,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:56:41,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:56:41,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:56:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 11:56:45,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 11:56:45,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 11:56:46,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:56:48,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 11:56:49,155 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.04 vs. limit=14.95 2023-09-28 11:56:54,939 INFO [train.py:1039] (3/4) Epoch 1, batch 1500, loss[loss=0.4748, simple_loss=0.4546, pruned_loss=0.246, over 24652.00 frames. ], tot_loss[loss=0.5014, simple_loss=0.4589, pruned_loss=0.2861, over 4728213.25 frames. ], batch size: 68, lr: 4.46e-02, grad_scale: 8.0 2023-09-28 11:56:55,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.47 vs. limit=8.0 2023-09-28 11:56:59,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 11:56:59,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 11:56:59,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 11:57:00,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:02,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:02,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 11:57:04,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 11:57:04,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=10000.0, ans=0.008695652173913044 2023-09-28 11:57:06,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 11:57:06,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 11:57:06,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:57:07,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 11:57:10,820 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.048e+02 3.499e+02 5.909e+02 9.288e+02 2.563e+03, threshold=1.182e+03, percent-clipped=18.0 2023-09-28 11:57:10,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:57:12,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:57:18,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 11:57:18,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:57:18,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 11:57:19,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:20,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:23,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 11:57:23,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=10066.666666666666, ans=0.125 2023-09-28 11:57:26,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 11:57:28,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:57:28,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 11:57:32,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 11:57:35,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:57:37,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:57:37,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:57:39,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 11:57:39,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:57:39,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:40,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=10133.333333333334, ans=0.125 2023-09-28 11:57:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 11:57:42,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:57:47,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 11:57:47,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 11:57:53,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 11:57:55,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 11:57:59,711 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 11:58:01,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:01,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 11:58:02,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:04,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:06,053 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 11:58:06,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 11:58:09,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 11:58:11,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:17,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:17,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,381 INFO [train.py:1039] (3/4) Epoch 1, batch 1550, loss[loss=0.4372, simple_loss=0.4079, pruned_loss=0.235, over 23822.00 frames. ], tot_loss[loss=0.493, simple_loss=0.4539, pruned_loss=0.2772, over 4730080.44 frames. ], batch size: 179, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 11:58:18,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 11:58:18,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:58:18,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 11:58:20,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 11:58:21,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 11:58:21,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:58:23,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 11:58:23,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 11:58:25,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:26,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:26,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:58:26,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 11:58:29,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:29,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:58:31,416 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 11:58:31,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=10333.333333333334, ans=0.19666666666666666 2023-09-28 11:58:32,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:32,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 11:58:32,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 11:58:33,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=11.4 2023-09-28 11:58:35,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.64 vs. limit=15.3 2023-09-28 11:58:36,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 11:58:36,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 11:58:37,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 11:58:37,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 11:58:39,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 11:58:39,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 11:58:39,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:58:42,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:58:43,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=10400.0, ans=0.125 2023-09-28 11:58:47,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 11:58:49,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 11:58:49,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 11:58:50,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.63 vs. limit=10.2 2023-09-28 11:58:58,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:02,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 11:59:02,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 11:59:02,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 11:59:03,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 11:59:08,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=10533.333333333334, ans=0.19466666666666665 2023-09-28 11:59:09,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 11:59:11,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:15,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 11:59:18,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 11:59:18,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 11:59:20,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 11:59:20,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:20,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 11:59:21,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:23,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 11:59:23,413 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 11:59:25,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:31,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 11:59:36,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:37,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 11:59:39,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 11:59:41,397 INFO [train.py:1039] (3/4) Epoch 1, batch 1600, loss[loss=0.4181, simple_loss=0.4182, pruned_loss=0.2039, over 24314.00 frames. ], tot_loss[loss=0.4838, simple_loss=0.4485, pruned_loss=0.2682, over 4722965.25 frames. ], batch size: 61, lr: 4.45e-02, grad_scale: 16.0 2023-09-28 11:59:43,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 11:59:44,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 11:59:44,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 11:59:44,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 11:59:44,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 11:59:49,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 11:59:49,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 11:59:51,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 11:59:54,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=9.78 vs. limit=10.333333333333332 2023-09-28 11:59:55,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 11:59:56,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 11:59:58,020 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.152e+02 3.597e+02 5.871e+02 8.452e+02 2.438e+03, threshold=1.174e+03, percent-clipped=11.0 2023-09-28 11:59:58,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 11:59:58,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:02,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:00:05,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:00:08,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 12:00:11,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:00:13,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 12:00:13,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:14,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 12:00:19,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 12:00:19,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=10800.0, ans=0.522 2023-09-28 12:00:19,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=10800.0, ans=0.0 2023-09-28 12:00:28,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:28,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 12:00:28,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:00:30,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:00:30,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:00:31,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=10866.666666666666, ans=0.125 2023-09-28 12:00:35,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:00:40,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:00:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:00:41,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:41,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:43,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:00:45,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:00:45,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:00:48,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:00:50,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=10933.333333333334, ans=0.5173333333333334 2023-09-28 12:00:55,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:00:55,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:00:57,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 12:00:57,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:00:58,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=10933.333333333334, ans=0.125 2023-09-28 12:00:59,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 12:01:04,475 INFO [train.py:1039] (3/4) Epoch 1, batch 1650, loss[loss=0.4539, simple_loss=0.4437, pruned_loss=0.2293, over 23516.00 frames. ], tot_loss[loss=0.4791, simple_loss=0.4462, pruned_loss=0.2627, over 4713358.46 frames. ], batch size: 93, lr: 4.45e-02, grad_scale: 8.0 2023-09-28 12:01:05,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:08,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:08,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:01:08,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 12:01:08,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 12:01:08,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 12:01:09,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 12:01:14,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:01:16,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:16,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:01:16,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:01:18,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:21,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 12:01:22,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:01:22,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:01:22,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:01:24,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:01:24,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 12:01:25,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 12:01:26,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=6.36 vs. limit=8.426666666666666 2023-09-28 12:01:33,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:01:34,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:01:42,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 12:01:43,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:47,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 12:01:48,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:01:50,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:01:50,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:01:52,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:01:53,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:01:55,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:01:59,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:01:59,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:01,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:01,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:03,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:02:06,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:02:06,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 12:02:08,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:02:08,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 12:02:09,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 12:02:09,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 12:02:11,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:02:13,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:02:13,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:13,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:02:13,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 12:02:15,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=11.725 2023-09-28 12:02:18,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:02:20,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:02:20,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:20,819 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:02:21,464 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.87 vs. limit=15.95 2023-09-28 12:02:24,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 12:02:28,778 INFO [train.py:1039] (3/4) Epoch 1, batch 1700, loss[loss=0.3923, simple_loss=0.3884, pruned_loss=0.1954, over 24575.00 frames. ], tot_loss[loss=0.4716, simple_loss=0.4416, pruned_loss=0.2558, over 4700674.05 frames. ], batch size: 60, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:02:29,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:02:29,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:02:29,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 12:02:30,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:30,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:02:30,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:33,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:02:33,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:02:33,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 12:02:37,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:02:39,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=11333.333333333334, ans=0.09899494936611666 2023-09-28 12:02:45,392 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.253e+02 3.835e+02 6.904e+02 1.046e+03 2.238e+03, threshold=1.381e+03, percent-clipped=16.0 2023-09-28 12:02:46,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:02:49,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:02:49,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=11400.0, ans=0.01916666666666667 2023-09-28 12:02:57,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:02:57,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:02:59,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:02:59,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:02,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 12:03:05,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:03:05,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:06,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:03:08,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:03:10,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 12:03:10,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 12:03:12,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:13,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 12:03:15,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:03:17,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=11533.333333333334, ans=0.018611111111111106 2023-09-28 12:03:24,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:24,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:24,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:03:26,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:03:26,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 12:03:27,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:03:29,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:29,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 12:03:31,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:03:31,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:31,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:03:31,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:34,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:03:34,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:03:36,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:36,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:03:36,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:41,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:41,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 12:03:44,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:03:46,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:03:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 12:03:52,712 INFO [train.py:1039] (3/4) Epoch 1, batch 1750, loss[loss=0.4469, simple_loss=0.43, pruned_loss=0.231, over 23276.00 frames. ], tot_loss[loss=0.462, simple_loss=0.4354, pruned_loss=0.2481, over 4686570.24 frames. ], batch size: 105, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:03:56,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:03:58,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:03:59,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:04:01,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 12:04:01,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:04:04,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:04:04,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=11666.666666666666, ans=0.125 2023-09-28 12:04:05,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:08,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 12:04:11,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 12:04:14,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:16,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:04:19,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:04:20,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.29 vs. limit=4.76 2023-09-28 12:04:21,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 12:04:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:04:22,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 12:04:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:04:34,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:04:34,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:35,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=11800.0, ans=0.125 2023-09-28 12:04:39,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:39,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:04:41,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:04:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:04:46,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:04:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 12:04:49,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:04:51,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 12:04:53,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:04:54,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:04:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:05:01,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:05:01,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 12:05:03,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:06,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:05:11,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:13,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:15,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:05:15,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 12:05:15,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:17,320 INFO [train.py:1039] (3/4) Epoch 1, batch 1800, loss[loss=0.4335, simple_loss=0.4139, pruned_loss=0.2262, over 23670.00 frames. ], tot_loss[loss=0.4527, simple_loss=0.43, pruned_loss=0.2402, over 4697978.89 frames. ], batch size: 149, lr: 4.44e-02, grad_scale: 8.0 2023-09-28 12:05:17,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:05:17,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:17,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:05:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:05:17,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:05:20,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:05:21,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=12000.0, ans=0.00826086956521739 2023-09-28 12:05:21,776 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.58 vs. limit=12.0 2023-09-28 12:05:22,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:05:24,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:05:26,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=12000.0, ans=0.00826086956521739 2023-09-28 12:05:27,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:05:30,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:05:32,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:05:32,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=12066.666666666666, ans=0.07 2023-09-28 12:05:33,427 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.239e+02 3.495e+02 5.189e+02 7.461e+02 1.869e+03, threshold=1.038e+03, percent-clipped=4.0 2023-09-28 12:05:35,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:36,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:39,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:41,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:05:42,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:05:42,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 12:05:44,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:45,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=12066.666666666666, ans=0.0 2023-09-28 12:05:48,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:05:52,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 12:05:54,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 12:05:54,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 12:05:55,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:05:55,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:05:55,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:05:57,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:06:05,723 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 12:06:07,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:06:08,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:09,441 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.66 vs. limit=12.075 2023-09-28 12:06:10,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 12:06:10,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 12:06:11,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:06:14,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:06:14,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:06:19,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 12:06:21,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=12200.0, ans=0.178 2023-09-28 12:06:27,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:06:29,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 12:06:29,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:06:29,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:29,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:06:30,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 12:06:34,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:06:34,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:06:36,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=12266.666666666666, ans=0.125 2023-09-28 12:06:37,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 12:06:37,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:06:39,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:40,792 INFO [train.py:1039] (3/4) Epoch 1, batch 1850, loss[loss=0.3989, simple_loss=0.4077, pruned_loss=0.1927, over 24654.00 frames. ], tot_loss[loss=0.4478, simple_loss=0.4279, pruned_loss=0.2356, over 4694659.84 frames. ], batch size: 68, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:06:40,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:06:40,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:42,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:06:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:06:44,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:06:44,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:06:48,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:06:48,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:06:53,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=12333.333333333334, ans=0.125 2023-09-28 12:06:56,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:06:56,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 12:07:00,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 12:07:04,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 12:07:07,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:07,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 12:07:07,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 12:07:17,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:07:19,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 12:07:23,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:07:23,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:29,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 12:07:29,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:29,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:07:30,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=12533.333333333334, ans=0.17466666666666666 2023-09-28 12:07:31,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:07:34,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:07:37,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:07:40,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:07:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:07:40,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:07:40,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:07:43,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:43,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:07:46,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 12:07:46,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:07:51,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:07:53,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:07:53,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 12:07:53,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 12:07:55,393 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 12:07:55,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 12:07:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:07:57,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:07:58,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.07 vs. limit=11.3 2023-09-28 12:07:59,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:07:59,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:00,598 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 12:08:00,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:08:01,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:03,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:08:04,752 INFO [train.py:1039] (3/4) Epoch 1, batch 1900, loss[loss=0.4115, simple_loss=0.42, pruned_loss=0.2, over 24642.00 frames. ], tot_loss[loss=0.4455, simple_loss=0.4274, pruned_loss=0.233, over 4694331.84 frames. ], batch size: 68, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:08:04,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:08:06,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:08:06,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 12:08:08,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:08,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 12:08:08,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:08:09,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:08:16,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:08:18,029 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 12:08:18,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 12:08:18,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=12666.666666666666, ans=0.013888888888888895 2023-09-28 12:08:20,938 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.193e+02 3.536e+02 5.623e+02 9.146e+02 3.125e+03, threshold=1.125e+03, percent-clipped=17.0 2023-09-28 12:08:21,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:08:21,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:08:21,268 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 12:08:22,650 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 12:08:29,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 12:08:29,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=12733.333333333334, ans=0.125 2023-09-28 12:08:31,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:08:36,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 12:08:38,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 12:08:48,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 12:08:50,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=12800.0, ans=0.013333333333333336 2023-09-28 12:08:51,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 12:08:51,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:08:51,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=12800.0, ans=0.172 2023-09-28 12:08:52,834 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 12:08:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 12:08:52,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 12:08:53,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=12866.666666666666, ans=0.4496666666666667 2023-09-28 12:08:54,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 12:08:54,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:08:57,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 12:09:00,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:09:01,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=12866.666666666666, ans=0.0 2023-09-28 12:09:05,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:05,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 12:09:08,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:09:13,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 12:09:13,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:13,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=12933.333333333334, ans=0.012777777777777777 2023-09-28 12:09:17,192 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=9.46 vs. limit=11.466666666666667 2023-09-28 12:09:20,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:09:20,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:09:20,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:09:20,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:09:24,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:09:24,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:09:24,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:09:27,561 INFO [train.py:1039] (3/4) Epoch 1, batch 1950, loss[loss=0.5009, simple_loss=0.4607, pruned_loss=0.2709, over 22691.00 frames. ], tot_loss[loss=0.4421, simple_loss=0.4265, pruned_loss=0.2297, over 4696067.91 frames. ], batch size: 322, lr: 4.43e-02, grad_scale: 8.0 2023-09-28 12:09:27,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:27,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:09:30,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:09:30,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:09:30,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:09:32,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:09:34,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:37,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:09:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:37,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:09:42,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 12:09:42,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:09:42,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:44,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:09:45,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=12.4 2023-09-28 12:09:47,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:09:47,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:09:47,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:50,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:09:53,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:09:53,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:09:53,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:09:53,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:09:55,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=13066.666666666666, ans=0.008028985507246377 2023-09-28 12:09:58,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:00,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:10:00,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:01,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:10:01,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 12:10:03,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:10:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:10:03,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:08,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:11,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:10:14,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:10:17,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:10:19,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:10:19,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 12:10:20,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:22,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=13200.0, ans=0.125 2023-09-28 12:10:25,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:10:25,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:10:26,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:27,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=13200.0, ans=0.438 2023-09-28 12:10:34,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:36,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:10:40,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:42,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:10:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:10:43,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 12:10:43,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:10:45,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:10:47,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 12:10:51,425 INFO [train.py:1039] (3/4) Epoch 1, batch 2000, loss[loss=0.3854, simple_loss=0.3916, pruned_loss=0.1895, over 21389.00 frames. ], tot_loss[loss=0.4379, simple_loss=0.4242, pruned_loss=0.2264, over 4690956.21 frames. ], batch size: 47, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:10:51,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:10:56,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:10:56,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:10:57,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:10:57,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:10:59,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=13333.333333333334, ans=0.16666666666666666 2023-09-28 12:11:00,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:05,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 12:11:05,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:11:05,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=13400.0, ans=0.125 2023-09-28 12:11:06,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.094e+02 3.925e+02 5.056e+02 7.202e+02 2.152e+03, threshold=1.011e+03, percent-clipped=10.0 2023-09-28 12:11:08,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:11:10,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 12:11:12,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:11:12,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:11:15,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:11:15,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 12:11:16,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:19,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:19,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:20,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 12:11:20,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:11:22,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 12:11:22,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:25,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=13466.666666666666, ans=0.09899494936611666 2023-09-28 12:11:26,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=13466.666666666666, ans=0.125 2023-09-28 12:11:28,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:11:29,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:11:29,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:31,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:31,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=13466.666666666666, ans=0.16533333333333333 2023-09-28 12:11:32,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 12:11:35,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 12:11:35,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:11:35,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:11:40,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:42,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:11:42,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:44,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:11:46,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:11:46,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:47,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:11:47,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:11:49,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:11:49,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=13533.333333333334, ans=0.125 2023-09-28 12:11:52,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:11:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 12:12:01,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:12:03,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:05,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:12:09,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:11,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:11,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:12,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:12:12,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:12:14,462 INFO [train.py:1039] (3/4) Epoch 1, batch 2050, loss[loss=0.4249, simple_loss=0.4151, pruned_loss=0.2173, over 23334.00 frames. ], tot_loss[loss=0.4327, simple_loss=0.4218, pruned_loss=0.2223, over 4685503.54 frames. ], batch size: 119, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:12:14,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:14,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=13666.666666666666, ans=0.16333333333333333 2023-09-28 12:12:16,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:19,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:12:19,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:23,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=13666.666666666666, ans=0.00972222222222223 2023-09-28 12:12:24,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:12:26,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:12:26,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:12:27,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:12:30,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=13733.333333333334, ans=0.007884057971014493 2023-09-28 12:12:31,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 12:12:31,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:12:32,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.97 vs. limit=17.8 2023-09-28 12:12:34,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:12:34,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:12:42,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:42,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:43,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 12:12:46,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:12:47,233 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:12:49,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 12:12:50,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:12:53,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:55,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:12:55,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:12:56,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:12:56,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:12:58,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:12:59,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:13:03,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:05,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:13:07,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:13:09,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:13:12,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:17,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=13866.666666666666, ans=0.125 2023-09-28 12:13:20,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:13:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 12:13:25,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=13933.333333333334, ans=0.07 2023-09-28 12:13:26,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:27,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=13933.333333333334, ans=0.125 2023-09-28 12:13:28,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:13:29,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:13:32,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 12:13:35,616 INFO [train.py:1039] (3/4) Epoch 1, batch 2100, loss[loss=0.3479, simple_loss=0.3838, pruned_loss=0.156, over 24494.00 frames. ], tot_loss[loss=0.4252, simple_loss=0.4176, pruned_loss=0.2168, over 4689232.06 frames. ], batch size: 66, lr: 4.42e-02, grad_scale: 16.0 2023-09-28 12:13:38,004 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 12:13:38,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:13:38,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:13:39,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:13:39,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 12:13:41,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 12:13:43,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:13:47,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:13:47,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:13:49,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:13:51,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:13:51,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 12:13:52,522 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.170e+02 3.843e+02 5.173e+02 8.078e+02 2.053e+03, threshold=1.035e+03, percent-clipped=17.0 2023-09-28 12:13:52,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:13:52,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 12:13:52,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 12:13:54,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:13:54,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:13:54,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 12:13:56,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:14:01,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 12:14:01,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:14:04,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:14:04,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:14:08,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:14:08,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 12:14:10,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:10,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 12:14:12,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 12:14:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:13,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 12:14:13,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 12:14:13,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 12:14:16,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:14:19,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:14:21,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:23,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:14:24,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:27,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:27,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 12:14:27,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:27,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:14:29,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:29,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 12:14:29,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=14200.0, ans=0.125 2023-09-28 12:14:30,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.57 vs. limit=18.15 2023-09-28 12:14:31,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 12:14:32,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 12:14:37,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:14:39,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:14:39,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 12:14:46,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:48,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:14:49,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:14:49,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:14:49,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 12:14:51,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:14:52,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:14:52,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:14:54,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:14:54,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:14:56,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 12:14:58,997 INFO [train.py:1039] (3/4) Epoch 1, batch 2150, loss[loss=0.4148, simple_loss=0.4136, pruned_loss=0.208, over 23360.00 frames. ], tot_loss[loss=0.4167, simple_loss=0.413, pruned_loss=0.2105, over 4694579.55 frames. ], batch size: 105, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:14:59,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 12:14:59,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:02,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:02,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:15:02,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:15:03,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:15:10,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 12:15:10,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:11,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:13,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:15:13,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:15,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:15:20,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:20,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:15:20,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:15:22,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=14400.0, ans=0.125 2023-09-28 12:15:27,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:27,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 12:15:31,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:32,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:15:34,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:34,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:35,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:35,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:15:37,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:15:37,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:15:37,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:15:39,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 12:15:40,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:15:40,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:42,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:43,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:15:44,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:15:47,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:15:47,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:15:49,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:15:49,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 12:15:49,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:15:49,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=14533.333333333334, ans=0.006111111111111109 2023-09-28 12:15:53,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:53,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:54,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=14533.333333333334, ans=0.007710144927536232 2023-09-28 12:15:56,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:15:56,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:15:56,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:15:59,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:15:59,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 12:16:00,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=14533.333333333334, ans=0.15466666666666667 2023-09-28 12:16:01,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 12:16:01,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:16:01,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 12:16:01,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=14533.333333333334, ans=0.125 2023-09-28 12:16:02,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:04,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:16:04,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 12:16:04,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:16:05,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 12:16:05,916 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 12:16:05,916 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 12:16:05,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 12:16:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:10,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:16:10,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:16:10,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:12,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:16:13,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:13,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:14,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=14600.0, ans=0.005833333333333336 2023-09-28 12:16:20,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:16:20,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 12:16:22,792 INFO [train.py:1039] (3/4) Epoch 1, batch 2200, loss[loss=0.3898, simple_loss=0.3982, pruned_loss=0.1907, over 23156.00 frames. ], tot_loss[loss=0.4125, simple_loss=0.4112, pruned_loss=0.2071, over 4704440.42 frames. ], batch size: 105, lr: 4.41e-02, grad_scale: 16.0 2023-09-28 12:16:24,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:16:31,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:33,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:16:33,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:16:33,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:16:36,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:16:37,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:16:37,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 12:16:39,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 4.143e+02 6.351e+02 9.037e+02 1.826e+03, threshold=1.270e+03, percent-clipped=17.0 2023-09-28 12:16:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 12:16:41,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.85 vs. limit=8.683333333333334 2023-09-28 12:16:44,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:16:50,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 12:16:51,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=14733.333333333334, ans=0.15266666666666667 2023-09-28 12:16:52,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:16:54,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:16:55,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:16:56,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=14800.0, ans=0.15200000000000002 2023-09-28 12:16:58,429 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.59 vs. limit=18.6 2023-09-28 12:17:00,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:17:00,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 12:17:05,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.53 vs. limit=13.05 2023-09-28 12:17:06,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:17:08,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:08,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 12:17:08,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=14800.0, ans=0.125 2023-09-28 12:17:11,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:17:13,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:16,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:17:17,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:20,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 12:17:22,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:23,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 12:17:25,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:25,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:17:25,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:17:27,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:17:28,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:17:28,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:28,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:17:30,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:17:32,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:17:33,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:17:35,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 12:17:37,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:17:39,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:17:41,095 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 12:17:43,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=14933.333333333334, ans=0.125 2023-09-28 12:17:44,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:17:44,953 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 12:17:46,349 INFO [train.py:1039] (3/4) Epoch 1, batch 2250, loss[loss=0.4042, simple_loss=0.3979, pruned_loss=0.2052, over 23204.00 frames. ], tot_loss[loss=0.4071, simple_loss=0.4088, pruned_loss=0.2029, over 4720347.49 frames. ], batch size: 119, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:17:46,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:17:46,545 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 12:17:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:48,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:17:49,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:17:51,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 12:17:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:17:54,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:00,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:18:03,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:18:05,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:07,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:07,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:18:10,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=15066.666666666666, ans=0.125 2023-09-28 12:18:11,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 12:18:11,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:11,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:18:14,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 12:18:14,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:18:15,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:18,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:18:23,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:24,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:18:26,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:18:26,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 12:18:27,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:18:31,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:18:32,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:34,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:18:34,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=15200.0, ans=0.125 2023-09-28 12:18:36,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:18:36,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:18:39,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:18:39,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:18:43,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=15200.0, ans=0.007565217391304348 2023-09-28 12:18:45,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:18:46,853 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.18 vs. limit=18.9 2023-09-28 12:18:47,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:18:52,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:18:52,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:18:53,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:18:56,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=15266.666666666666, ans=0.0030555555555555544 2023-09-28 12:18:59,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:19:00,693 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.64 vs. limit=10.106666666666666 2023-09-28 12:19:02,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:19:02,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 12:19:02,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:19:07,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 12:19:09,233 INFO [train.py:1039] (3/4) Epoch 1, batch 2300, loss[loss=0.4074, simple_loss=0.404, pruned_loss=0.2054, over 23388.00 frames. ], tot_loss[loss=0.4063, simple_loss=0.4087, pruned_loss=0.2021, over 4715818.26 frames. ], batch size: 119, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:19:10,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:19:10,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:19:16,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:19:20,870 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 12:19:24,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:27,602 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.211e+02 3.558e+02 5.040e+02 6.600e+02 1.327e+03, threshold=1.008e+03, percent-clipped=3.0 2023-09-28 12:19:30,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:19:30,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:19:32,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:19:32,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:32,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 12:19:35,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:19:37,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:19:37,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:19:41,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:19:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:19:43,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=15466.666666666666, ans=0.125 2023-09-28 12:19:48,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:19:53,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:19:53,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:19:57,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:20:00,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:02,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:20:03,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:20:03,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:20:03,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 12:20:04,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=15533.333333333334, ans=0.14466666666666667 2023-09-28 12:20:07,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:20:07,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:08,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:08,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:20:08,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:10,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:20:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:20:10,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 12:20:10,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:20:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:11,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 12:20:17,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:20:21,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:20:26,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:20:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:20:29,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:20:32,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:20:32,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:20:33,941 INFO [train.py:1039] (3/4) Epoch 1, batch 2350, loss[loss=0.4096, simple_loss=0.4268, pruned_loss=0.1962, over 24363.00 frames. ], tot_loss[loss=0.4032, simple_loss=0.4073, pruned_loss=0.1996, over 4716832.34 frames. ], batch size: 77, lr: 4.40e-02, grad_scale: 16.0 2023-09-28 12:20:34,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:20:34,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 12:20:39,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:20:39,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 12:20:45,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 12:20:49,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:20:54,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:54,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:20:55,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:20:56,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:20:56,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 12:20:58,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:21:01,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 12:21:06,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:21:09,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:21:09,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:21:12,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:21:12,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 12:21:12,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:21:12,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=15800.0, ans=0.05385000000000001 2023-09-28 12:21:15,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.70 vs. limit=19.35 2023-09-28 12:21:15,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:21:17,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:17,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:21:21,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:21:23,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 12:21:23,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:21:26,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:21:26,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:21:27,617 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.84 vs. limit=13.45 2023-09-28 12:21:28,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 12:21:30,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:21:33,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 12:21:33,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:21:37,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 12:21:37,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=15866.666666666666, ans=0.125 2023-09-28 12:21:41,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 12:21:41,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:21:41,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 12:21:43,259 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 12:21:43,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 12:21:44,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 12:21:47,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:21:48,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.92 vs. limit=19.45 2023-09-28 12:21:53,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:21:55,463 INFO [train.py:1039] (3/4) Epoch 1, batch 2400, loss[loss=0.4077, simple_loss=0.4114, pruned_loss=0.202, over 23288.00 frames. ], tot_loss[loss=0.4011, simple_loss=0.4062, pruned_loss=0.1981, over 4719518.63 frames. ], batch size: 105, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:21:59,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:21:59,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:22:01,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 12:22:01,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 12:22:09,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:22:09,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:13,387 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.157e+02 3.788e+02 5.121e+02 7.907e+02 1.984e+03, threshold=1.024e+03, percent-clipped=10.0 2023-09-28 12:22:13,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 12:22:13,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:22:15,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:15,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 12:22:21,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:24,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 12:22:27,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:22:31,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=16133.333333333334, ans=0.05 2023-09-28 12:22:32,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 12:22:37,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:22:38,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:22:40,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=16133.333333333334, ans=0.125 2023-09-28 12:22:43,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:22:45,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 12:22:45,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:22:52,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:54,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:22:54,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=16200.0, ans=10.0 2023-09-28 12:22:56,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:22:57,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:22:57,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 12:22:57,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:22:57,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:22:57,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:22:57,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:23:02,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:03,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:23:03,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 12:23:05,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 12:23:07,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:23:07,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:23:08,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 12:23:09,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 12:23:09,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 12:23:09,065 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 12:23:12,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 12:23:12,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:23:14,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:14,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:15,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 12:23:17,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:17,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:23:18,625 INFO [train.py:1039] (3/4) Epoch 1, batch 2450, loss[loss=0.3876, simple_loss=0.3778, pruned_loss=0.1987, over 22737.00 frames. ], tot_loss[loss=0.395, simple_loss=0.4023, pruned_loss=0.1939, over 4719604.95 frames. ], batch size: 322, lr: 4.39e-02, grad_scale: 32.0 2023-09-28 12:23:21,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:23:21,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:23:25,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:25,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:23:27,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 12:23:31,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:23:32,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=16333.333333333334, ans=0.32833333333333337 2023-09-28 12:23:33,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:35,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:23:36,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:23:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:23:36,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 12:23:42,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:23:44,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:23:44,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:23:46,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=16400.0, ans=10.0 2023-09-28 12:23:49,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:23:51,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:51,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:23:52,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:23:54,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 12:23:55,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=16466.666666666668, ans=0.007289855072463768 2023-09-28 12:23:57,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:24:03,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=16466.666666666668, ans=0.0 2023-09-28 12:24:04,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:05,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=16466.666666666668, ans=0.1353333333333333 2023-09-28 12:24:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:24:06,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:07,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:24:07,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:09,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:24:09,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 12:24:09,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=16533.333333333332, ans=0.0072753623188405794 2023-09-28 12:24:12,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:24:14,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:24:18,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:24:18,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:24:19,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=16533.333333333332, ans=0.13466666666666668 2023-09-28 12:24:22,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:24:22,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=16600.0, ans=0.0 2023-09-28 12:24:24,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 12:24:24,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:24:26,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:24:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 12:24:26,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:24:27,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:24:32,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:24:34,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:24:35,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:24:38,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.44 vs. limit=9.15 2023-09-28 12:24:39,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 12:24:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:24:42,481 INFO [train.py:1039] (3/4) Epoch 1, batch 2500, loss[loss=0.3947, simple_loss=0.4135, pruned_loss=0.188, over 24513.00 frames. ], tot_loss[loss=0.3916, simple_loss=0.3997, pruned_loss=0.1918, over 4712906.74 frames. ], batch size: 66, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:24:46,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=16666.666666666668, ans=0.125 2023-09-28 12:24:47,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:24:57,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:24:58,651 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.092e+02 3.311e+02 4.772e+02 6.840e+02 1.468e+03, threshold=9.543e+02, percent-clipped=7.0 2023-09-28 12:24:58,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:25:00,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:25:00,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 12:25:08,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:25:08,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:09,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:25:09,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:25:10,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 12:25:12,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:14,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 12:25:14,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:14,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 12:25:16,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:19,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=16800.0, ans=0.0072173913043478265 2023-09-28 12:25:20,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:25:22,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:25:24,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:25:24,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 12:25:26,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:25:26,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=16800.0, ans=0.0 2023-09-28 12:25:29,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:32,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:37,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:25:40,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:45,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:25:46,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.88 vs. limit=5.529999999999999 2023-09-28 12:25:46,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 12:25:47,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=16866.666666666668, ans=0.0 2023-09-28 12:25:47,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=5.529999999999999 2023-09-28 12:25:48,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:25:48,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:25:50,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:25:50,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:25:50,217 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 12:25:50,218 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 12:25:50,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 12:25:52,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.42 vs. limit=13.85 2023-09-28 12:25:54,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:25:56,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 12:25:56,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 12:25:56,809 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.23 vs. limit=20.2 2023-09-28 12:25:57,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:25:59,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 12:26:02,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 12:26:06,329 INFO [train.py:1039] (3/4) Epoch 1, batch 2550, loss[loss=0.3148, simple_loss=0.3517, pruned_loss=0.139, over 24334.00 frames. ], tot_loss[loss=0.3885, simple_loss=0.3983, pruned_loss=0.1894, over 4708078.17 frames. ], batch size: 56, lr: 4.38e-02, grad_scale: 32.0 2023-09-28 12:26:06,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:06,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:26:08,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:26:09,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:26:11,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 12:26:12,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:26:16,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 12:26:18,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:26:18,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=17000.0, ans=0.125 2023-09-28 12:26:20,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:21,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:26:21,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 12:26:23,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:23,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:23,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:27,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:26:27,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 12:26:27,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 12:26:27,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:27,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 12:26:30,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=17066.666666666668, ans=0.0 2023-09-28 12:26:41,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:26:47,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:26:47,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:26:47,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:26:49,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:26:55,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:26:58,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:26:58,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:26:58,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:26:59,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:26:59,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:27:02,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:09,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:27:09,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 12:27:09,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:27:09,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:27:09,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=17200.0, ans=0.125 2023-09-28 12:27:11,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:27:13,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:27:15,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:21,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:27:23,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:27:26,866 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 12:27:27,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=17266.666666666668, ans=0.007115942028985507 2023-09-28 12:27:30,292 INFO [train.py:1039] (3/4) Epoch 1, batch 2600, loss[loss=0.3717, simple_loss=0.381, pruned_loss=0.1812, over 23771.00 frames. ], tot_loss[loss=0.3876, simple_loss=0.3976, pruned_loss=0.1888, over 4709343.65 frames. ], batch size: 179, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:27:31,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 12:27:31,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:27:31,949 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 12:27:33,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 12:27:33,482 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 12:27:36,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:27:36,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 12:27:38,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 12:27:39,554 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 12:27:41,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:27:44,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 12:27:45,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 12:27:47,477 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.295e+02 3.359e+02 4.665e+02 7.266e+02 2.532e+03, threshold=9.331e+02, percent-clipped=13.0 2023-09-28 12:27:47,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:27:47,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 12:27:50,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.71 vs. limit=14.025 2023-09-28 12:27:51,301 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 12:27:51,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 12:28:01,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:01,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:01,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:01,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 12:28:03,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:28:09,700 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 12:28:14,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:15,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 12:28:17,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:17,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:28:17,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 12:28:19,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=17533.333333333332, ans=0.28633333333333344 2023-09-28 12:28:21,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.13 vs. limit=20.65 2023-09-28 12:28:21,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:28:21,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:28:25,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:29,266 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 12:28:29,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:28:30,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:28:36,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:28:36,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:28:36,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 12:28:38,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:28:39,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:28:41,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:28:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 12:28:47,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:47,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:28:52,360 INFO [train.py:1039] (3/4) Epoch 1, batch 2650, loss[loss=0.369, simple_loss=0.4015, pruned_loss=0.1682, over 24496.00 frames. ], tot_loss[loss=0.3839, simple_loss=0.3966, pruned_loss=0.1857, over 4719034.76 frames. ], batch size: 66, lr: 4.37e-02, grad_scale: 16.0 2023-09-28 12:28:53,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 12:28:53,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:28:54,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:28:54,141 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 12:28:54,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:28:57,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:00,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=17666.666666666668, ans=0.125 2023-09-28 12:29:01,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:29:01,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:29:05,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:29:06,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 12:29:06,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:29:06,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:29:09,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 12:29:11,453 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 12:29:14,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:16,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 12:29:16,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:16,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 12:29:20,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:20,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:29:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:22,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:29,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 12:29:29,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 12:29:34,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:29:37,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 12:29:37,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:29:39,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:39,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:29:41,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:41,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:29:43,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:29:43,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=17866.666666666668, ans=0.125 2023-09-28 12:29:46,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:47,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:29:47,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:29:50,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:29:52,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:53,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:29:53,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:29:54,222 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=14.2 2023-09-28 12:29:55,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:29:55,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:29:57,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=17933.333333333332, ans=0.006971014492753624 2023-09-28 12:29:59,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.50 vs. limit=20.95 2023-09-28 12:29:59,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:29:59,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:29:59,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:01,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 12:30:01,835 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:30:03,859 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.05 vs. limit=13.966666666666665 2023-09-28 12:30:04,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:04,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:06,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:08,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:10,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:30:10,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:13,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:13,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 12:30:15,474 INFO [train.py:1039] (3/4) Epoch 1, batch 2700, loss[loss=0.363, simple_loss=0.3679, pruned_loss=0.179, over 23690.00 frames. ], tot_loss[loss=0.3819, simple_loss=0.3957, pruned_loss=0.184, over 4724870.77 frames. ], batch size: 149, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:30:16,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=18000.0, ans=0.07 2023-09-28 12:30:17,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:30:19,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:30:19,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=18000.0, ans=0.125 2023-09-28 12:30:20,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.81 vs. limit=21.0 2023-09-28 12:30:22,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:30:22,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:22,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:30:23,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:30:23,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:30:23,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:30:25,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 12:30:25,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 12:30:25,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:30:25,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=18000.0, ans=0.125 2023-09-28 12:30:27,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:30:28,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:30:30,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:30:32,942 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.245e+02 3.486e+02 4.470e+02 6.707e+02 1.380e+03, threshold=8.939e+02, percent-clipped=9.0 2023-09-28 12:30:33,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:30:36,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 12:30:36,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:30:41,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:30:41,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:30:48,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:30:48,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:30:48,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:30:49,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:30:52,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:30:55,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:30:55,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:30:55,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:30:56,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=18133.333333333332, ans=0.125 2023-09-28 12:31:00,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:00,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:31:10,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:31:10,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:31:16,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:31:16,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:17,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=18200.0, ans=0.0 2023-09-28 12:31:22,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:22,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:24,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:31:24,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:26,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:31:26,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:31:27,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:31:31,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:31,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:31:33,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=18266.666666666668, ans=0.125 2023-09-28 12:31:34,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 12:31:35,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:37,359 INFO [train.py:1039] (3/4) Epoch 1, batch 2750, loss[loss=0.3554, simple_loss=0.3649, pruned_loss=0.1729, over 23730.00 frames. ], tot_loss[loss=0.3787, simple_loss=0.3937, pruned_loss=0.1818, over 4730392.79 frames. ], batch size: 164, lr: 4.36e-02, grad_scale: 16.0 2023-09-28 12:31:37,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:31:37,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 12:31:39,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 12:31:39,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:42,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=18333.333333333332, ans=0.125 2023-09-28 12:31:43,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:31:43,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:31:45,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:45,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:31:45,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:31:50,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:31:50,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:31:50,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:31:50,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 12:31:50,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:31:52,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:31:59,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 12:32:02,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:32:02,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:03,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:32:05,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:32:07,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:32:07,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:07,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:12,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:32:12,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:32:12,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:32:13,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:15,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:32:21,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:32:24,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:32:25,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:30,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:32:30,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:32:30,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:32:31,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=18533.333333333332, ans=0.006840579710144928 2023-09-28 12:32:37,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:32:37,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:32:37,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 12:32:43,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:32:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 12:32:50,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:32:53,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:32:53,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 12:32:54,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:32:56,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:32:57,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 12:32:57,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:32:59,459 INFO [train.py:1039] (3/4) Epoch 1, batch 2800, loss[loss=0.3594, simple_loss=0.3683, pruned_loss=0.1752, over 23587.00 frames. ], tot_loss[loss=0.3743, simple_loss=0.3909, pruned_loss=0.1789, over 4738283.45 frames. ], batch size: 149, lr: 4.36e-02, grad_scale: 32.0 2023-09-28 12:33:01,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:33:01,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:02,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:04,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 12:33:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:04,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:05,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:05,810 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 12:33:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 12:33:10,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:11,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:33:11,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:33:17,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:33:17,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=18733.333333333332, ans=0.125 2023-09-28 12:33:18,625 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 3.169e+02 4.499e+02 7.440e+02 2.031e+03, threshold=8.997e+02, percent-clipped=14.0 2023-09-28 12:33:18,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 12:33:20,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:33:23,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 12:33:24,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:24,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:33:24,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:28,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=18733.333333333332, ans=0.2443333333333334 2023-09-28 12:33:29,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:29,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=18733.333333333332, ans=0.2443333333333334 2023-09-28 12:33:31,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:33:31,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:33:31,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:33:39,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.48 vs. limit=14.4 2023-09-28 12:33:39,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:33:41,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:33:44,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:33:44,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:33:46,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:33:51,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:33:51,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 12:33:52,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:52,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:33:52,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:33:58,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:33:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:03,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:34:04,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:34:04,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:04,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:34:05,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:34:07,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:34:09,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:34:09,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 12:34:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:10,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:34:10,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:12,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 12:34:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:14,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:34:14,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:34:16,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 12:34:22,204 INFO [train.py:1039] (3/4) Epoch 1, batch 2850, loss[loss=0.3282, simple_loss=0.3572, pruned_loss=0.1496, over 24449.00 frames. ], tot_loss[loss=0.3702, simple_loss=0.3883, pruned_loss=0.176, over 4743536.72 frames. ], batch size: 58, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:34:22,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:34:22,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:34:22,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:34:24,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:29,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:34:29,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:34:29,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:34:29,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=19000.0, ans=0.235 2023-09-28 12:34:31,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.05 vs. limit=14.625 2023-09-28 12:34:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:34:33,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:34:34,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:34:35,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 12:34:41,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 12:34:41,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:34:43,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 12:34:43,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:34:46,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 12:34:47,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 12:34:49,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:03,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:03,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:35:05,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 12:35:05,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:35:06,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:35:08,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:35:09,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 12:35:12,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:35:14,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:14,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:35:16,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:18,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:18,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:35:20,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:22,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:35:25,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:35:25,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:25,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:26,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:35:31,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=19266.666666666668, ans=0.125 2023-09-28 12:35:33,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:35:35,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 12:35:35,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 12:35:36,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:35:36,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:38,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 12:35:38,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:35:39,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:35:39,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:41,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:35:41,388 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 12:35:41,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 12:35:41,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:35:41,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:35:44,385 INFO [train.py:1039] (3/4) Epoch 1, batch 2900, loss[loss=0.3581, simple_loss=0.3706, pruned_loss=0.1728, over 23743.00 frames. ], tot_loss[loss=0.3705, simple_loss=0.3885, pruned_loss=0.1762, over 4746878.71 frames. ], batch size: 212, lr: 4.35e-02, grad_scale: 32.0 2023-09-28 12:35:46,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:35:48,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:35:48,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:35:50,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 12:35:53,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:35:55,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 12:35:55,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 12:35:56,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=4.58 vs. limit=14.75 2023-09-28 12:35:57,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:35:57,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:36:00,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:00,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:36:02,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=19400.0, ans=0.10600000000000001 2023-09-28 12:36:03,250 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.312e+02 3.297e+02 4.561e+02 6.852e+02 1.887e+03, threshold=9.123e+02, percent-clipped=12.0 2023-09-28 12:36:04,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:36:06,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:36:07,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:36:07,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=19400.0, ans=0.22099999999999997 2023-09-28 12:36:08,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 12:36:08,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:36:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:12,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=19400.0, ans=0.10600000000000001 2023-09-28 12:36:14,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 12:36:16,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 12:36:19,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:36:19,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 12:36:19,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:36:22,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:36:22,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 12:36:24,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=19466.666666666668, ans=0.035 2023-09-28 12:36:26,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:36:26,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:26,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=19466.666666666668, ans=0.21866666666666668 2023-09-28 12:36:28,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=19466.666666666668, ans=0.125 2023-09-28 12:36:31,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:36:33,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:36:33,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 12:36:34,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 12:36:34,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:36:39,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:36:43,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 12:36:44,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:36:49,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:36:56,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=19600.0, ans=0.125 2023-09-28 12:36:59,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:36:59,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:37:03,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 12:37:04,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:04,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 12:37:04,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:06,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:37:08,496 INFO [train.py:1039] (3/4) Epoch 1, batch 2950, loss[loss=0.3444, simple_loss=0.3755, pruned_loss=0.1566, over 24463.00 frames. ], tot_loss[loss=0.3697, simple_loss=0.3883, pruned_loss=0.1756, over 4746716.61 frames. ], batch size: 63, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:37:13,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:37:13,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.75 vs. limit=14.875 2023-09-28 12:37:14,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 12:37:15,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=11.866666666666667 2023-09-28 12:37:16,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:16,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:18,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:37:19,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:37:21,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 12:37:22,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 12:37:24,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:37:24,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:37:29,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:31,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:34,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:37:34,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:36,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=19733.333333333332, ans=0.20933333333333337 2023-09-28 12:37:38,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:37:38,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:37:39,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:39,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=19800.0, ans=0.0 2023-09-28 12:37:41,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:37:41,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:37:44,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 12:37:49,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 12:37:49,271 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 12:37:50,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:37:52,219 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 12:37:53,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 12:37:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:37:55,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:37:55,691 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 12:37:55,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:37:58,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 12:37:58,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:37:58,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:38:03,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:38:04,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:04,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 12:38:04,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:38:04,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 12:38:12,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:13,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:15,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 12:38:15,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:38:17,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 12:38:19,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:21,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:38:22,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:38:23,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:38:23,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:38:25,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:38:26,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:38:26,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:38:28,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:38:30,186 INFO [train.py:1039] (3/4) Epoch 1, batch 3000, loss[loss=0.3735, simple_loss=0.3851, pruned_loss=0.181, over 23203.00 frames. ], tot_loss[loss=0.368, simple_loss=0.3877, pruned_loss=0.1741, over 4748338.96 frames. ], batch size: 119, lr: 4.34e-02, grad_scale: 32.0 2023-09-28 12:38:30,187 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 12:38:44,452 INFO [train.py:1071] (3/4) Epoch 1, validation: loss=0.4132, simple_loss=0.3632, pruned_loss=0.2317, over 1125622.00 frames. 2023-09-28 12:38:44,453 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 12:38:44,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:38:44,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:44,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 12:38:48,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:38:49,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:38:51,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:38:51,909 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.94 vs. limit=10.0 2023-09-28 12:38:54,495 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 12:38:55,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 12:38:57,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:38:59,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:38:59,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 12:38:59,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:01,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=20066.666666666668, ans=0.006507246376811594 2023-09-28 12:39:02,621 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.319e+02 3.597e+02 4.607e+02 6.753e+02 1.897e+03, threshold=9.214e+02, percent-clipped=10.0 2023-09-28 12:39:07,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:39:07,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=20066.666666666668, ans=0.125 2023-09-28 12:39:12,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=20066.666666666668, ans=0.125 2023-09-28 12:39:15,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:39:23,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 12:39:23,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:39:27,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:39:27,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:39:28,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:39:31,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:31,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 12:39:33,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 12:39:35,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:39:35,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:39:37,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:39:38,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:39,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:39,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:39:43,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:39:43,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:39:43,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:39:43,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=20200.0, ans=0.2 2023-09-28 12:39:46,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:39:46,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 12:39:47,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:39:49,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:39:49,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:39:53,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:55,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:39:56,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 12:39:57,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 12:39:57,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:39:57,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 12:39:59,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:39:59,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 12:40:02,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:03,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:40:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 12:40:05,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 12:40:05,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:40:07,411 INFO [train.py:1039] (3/4) Epoch 1, batch 3050, loss[loss=0.3161, simple_loss=0.3566, pruned_loss=0.1378, over 24350.00 frames. ], tot_loss[loss=0.3684, simple_loss=0.3886, pruned_loss=0.1741, over 4734629.61 frames. ], batch size: 61, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:40:07,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:40:09,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:40:09,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:40:09,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:10,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:40:12,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 12:40:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:15,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:15,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:40:20,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:22,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 12:40:31,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 12:40:31,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 12:40:32,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:37,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:40:39,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:39,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:41,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:43,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=20466.666666666668, ans=0.1 2023-09-28 12:40:44,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:40:45,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:40:45,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:46,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:40:46,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:40:47,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:49,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:40:50,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:40:51,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=20466.666666666668, ans=0.125 2023-09-28 12:40:52,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 12:40:52,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:40:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:40:57,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:40:57,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 12:40:59,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:40:59,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:05,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:41:05,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:12,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:14,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:41:14,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:41:16,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:17,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:41:17,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:41:19,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 12:41:19,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:41:19,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:21,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 12:41:24,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:27,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=20600.0, ans=0.5 2023-09-28 12:41:28,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:41:30,060 INFO [train.py:1039] (3/4) Epoch 1, batch 3100, loss[loss=0.3178, simple_loss=0.369, pruned_loss=0.1334, over 24669.00 frames. ], tot_loss[loss=0.3682, simple_loss=0.3888, pruned_loss=0.1739, over 4729883.42 frames. ], batch size: 73, lr: 4.33e-02, grad_scale: 32.0 2023-09-28 12:41:32,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:41:35,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 12:41:37,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 12:41:40,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 12:41:40,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 12:41:42,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:41:47,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:41:47,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:48,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.389e+02 3.317e+02 4.517e+02 6.154e+02 1.203e+03, threshold=9.035e+02, percent-clipped=5.0 2023-09-28 12:41:50,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:41:54,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:41:58,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 12:41:59,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.94 vs. limit=22.5 2023-09-28 12:42:03,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:42:04,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:04,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:04,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:04,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:42:07,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:42:07,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 12:42:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:42:08,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 12:42:10,866 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.74 vs. limit=22.5 2023-09-28 12:42:12,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:42:13,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=20800.0, ans=0.2 2023-09-28 12:42:15,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:42:17,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 12:42:17,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 12:42:18,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:20,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:42:23,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:24,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:24,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:42:25,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:42:25,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:42:26,521 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.92 vs. limit=22.5 2023-09-28 12:42:28,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:42:28,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:42:28,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:28,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 12:42:28,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=20866.666666666668, ans=0.006333333333333333 2023-09-28 12:42:30,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=20866.666666666668, ans=0.125 2023-09-28 12:42:31,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:42:33,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 12:42:35,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=20933.333333333332, ans=0.0 2023-09-28 12:42:36,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:42:36,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 12:42:36,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:42:37,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:38,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 12:42:51,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 12:42:53,225 INFO [train.py:1039] (3/4) Epoch 1, batch 3150, loss[loss=0.3744, simple_loss=0.3798, pruned_loss=0.1846, over 23812.00 frames. ], tot_loss[loss=0.364, simple_loss=0.3849, pruned_loss=0.1716, over 4713107.77 frames. ], batch size: 212, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:42:54,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:42:56,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:42:57,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:42:57,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:42:57,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 12:42:57,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=21000.0, ans=0.125 2023-09-28 12:43:00,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:00,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 12:43:01,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 12:43:03,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=21000.0, ans=0.0 2023-09-28 12:43:04,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:06,230 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 12:43:09,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 12:43:10,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:43:12,189 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 12:43:13,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 12:43:13,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 12:43:15,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 12:43:15,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 12:43:15,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:15,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:16,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:43:17,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 12:43:21,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:21,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:43:22,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=21066.666666666668, ans=0.125 2023-09-28 12:43:23,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:24,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 12:43:28,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 12:43:28,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:43:31,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:43:31,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:43:33,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 12:43:34,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 12:43:36,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:43:36,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:43:36,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:43:37,102 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.58 vs. limit=15.0 2023-09-28 12:43:37,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:37,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:43:39,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:43:39,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 12:43:39,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 12:43:40,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 12:43:40,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:42,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:43:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:43:42,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 12:43:44,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:44,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=21200.0, ans=0.1 2023-09-28 12:43:45,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 12:43:45,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:43:47,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 12:43:50,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 12:43:53,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:43:53,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:43:54,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 12:43:56,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 12:43:56,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:43:58,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:44:01,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:01,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:44:04,009 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.36 vs. limit=22.5 2023-09-28 12:44:06,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:44:06,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:09,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 12:44:09,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=21266.666666666668, ans=0.125 2023-09-28 12:44:16,259 INFO [train.py:1039] (3/4) Epoch 1, batch 3200, loss[loss=0.3415, simple_loss=0.3795, pruned_loss=0.1518, over 24329.00 frames. ], tot_loss[loss=0.3623, simple_loss=0.3844, pruned_loss=0.1701, over 4725766.24 frames. ], batch size: 61, lr: 4.32e-02, grad_scale: 32.0 2023-09-28 12:44:16,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:44:16,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 12:44:20,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:23,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:44:23,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 12:44:26,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:44:30,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:44:32,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=21400.0, ans=0.0 2023-09-28 12:44:33,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:44:35,028 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.369e+02 3.557e+02 4.560e+02 5.822e+02 1.709e+03, threshold=9.121e+02, percent-clipped=8.0 2023-09-28 12:44:35,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=21400.0, ans=0.125 2023-09-28 12:44:39,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=21400.0, ans=0.006217391304347826 2023-09-28 12:44:42,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:44:50,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 12:44:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:44:55,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 12:44:57,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:45:00,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:45:00,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:45:02,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:45:04,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=21466.666666666668, ans=15.0 2023-09-28 12:45:05,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 12:45:08,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 12:45:08,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=21533.333333333332, ans=0.125 2023-09-28 12:45:10,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 12:45:15,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 12:45:16,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:45:21,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=21600.0, ans=0.035 2023-09-28 12:45:21,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=21600.0, ans=0.0061739130434782605 2023-09-28 12:45:21,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=21600.0, ans=0.0 2023-09-28 12:45:22,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:22,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 12:45:22,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:45:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 12:45:24,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:45:29,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:45:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 12:45:32,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 12:45:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 12:45:36,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 12:45:38,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:45:39,874 INFO [train.py:1039] (3/4) Epoch 1, batch 3250, loss[loss=0.3869, simple_loss=0.4175, pruned_loss=0.1781, over 24315.00 frames. ], tot_loss[loss=0.3615, simple_loss=0.3838, pruned_loss=0.1697, over 4715437.09 frames. ], batch size: 77, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:45:40,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:45:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 12:45:41,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:45:41,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:45:41,681 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 12:45:46,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:45:48,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=21666.666666666668, ans=0.0 2023-09-28 12:45:49,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:45:59,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:45:59,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 12:45:59,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:00,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:00,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:00,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:02,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:46:04,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:04,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:46:06,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:06,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:06,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:06,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=21733.333333333332, ans=0.1 2023-09-28 12:46:11,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:46:14,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=15.0 2023-09-28 12:46:14,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:14,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:46:16,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:46:16,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:46:17,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 12:46:23,656 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:46:23,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=21800.0, ans=0.125 2023-09-28 12:46:24,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:46:24,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:46:26,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:26,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:46:32,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:46:41,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:46:41,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:41,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 12:46:41,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:46:41,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:46:42,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:46:44,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 12:46:45,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 12:46:45,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:46:47,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:46:49,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:49,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 12:46:49,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:46:52,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:46:52,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:46:54,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 12:46:54,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:46:57,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:46:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 12:46:59,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=21933.333333333332, ans=0.125 2023-09-28 12:47:00,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:47:00,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 12:47:02,062 INFO [train.py:1039] (3/4) Epoch 1, batch 3300, loss[loss=0.3513, simple_loss=0.3722, pruned_loss=0.1652, over 23450.00 frames. ], tot_loss[loss=0.3614, simple_loss=0.3837, pruned_loss=0.1696, over 4723117.35 frames. ], batch size: 134, lr: 4.31e-02, grad_scale: 32.0 2023-09-28 12:47:02,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 12:47:03,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 12:47:03,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:05,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=22000.0, ans=0.00608695652173913 2023-09-28 12:47:06,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:47:09,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:47:09,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:11,239 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=15.04 vs. limit=15.0 2023-09-28 12:47:12,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 12:47:12,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 12:47:14,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:16,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:47:20,469 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.229e+02 3.284e+02 5.093e+02 6.809e+02 1.583e+03, threshold=1.019e+03, percent-clipped=11.0 2023-09-28 12:47:24,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 12:47:25,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:25,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:27,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:27,290 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 12:47:27,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:47:28,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:47:30,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 12:47:30,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:47:30,403 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 12:47:35,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:35,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 12:47:37,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:37,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 12:47:38,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 12:47:38,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:40,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:47:41,636 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 12:47:43,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 12:47:45,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:47:46,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 12:47:48,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:47:51,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 12:47:51,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:47:54,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:47:54,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:47:54,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:47:54,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:47:57,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:47:58,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:47:58,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:48:00,208 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 12:48:01,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 12:48:02,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=22200.0, ans=0.0 2023-09-28 12:48:04,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 12:48:04,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:04,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:48:07,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:48:08,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:08,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:48:10,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:48:11,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:48:15,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 12:48:15,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:16,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:19,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 12:48:20,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:48:21,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:23,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:48:23,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:24,664 INFO [train.py:1039] (3/4) Epoch 1, batch 3350, loss[loss=0.3687, simple_loss=0.404, pruned_loss=0.1667, over 24542.00 frames. ], tot_loss[loss=0.3613, simple_loss=0.3841, pruned_loss=0.1693, over 4717999.38 frames. ], batch size: 71, lr: 4.30e-02, grad_scale: 16.0 2023-09-28 12:48:24,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:48:26,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:28,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:48:30,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:30,931 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:48:33,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:48:35,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:35,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=22333.333333333332, ans=0.125 2023-09-28 12:48:36,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:48:39,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 12:48:39,325 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 12:48:40,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:48:43,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 12:48:43,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 12:48:45,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=22400.0, ans=0.2 2023-09-28 12:48:46,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 12:48:46,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:48:48,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:48:48,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 12:48:48,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:48,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:48:51,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.52 vs. limit=15.0 2023-09-28 12:48:52,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:48:55,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:48:55,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:48:59,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:03,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:03,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:08,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:49:08,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:49:11,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:11,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:14,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:16,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 12:49:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 12:49:17,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 12:49:17,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:49:18,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 12:49:18,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=22533.333333333332, ans=0.0 2023-09-28 12:49:20,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:21,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:49:22,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=15.0 2023-09-28 12:49:28,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=22533.333333333332, ans=0.1 2023-09-28 12:49:29,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:31,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 12:49:32,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:49:32,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:49:32,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=22600.0, ans=0.125 2023-09-28 12:49:35,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:49:39,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=22600.0, ans=0.0 2023-09-28 12:49:41,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:49:44,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 12:49:44,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:49:44,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:49:48,231 INFO [train.py:1039] (3/4) Epoch 1, batch 3400, loss[loss=0.3953, simple_loss=0.4011, pruned_loss=0.1947, over 23775.00 frames. ], tot_loss[loss=0.3615, simple_loss=0.3848, pruned_loss=0.1691, over 4726872.17 frames. ], batch size: 212, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:49:48,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:49:48,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 12:49:49,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:49:50,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 12:49:52,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:52,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:49:52,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=22666.666666666668, ans=0.2 2023-09-28 12:49:53,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 12:49:53,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:49:55,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 12:49:59,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 12:49:59,524 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 12:49:59,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:03,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:50:03,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 12:50:04,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:06,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 12:50:07,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.237e+02 3.122e+02 3.897e+02 5.653e+02 2.230e+03, threshold=7.795e+02, percent-clipped=8.0 2023-09-28 12:50:08,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=22733.333333333332, ans=0.125 2023-09-28 12:50:11,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:11,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=22733.333333333332, ans=0.125 2023-09-28 12:50:12,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 12:50:17,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:50:19,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:19,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:21,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 12:50:24,651 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.64 vs. limit=15.0 2023-09-28 12:50:28,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:50:34,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 12:50:40,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:50:41,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 12:50:41,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:50:42,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:50:42,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:50:43,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:50:48,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:50:51,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.97 vs. limit=15.0 2023-09-28 12:50:52,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:50:52,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:50:58,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:00,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 12:51:05,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:51:09,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 12:51:10,442 INFO [train.py:1039] (3/4) Epoch 1, batch 3450, loss[loss=0.3664, simple_loss=0.3761, pruned_loss=0.1784, over 23635.00 frames. ], tot_loss[loss=0.3597, simple_loss=0.3834, pruned_loss=0.168, over 4727678.18 frames. ], batch size: 149, lr: 4.29e-02, grad_scale: 16.0 2023-09-28 12:51:12,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 12:51:13,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=23000.0, ans=0.005869565217391305 2023-09-28 12:51:14,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:16,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:51:16,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 12:51:18,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:51:22,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:51:26,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:51:28,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:29,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.00 vs. limit=15.0 2023-09-28 12:51:29,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:51:29,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:32,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:51:37,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=23066.666666666668, ans=0.125 2023-09-28 12:51:38,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 12:51:40,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=23066.666666666668, ans=0.125 2023-09-28 12:51:41,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=23133.333333333332, ans=0.125 2023-09-28 12:51:44,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 12:51:44,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 12:51:44,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:51:46,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:51:50,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 12:51:51,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:51:56,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:51:56,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:51:58,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 12:51:59,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:52:03,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 12:52:03,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:04,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:52:05,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=23200.0, ans=0.0 2023-09-28 12:52:08,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:10,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 12:52:11,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:52:17,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:52:19,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:23,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:27,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:27,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:52:29,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:52:29,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:52:32,291 INFO [train.py:1039] (3/4) Epoch 1, batch 3500, loss[loss=0.3878, simple_loss=0.3953, pruned_loss=0.1901, over 23773.00 frames. ], tot_loss[loss=0.357, simple_loss=0.381, pruned_loss=0.1666, over 4716337.16 frames. ], batch size: 164, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:52:34,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:38,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:52:39,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 12:52:41,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 12:52:45,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 12:52:48,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:52:48,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 12:52:51,801 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.155e+02 3.379e+02 4.182e+02 5.188e+02 1.059e+03, threshold=8.364e+02, percent-clipped=3.0 2023-09-28 12:52:55,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:52:55,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:52:55,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=23400.0, ans=0.125 2023-09-28 12:52:57,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:52:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:52:57,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 12:52:59,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:52:59,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:52:59,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 12:53:00,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:02,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:53:04,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:08,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 12:53:08,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:53:12,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:53:12,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=23466.666666666668, ans=0.0 2023-09-28 12:53:14,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 12:53:15,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:17,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:53:17,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:20,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 12:53:20,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 12:53:20,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 12:53:22,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:53:23,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:25,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:25,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 12:53:29,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 12:53:30,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:53:32,065 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=4.10 vs. limit=15.0 2023-09-28 12:53:35,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:53:37,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 12:53:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 12:53:37,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:53:40,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:40,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:41,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:46,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 12:53:46,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:53:48,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:53:48,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=23600.0, ans=0.125 2023-09-28 12:53:50,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 12:53:51,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 12:53:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:53:55,360 INFO [train.py:1039] (3/4) Epoch 1, batch 3550, loss[loss=0.3179, simple_loss=0.3623, pruned_loss=0.1367, over 24338.00 frames. ], tot_loss[loss=0.3552, simple_loss=0.3794, pruned_loss=0.1655, over 4721093.02 frames. ], batch size: 61, lr: 4.28e-02, grad_scale: 16.0 2023-09-28 12:53:55,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:53:55,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:53:57,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:00,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 12:54:05,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=23666.666666666668, ans=0.125 2023-09-28 12:54:09,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=23666.666666666668, ans=0.2 2023-09-28 12:54:10,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:12,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 12:54:12,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=23733.333333333332, ans=0.125 2023-09-28 12:54:12,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.53 vs. limit=12.0 2023-09-28 12:54:15,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:16,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 12:54:18,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:18,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:54:18,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 12:54:21,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=23733.333333333332, ans=0.125 2023-09-28 12:54:23,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:23,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:54:23,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:23,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 12:54:24,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 12:54:33,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 12:54:33,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 12:54:34,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:34,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:54:36,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:54:36,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 12:54:36,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:54:38,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 12:54:42,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.70 vs. limit=15.0 2023-09-28 12:54:43,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:44,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:54:46,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:54:47,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 12:54:49,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:54:50,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 12:54:50,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 12:54:53,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 12:54:53,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:54:58,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 12:55:00,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:04,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.34 vs. limit=10.0 2023-09-28 12:55:07,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:07,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 12:55:07,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.64 vs. limit=15.0 2023-09-28 12:55:08,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:14,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:55:14,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 12:55:17,953 INFO [train.py:1039] (3/4) Epoch 1, batch 3600, loss[loss=0.3386, simple_loss=0.3589, pruned_loss=0.1591, over 23610.00 frames. ], tot_loss[loss=0.3529, simple_loss=0.378, pruned_loss=0.1639, over 4725953.03 frames. ], batch size: 149, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:55:21,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 12:55:21,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:55:22,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 12:55:25,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:26,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:55:26,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=24000.0, ans=0.1 2023-09-28 12:55:27,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:55:30,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:32,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:32,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=24066.666666666668, ans=0.05 2023-09-28 12:55:33,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 12:55:36,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 12:55:37,413 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.926e+02 4.483e+02 7.377e+02 1.636e+03, threshold=8.966e+02, percent-clipped=15.0 2023-09-28 12:55:37,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:37,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 12:55:41,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:55:42,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:45,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:49,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:50,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 12:55:51,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:55:51,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 12:55:51,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 12:55:54,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:55:56,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 12:55:56,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:55:57,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:55:59,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:55:59,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 12:56:01,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=24133.333333333332, ans=0.04949747468305833 2023-09-28 12:56:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:07,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:56:09,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 12:56:12,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:56:14,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=24200.0, ans=0.125 2023-09-28 12:56:17,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:21,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:21,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=24200.0, ans=0.04949747468305833 2023-09-28 12:56:23,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.80 vs. limit=15.0 2023-09-28 12:56:27,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 12:56:27,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 12:56:27,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 12:56:29,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 12:56:31,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 12:56:34,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:56:35,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:56:36,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 12:56:36,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:56:36,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:56:36,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:56:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 12:56:39,578 INFO [train.py:1039] (3/4) Epoch 1, batch 3650, loss[loss=0.3621, simple_loss=0.3765, pruned_loss=0.1739, over 23402.00 frames. ], tot_loss[loss=0.3534, simple_loss=0.379, pruned_loss=0.1639, over 4725793.66 frames. ], batch size: 119, lr: 4.27e-02, grad_scale: 32.0 2023-09-28 12:56:39,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 12:56:43,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:56:43,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 12:56:50,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 12:56:53,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:56:57,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 12:56:59,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 12:57:03,302 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.37 vs. limit=15.0 2023-09-28 12:57:03,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:03,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 12:57:03,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 12:57:05,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=24400.0, ans=0.125 2023-09-28 12:57:06,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 12:57:06,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:57:08,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 12:57:09,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 12:57:09,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:09,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 12:57:11,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:57:12,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:12,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:14,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:57:18,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 12:57:19,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 12:57:19,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:57:22,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 12:57:24,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:24,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:57:27,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=24466.666666666668, ans=0.125 2023-09-28 12:57:30,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=24533.333333333332, ans=0.0 2023-09-28 12:57:33,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 12:57:33,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=24533.333333333332, ans=0.125 2023-09-28 12:57:35,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:35,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 12:57:35,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 12:57:35,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 12:57:38,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:57:40,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:57:41,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:57:41,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:57:43,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 12:57:46,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:57:46,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:49,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=24600.0, ans=0.125 2023-09-28 12:57:52,479 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 12:57:56,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:57:56,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:57:58,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 12:57:58,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:00,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 12:58:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:03,213 INFO [train.py:1039] (3/4) Epoch 1, batch 3700, loss[loss=0.3584, simple_loss=0.3966, pruned_loss=0.1602, over 24012.00 frames. ], tot_loss[loss=0.3534, simple_loss=0.3797, pruned_loss=0.1636, over 4732912.96 frames. ], batch size: 80, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:58:04,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 12:58:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 12:58:10,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:58:10,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 12:58:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:11,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 12:58:13,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:58:14,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 12:58:14,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 12:58:15,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=24666.666666666668, ans=0.125 2023-09-28 12:58:17,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 12:58:18,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=24733.333333333332, ans=0.2 2023-09-28 12:58:22,279 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.123e+02 3.422e+02 4.027e+02 5.760e+02 1.496e+03, threshold=8.053e+02, percent-clipped=7.0 2023-09-28 12:58:22,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:58:23,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:25,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 12:58:25,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:58:25,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 12:58:28,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:28,562 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 12:58:39,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 12:58:40,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 12:58:41,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 12:58:41,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 12:58:41,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:43,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=24800.0, ans=0.0 2023-09-28 12:58:44,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:44,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=24800.0, ans=0.125 2023-09-28 12:58:44,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=24800.0, ans=0.125 2023-09-28 12:58:46,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 12:58:47,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:49,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 12:58:50,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:58:52,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 12:58:53,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 12:58:58,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:58:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 12:58:58,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:58:58,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 12:59:03,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 12:59:03,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 12:59:06,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.67 vs. limit=10.0 2023-09-28 12:59:06,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:08,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 12:59:09,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 12:59:09,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 12:59:09,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:11,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:13,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 12:59:15,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 12:59:16,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 12:59:16,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 12:59:18,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:19,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 12:59:19,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 12:59:22,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 12:59:24,345 INFO [train.py:1039] (3/4) Epoch 1, batch 3750, loss[loss=0.3188, simple_loss=0.3623, pruned_loss=0.1376, over 24355.00 frames. ], tot_loss[loss=0.3531, simple_loss=0.3799, pruned_loss=0.1632, over 4742319.55 frames. ], batch size: 74, lr: 4.26e-02, grad_scale: 32.0 2023-09-28 12:59:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 12:59:24,843 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 12:59:25,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 12:59:27,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 12:59:29,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 12:59:32,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 12:59:32,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 12:59:33,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 12:59:35,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:37,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 12:59:39,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 12:59:41,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:45,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 12:59:46,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 12:59:49,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 12:59:51,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 12:59:53,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 12:59:53,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 12:59:54,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 12:59:54,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 12:59:55,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=13.99 vs. limit=15.0 2023-09-28 12:59:59,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 13:00:01,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=25133.333333333332, ans=0.125 2023-09-28 13:00:03,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 13:00:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:00:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:00:05,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:07,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=25133.333333333332, ans=0.1 2023-09-28 13:00:12,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:12,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:00:17,266 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.32 vs. limit=15.0 2023-09-28 13:00:18,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 13:00:20,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=25200.0, ans=0.0 2023-09-28 13:00:21,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:21,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=25200.0, ans=0.125 2023-09-28 13:00:21,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=25200.0, ans=0.1 2023-09-28 13:00:25,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:00:25,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:00:29,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.22 vs. limit=15.0 2023-09-28 13:00:30,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:00:34,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 13:00:34,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:00:36,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:00:38,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:00:38,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=25266.666666666668, ans=0.125 2023-09-28 13:00:38,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=25266.666666666668, ans=0.125 2023-09-28 13:00:39,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:00:45,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=25333.333333333332, ans=0.0 2023-09-28 13:00:46,959 INFO [train.py:1039] (3/4) Epoch 1, batch 3800, loss[loss=0.3697, simple_loss=0.3974, pruned_loss=0.171, over 23918.00 frames. ], tot_loss[loss=0.3532, simple_loss=0.3799, pruned_loss=0.1633, over 4743713.00 frames. ], batch size: 86, lr: 4.25e-02, grad_scale: 32.0 2023-09-28 13:00:49,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:00:49,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=25333.333333333332, ans=0.0 2023-09-28 13:00:49,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=25333.333333333332, ans=0.1 2023-09-28 13:00:52,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:00:54,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:00:55,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 13:00:55,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:00:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:00,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:01:01,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:01:01,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:02,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:01:03,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:01:05,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:01:06,430 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.982e+02 3.306e+02 4.581e+02 6.587e+02 1.016e+03, threshold=9.163e+02, percent-clipped=14.0 2023-09-28 13:01:06,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:06,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 13:01:11,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:01:12,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:01:14,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:16,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=25400.0, ans=0.1 2023-09-28 13:01:17,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:01:17,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:01:19,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:01:19,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:23,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:25,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:01:29,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:01:31,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 13:01:33,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:39,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:01:43,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.70 vs. limit=15.0 2023-09-28 13:01:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:01:45,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 13:01:49,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 13:01:50,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.85 vs. limit=22.5 2023-09-28 13:01:50,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:01:52,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:01:54,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:01:54,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 13:02:00,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 13:02:00,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 13:02:00,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:01,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:02:06,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:02:07,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:02:09,339 INFO [train.py:1039] (3/4) Epoch 1, batch 3850, loss[loss=0.3522, simple_loss=0.3838, pruned_loss=0.1603, over 23721.00 frames. ], tot_loss[loss=0.3502, simple_loss=0.3775, pruned_loss=0.1615, over 4750924.94 frames. ], batch size: 85, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:02:12,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:02:14,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 13:02:14,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:02:14,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:18,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:02:20,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:23,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:02:23,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 13:02:33,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:33,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:02:36,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:36,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:02:39,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:41,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:02:43,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:02:43,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:02:44,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:46,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:02:47,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:47,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:02:49,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 13:02:49,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 13:02:49,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:02:50,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=12.0 2023-09-28 13:02:50,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:54,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:02:55,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:02:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 13:02:59,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 13:02:59,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:01,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 13:03:04,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 13:03:11,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:11,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:03:11,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=25866.666666666668, ans=0.2 2023-09-28 13:03:16,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:16,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 13:03:17,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 13:03:18,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=25933.333333333332, ans=0.125 2023-09-28 13:03:20,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:20,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:23,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:03:23,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:03:25,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:25,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:03:27,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 13:03:27,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:03:29,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 13:03:29,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:29,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:30,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:03:32,688 INFO [train.py:1039] (3/4) Epoch 1, batch 3900, loss[loss=0.3626, simple_loss=0.3732, pruned_loss=0.176, over 23552.00 frames. ], tot_loss[loss=0.3473, simple_loss=0.3748, pruned_loss=0.1599, over 4727669.97 frames. ], batch size: 134, lr: 4.24e-02, grad_scale: 32.0 2023-09-28 13:03:32,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:32,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:03:33,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:03:33,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:03:34,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:34,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 13:03:34,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:38,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:38,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:38,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:03:41,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:03:42,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=26000.0, ans=0.125 2023-09-28 13:03:44,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:03:44,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:45,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=26000.0, ans=0.125 2023-09-28 13:03:46,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:03:47,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 13:03:47,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:03:49,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 13:03:50,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.98 vs. limit=15.0 2023-09-28 13:03:50,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:03:51,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 13:03:52,277 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.956e+02 3.706e+02 4.742e+02 9.282e+02, threshold=7.412e+02, percent-clipped=1.0 2023-09-28 13:03:52,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 13:03:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:03:57,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:03:57,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:03:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:03:57,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=26066.666666666668, ans=0.025 2023-09-28 13:04:03,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:04:06,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=26133.333333333332, ans=0.0 2023-09-28 13:04:07,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:04:07,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=26133.333333333332, ans=0.2 2023-09-28 13:04:08,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=26133.333333333332, ans=0.1 2023-09-28 13:04:09,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:04:09,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:11,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:04:13,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten.whitening_limit, batch_count=26133.333333333332, ans=15.0 2023-09-28 13:04:19,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:04:20,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:04:28,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.05 vs. limit=15.0 2023-09-28 13:04:28,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:04:29,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:04:40,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:04:42,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:42,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 13:04:44,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 13:04:44,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:04:44,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=26266.666666666668, ans=0.025 2023-09-28 13:04:45,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 13:04:49,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:04:49,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 13:04:55,934 INFO [train.py:1039] (3/4) Epoch 1, batch 3950, loss[loss=0.3656, simple_loss=0.381, pruned_loss=0.175, over 23811.00 frames. ], tot_loss[loss=0.3455, simple_loss=0.3738, pruned_loss=0.1586, over 4733073.92 frames. ], batch size: 179, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:04:57,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:05:00,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 13:05:00,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:05:02,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:05:03,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.91 vs. limit=8.0 2023-09-28 13:05:03,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:05:09,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 13:05:09,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:10,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 13:05:12,052 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 13:05:12,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:05:14,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=26400.0, ans=0.125 2023-09-28 13:05:15,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.72 vs. limit=15.0 2023-09-28 13:05:16,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:16,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:05:16,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:05:19,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 13:05:19,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=26400.0, ans=0.0 2023-09-28 13:05:21,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:05:22,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=26400.0, ans=0.2 2023-09-28 13:05:23,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:05:23,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:05:23,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:05:25,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:05:33,400 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:05:37,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:05:37,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:05:42,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 13:05:47,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 13:05:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 13:05:47,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:05:49,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:05:53,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=26533.333333333332, ans=0.025 2023-09-28 13:05:58,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:05:58,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=26533.333333333332, ans=0.1 2023-09-28 13:05:59,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=26533.333333333332, ans=0.125 2023-09-28 13:06:00,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:06:00,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:00,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:06:01,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 13:06:06,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:06:06,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:06:06,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=26600.0, ans=0.2 2023-09-28 13:06:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 13:06:20,838 INFO [train.py:1039] (3/4) Epoch 1, batch 4000, loss[loss=0.3565, simple_loss=0.3789, pruned_loss=0.1671, over 23349.00 frames. ], tot_loss[loss=0.3461, simple_loss=0.3747, pruned_loss=0.1587, over 4745120.03 frames. ], batch size: 119, lr: 4.23e-02, grad_scale: 32.0 2023-09-28 13:06:22,847 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=2.518e-03 2023-09-28 13:06:26,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:32,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:37,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.04 vs. limit=12.0 2023-09-28 13:06:38,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:38,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:06:40,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:06:40,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 13:06:40,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:06:41,654 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.100e+02 3.230e+02 4.293e+02 5.644e+02 1.126e+03, threshold=8.585e+02, percent-clipped=11.0 2023-09-28 13:06:41,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 13:06:41,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:06:41,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 13:06:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:06:44,143 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.58 vs. limit=22.5 2023-09-28 13:06:45,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.07 vs. limit=22.5 2023-09-28 13:06:45,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.14 vs. limit=22.5 2023-09-28 13:06:47,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:06:47,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:06:47,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:06:47,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:06:47,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:06:49,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:06:50,938 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 13:06:52,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:06:52,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:06:54,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=26800.0, ans=0.0 2023-09-28 13:06:55,610 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 13:06:57,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:06:57,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:05,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 13:07:05,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:07:06,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:07:09,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 13:07:09,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:07:10,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 13:07:10,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:07:10,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:12,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:07:14,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:07:14,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=26866.666666666668, ans=0.125 2023-09-28 13:07:15,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:07:15,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:07:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 13:07:18,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:07:20,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 13:07:20,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=26866.666666666668, ans=0.0050289855072463766 2023-09-28 13:07:24,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:07:28,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 13:07:29,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:07:31,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:33,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:07:33,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:07:38,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:07:38,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:40,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=26933.333333333332, ans=0.125 2023-09-28 13:07:41,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:07:41,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 13:07:43,723 INFO [train.py:1039] (3/4) Epoch 1, batch 4050, loss[loss=0.3119, simple_loss=0.349, pruned_loss=0.1374, over 24550.00 frames. ], tot_loss[loss=0.3451, simple_loss=0.3747, pruned_loss=0.1577, over 4753378.62 frames. ], batch size: 60, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:07:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:07:45,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:07:46,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.93 vs. limit=15.0 2023-09-28 13:07:46,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:07:48,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:07:49,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:53,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:07:54,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:07:56,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 13:07:57,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:07:58,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=27066.666666666668, ans=0.2 2023-09-28 13:07:59,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:08:02,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:04,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:08:04,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=27066.666666666668, ans=0.2 2023-09-28 13:08:07,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.87 vs. limit=12.0 2023-09-28 13:08:07,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 13:08:09,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 13:08:09,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 13:08:12,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:08:14,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=27133.333333333332, ans=0.2 2023-09-28 13:08:21,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 13:08:22,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:26,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:29,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:08:29,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:08:29,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:08:32,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:08:35,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 13:08:35,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:08:35,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=27200.0, ans=0.125 2023-09-28 13:08:37,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:39,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 13:08:43,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:08:45,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=27200.0, ans=0.0 2023-09-28 13:08:49,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.64 vs. limit=15.0 2023-09-28 13:08:52,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 13:08:52,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:08:52,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:08:56,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 13:08:56,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 13:08:56,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:08:57,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:08:58,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-09-28 13:08:59,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:08:59,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:09:02,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=27266.666666666668, ans=0.125 2023-09-28 13:09:05,401 INFO [train.py:1039] (3/4) Epoch 1, batch 4100, loss[loss=0.305, simple_loss=0.3577, pruned_loss=0.1262, over 24474.00 frames. ], tot_loss[loss=0.346, simple_loss=0.3755, pruned_loss=0.1583, over 4756438.75 frames. ], batch size: 66, lr: 4.22e-02, grad_scale: 32.0 2023-09-28 13:09:08,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 13:09:10,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 13:09:11,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=27333.333333333332, ans=0.1 2023-09-28 13:09:13,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=27333.333333333332, ans=0.125 2023-09-28 13:09:14,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 13:09:14,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 13:09:14,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:16,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:16,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:18,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:09:18,238 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 13:09:22,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:24,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.870e+02 3.486e+02 4.089e+02 6.314e+02, threshold=6.972e+02, percent-clipped=0.0 2023-09-28 13:09:24,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:09:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:09:24,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=27400.0, ans=0.015 2023-09-28 13:09:25,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:09:26,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=27400.0, ans=0.2 2023-09-28 13:09:32,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:09:33,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:09:33,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:09:33,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 13:09:35,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:35,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:09:35,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:35,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:09:36,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 13:09:38,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:09:38,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 13:09:40,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:09:43,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:09:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 13:09:46,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:09:46,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:09:47,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:09:49,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 13:09:49,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:09:50,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:09:55,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 13:09:56,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:09:57,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:09:57,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=27533.333333333332, ans=0.0 2023-09-28 13:10:01,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:08,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:11,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:12,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:10:16,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=27600.0, ans=0.0 2023-09-28 13:10:17,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:17,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:10:20,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:10:24,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:10:27,502 INFO [train.py:1039] (3/4) Epoch 1, batch 4150, loss[loss=0.4321, simple_loss=0.4097, pruned_loss=0.2272, over 20017.00 frames. ], tot_loss[loss=0.3463, simple_loss=0.3756, pruned_loss=0.1585, over 4735433.22 frames. ], batch size: 388, lr: 4.21e-02, grad_scale: 32.0 2023-09-28 13:10:29,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:10:30,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:10:32,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:10:32,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 13:10:35,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:35,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 13:10:37,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 13:10:37,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 13:10:38,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:10:44,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:10:44,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:10:47,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:10:47,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:10:49,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:10:50,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:10:50,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=27733.333333333332, ans=0.125 2023-09-28 13:10:51,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:10:53,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:10:58,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:03,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:03,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 13:11:06,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 13:11:06,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:11:07,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 13:11:07,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:11:07,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:11,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:13,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:16,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 13:11:20,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:22,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:11:22,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 13:11:24,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:11:25,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 13:11:27,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:11:29,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:11:30,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:30,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 13:11:30,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:11:30,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:11:32,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:11:35,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 13:11:35,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:37,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:11:37,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:11:38,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 13:11:40,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:11:40,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 13:11:40,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:11:43,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:11:43,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 13:11:43,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:11:49,471 INFO [train.py:1039] (3/4) Epoch 1, batch 4200, loss[loss=0.363, simple_loss=0.3889, pruned_loss=0.1686, over 24052.00 frames. ], tot_loss[loss=0.3445, simple_loss=0.374, pruned_loss=0.1575, over 4740117.78 frames. ], batch size: 86, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:11:49,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:11:51,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 13:11:52,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:11:54,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:11:54,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:11:56,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:56,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:11:56,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=28000.0, ans=0.1 2023-09-28 13:11:58,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 13:12:00,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=28000.0, ans=0.125 2023-09-28 13:12:01,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 13:12:02,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:05,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:05,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=28066.666666666668, ans=0.125 2023-09-28 13:12:07,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-09-28 13:12:08,518 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.106e+02 3.443e+02 4.074e+02 5.504e+02 1.074e+03, threshold=8.148e+02, percent-clipped=10.0 2023-09-28 13:12:08,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:12:08,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=28066.666666666668, ans=0.025 2023-09-28 13:12:13,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:12:14,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:15,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:16,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 13:12:16,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:12:18,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:18,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:12:19,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:12:22,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:12:23,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 13:12:23,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:12:25,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.57 vs. limit=22.5 2023-09-28 13:12:28,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:12:28,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:12:30,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=28133.333333333332, ans=0.0 2023-09-28 13:12:33,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:12:33,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:12:35,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:12:36,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 13:12:36,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:12:36,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=28200.0, ans=0.1 2023-09-28 13:12:38,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:12:41,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=28200.0, ans=0.2 2023-09-28 13:12:44,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:12:46,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:12:46,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=28200.0, ans=0.125 2023-09-28 13:12:48,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=28200.0, ans=0.0 2023-09-28 13:12:52,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:12:55,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 13:12:58,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:03,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:13:03,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:03,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=28266.666666666668, ans=0.125 2023-09-28 13:13:06,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 13:13:11,071 INFO [train.py:1039] (3/4) Epoch 1, batch 4250, loss[loss=0.3612, simple_loss=0.399, pruned_loss=0.1617, over 24295.00 frames. ], tot_loss[loss=0.3416, simple_loss=0.3713, pruned_loss=0.156, over 4735085.02 frames. ], batch size: 77, lr: 4.20e-02, grad_scale: 32.0 2023-09-28 13:13:11,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:13:15,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:13:17,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:13:20,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:22,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=28333.333333333332, ans=0.125 2023-09-28 13:13:25,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:13:25,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 13:13:27,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:13:29,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:33,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:37,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:40,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:13:41,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:13:41,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:43,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:43,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:46,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:13:48,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:13:49,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 13:13:52,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 13:13:52,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:52,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:13:53,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:13:53,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:13:53,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:13:55,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:13:58,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:14:00,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:14:04,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:05,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:07,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 13:14:07,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:14:07,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 13:14:10,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:14:12,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:14:13,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:14:16,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 13:14:18,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:14:19,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:14:23,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:14:23,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=28600.0, ans=0.1 2023-09-28 13:14:26,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:28,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:14:28,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:14:31,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:32,639 INFO [train.py:1039] (3/4) Epoch 1, batch 4300, loss[loss=0.358, simple_loss=0.3789, pruned_loss=0.1685, over 23739.00 frames. ], tot_loss[loss=0.3408, simple_loss=0.3703, pruned_loss=0.1556, over 4720654.69 frames. ], batch size: 164, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:14:32,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:14:33,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:14:34,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 13:14:36,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:42,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:14:43,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:14:46,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:14:50,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=28733.333333333332, ans=0.125 2023-09-28 13:14:52,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.815e+02 2.945e+02 3.341e+02 4.373e+02 7.931e+02, threshold=6.681e+02, percent-clipped=0.0 2023-09-28 13:14:53,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:14:53,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 13:14:54,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:14:54,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=28733.333333333332, ans=0.035 2023-09-28 13:14:59,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:14:59,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:14:59,596 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 13:15:02,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:15:04,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:06,475 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.76 vs. limit=10.0 2023-09-28 13:15:07,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 13:15:08,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:15:08,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 13:15:11,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:15:11,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:15:15,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:15:15,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:15:16,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:15:18,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:19,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:15:19,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 13:15:19,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 13:15:22,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:15:23,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=28866.666666666668, ans=0.2 2023-09-28 13:15:25,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:25,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:15:27,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:27,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:15:27,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 13:15:27,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 13:15:29,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 13:15:29,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:15:29,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 13:15:29,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 13:15:34,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:36,030 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 13:15:37,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:15:39,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:39,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:15:40,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 13:15:42,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:15:42,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:44,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:15:44,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:44,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:15:46,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=28933.333333333332, ans=0.125 2023-09-28 13:15:47,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:15:47,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=28933.333333333332, ans=0.004579710144927537 2023-09-28 13:15:50,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.88 vs. limit=22.5 2023-09-28 13:15:51,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:15:52,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:15:52,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:15:55,545 INFO [train.py:1039] (3/4) Epoch 1, batch 4350, loss[loss=0.364, simple_loss=0.3871, pruned_loss=0.1705, over 23576.00 frames. ], tot_loss[loss=0.3406, simple_loss=0.3711, pruned_loss=0.1551, over 4732394.40 frames. ], batch size: 232, lr: 4.19e-02, grad_scale: 32.0 2023-09-28 13:15:57,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 13:15:58,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:16:02,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:05,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=15.0 2023-09-28 13:16:05,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:16:07,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:16:12,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:16:17,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:16:19,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:16:19,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:16:24,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:16:26,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=29066.666666666668, ans=0.07 2023-09-28 13:16:27,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:16:28,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:16:30,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=29133.333333333332, ans=0.1 2023-09-28 13:16:33,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 13:16:33,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:16:37,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:41,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:16:42,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.72 vs. limit=15.0 2023-09-28 13:16:44,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 13:16:45,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=29200.0, ans=0.125 2023-09-28 13:16:48,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:16:50,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:16:55,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 13:16:56,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:16:58,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:16:59,785 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 13:16:59,877 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 13:16:59,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:01,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:02,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:17:02,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:03,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:17:03,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:05,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 13:17:05,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:05,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:05,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:06,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 13:17:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 13:17:07,155 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 13:17:07,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 13:17:10,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:17:10,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:17:12,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:12,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:17:15,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 13:17:18,356 INFO [train.py:1039] (3/4) Epoch 1, batch 4400, loss[loss=0.3342, simple_loss=0.3854, pruned_loss=0.1415, over 23998.00 frames. ], tot_loss[loss=0.3429, simple_loss=0.3723, pruned_loss=0.1567, over 4726401.96 frames. ], batch size: 80, lr: 4.18e-02, grad_scale: 32.0 2023-09-28 13:17:18,454 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 13:17:18,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:25,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:26,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:17:28,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 13:17:28,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=29333.333333333332, ans=0.5 2023-09-28 13:17:30,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 13:17:30,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 13:17:30,335 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 13:17:31,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:17:31,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:17:32,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=29333.333333333332, ans=0.004492753623188407 2023-09-28 13:17:34,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 13:17:35,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=29400.0, ans=0.1 2023-09-28 13:17:36,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:37,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:38,341 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.230e+02 3.318e+02 4.079e+02 5.261e+02 1.011e+03, threshold=8.157e+02, percent-clipped=12.0 2023-09-28 13:17:38,451 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 13:17:41,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:41,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 13:17:41,710 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 13:17:41,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=29400.0, ans=0.1 2023-09-28 13:17:44,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 13:17:44,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 13:17:45,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 13:17:45,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:47,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:47,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=29400.0, ans=0.2 2023-09-28 13:17:48,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:17:50,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:17:51,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 13:17:51,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 13:17:53,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:54,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:17:54,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:17:55,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=29466.666666666668, ans=0.025 2023-09-28 13:17:56,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:17:56,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:17:56,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 13:17:58,705 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 13:18:03,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:09,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:18:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 13:18:15,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:18:18,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:20,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:18:21,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 13:18:21,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:18:22,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:18:22,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:18:23,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:18:28,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 13:18:29,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 13:18:31,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 13:18:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:18:31,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 13:18:31,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:18:35,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:18:38,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 13:18:41,509 INFO [train.py:1039] (3/4) Epoch 1, batch 4450, loss[loss=0.4534, simple_loss=0.4378, pruned_loss=0.2345, over 19486.00 frames. ], tot_loss[loss=0.3426, simple_loss=0.3727, pruned_loss=0.1562, over 4739104.17 frames. ], batch size: 389, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:18:41,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:18:45,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:45,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:18:52,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:18:52,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:18:56,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:18:58,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:18:58,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=29733.333333333332, ans=0.125 2023-09-28 13:19:00,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:19:01,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:03,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 13:19:03,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:05,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:05,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:05,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:19:08,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:19:14,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:14,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:16,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:19:17,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:19:19,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:19:22,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:19:23,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=29800.0, ans=0.2 2023-09-28 13:19:24,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 13:19:25,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 13:19:25,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:19:26,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:28,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 13:19:30,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=29866.666666666668, ans=0.0 2023-09-28 13:19:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:19:37,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:37,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 13:19:37,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:37,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:37,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:19:37,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:19:40,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:19:43,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.43 vs. limit=15.0 2023-09-28 13:19:44,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:19:44,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 13:19:46,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:19:49,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:19:50,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:19:52,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:19:53,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:19:55,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:19:59,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 13:20:01,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:20:03,927 INFO [train.py:1039] (3/4) Epoch 1, batch 4500, loss[loss=0.415, simple_loss=0.4041, pruned_loss=0.2129, over 19930.00 frames. ], tot_loss[loss=0.3428, simple_loss=0.3725, pruned_loss=0.1565, over 4725442.67 frames. ], batch size: 389, lr: 4.17e-02, grad_scale: 32.0 2023-09-28 13:20:07,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 13:20:08,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 13:20:09,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=30000.0, ans=0.0 2023-09-28 13:20:10,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:15,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:20:17,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:20:17,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:20:18,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:20:18,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:19,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:20:23,495 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.079e+02 2.950e+02 3.479e+02 4.293e+02 6.506e+02, threshold=6.959e+02, percent-clipped=0.0 2023-09-28 13:20:33,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:20:34,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:20:37,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:20:39,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:20:39,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:20:45,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:20:49,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:20:49,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.64 vs. limit=15.0 2023-09-28 13:20:51,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=30133.333333333332, ans=0.004318840579710145 2023-09-28 13:20:53,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:20:57,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:20:57,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 13:20:57,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:20:59,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:00,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=30200.0, ans=0.004304347826086957 2023-09-28 13:21:02,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:02,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:21:05,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:21:05,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 13:21:05,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:21:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:10,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:21:10,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:21:14,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:15,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:21:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:21:16,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=30266.666666666668, ans=0.004289855072463768 2023-09-28 13:21:17,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 13:21:20,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 13:21:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 13:21:22,112 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=35.11 vs. limit=22.5 2023-09-28 13:21:22,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 13:21:26,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 13:21:27,405 INFO [train.py:1039] (3/4) Epoch 1, batch 4550, loss[loss=0.3428, simple_loss=0.336, pruned_loss=0.1748, over 19085.00 frames. ], tot_loss[loss=0.3419, simple_loss=0.3715, pruned_loss=0.1561, over 4714630.74 frames. ], batch size: 388, lr: 4.16e-02, grad_scale: 32.0 2023-09-28 13:21:27,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:27,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=30333.333333333332, ans=0.00427536231884058 2023-09-28 13:21:28,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=14.65 vs. limit=15.0 2023-09-28 13:21:32,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:32,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:21:37,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:40,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:21:41,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=16.71 vs. limit=22.5 2023-09-28 13:21:42,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:21:44,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:21:44,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:21:44,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:21:46,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:21:48,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:21:51,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:21:52,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.85 vs. limit=15.0 2023-09-28 13:21:53,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 13:21:55,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 13:21:55,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:21:56,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 13:21:57,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=30400.0, ans=0.125 2023-09-28 13:21:58,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=30400.0, ans=0.125 2023-09-28 13:22:01,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 13:22:02,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:05,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=30466.666666666668, ans=0.0 2023-09-28 13:22:07,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 13:22:09,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:22:13,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:13,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:14,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:22:16,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 13:22:17,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=30533.333333333332, ans=0.004231884057971015 2023-09-28 13:22:19,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:20,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:20,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:22:22,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:23,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=30533.333333333332, ans=0.0 2023-09-28 13:22:24,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 13:22:24,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 13:22:24,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:22:26,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 13:22:28,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 13:22:28,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:22:28,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:29,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:31,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:31,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:22:31,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:22:32,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 13:22:35,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:22:35,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:22:35,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 13:22:35,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:22:37,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 13:22:40,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:22:40,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:22:42,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:22:44,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:22:44,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:22:45,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:22:48,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:22:51,269 INFO [train.py:1039] (3/4) Epoch 1, batch 4600, loss[loss=0.366, simple_loss=0.3868, pruned_loss=0.1726, over 23257.00 frames. ], tot_loss[loss=0.3402, simple_loss=0.3692, pruned_loss=0.1556, over 4696273.65 frames. ], batch size: 119, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:22:51,682 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:22:52,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:22:54,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:22:56,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:22:58,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:22:58,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:22:59,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 13:23:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:23:04,749 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.13 vs. limit=6.0 2023-09-28 13:23:05,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:23:06,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:08,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:08,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=30733.333333333332, ans=0.125 2023-09-28 13:23:11,494 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.016e+02 3.238e+02 3.793e+02 5.269e+02 1.285e+03, threshold=7.587e+02, percent-clipped=5.0 2023-09-28 13:23:15,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 13:23:15,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=30733.333333333332, ans=0.125 2023-09-28 13:23:17,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:20,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:25,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:23:25,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:31,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 13:23:31,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:23:32,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:23:37,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:39,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:23:40,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:23:41,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=30866.666666666668, ans=0.1 2023-09-28 13:23:44,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 13:23:44,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:23:49,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:49,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:23:51,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:51,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 13:23:52,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:23:53,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 13:23:53,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:55,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:56,555 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.00 vs. limit=15.0 2023-09-28 13:23:56,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:23:57,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:23:57,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:23:57,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 13:23:59,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 13:24:00,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 13:24:00,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:02,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:02,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=30933.333333333332, ans=0.125 2023-09-28 13:24:04,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:24:14,391 INFO [train.py:1039] (3/4) Epoch 1, batch 4650, loss[loss=0.3066, simple_loss=0.3406, pruned_loss=0.1363, over 20956.00 frames. ], tot_loss[loss=0.3393, simple_loss=0.3686, pruned_loss=0.155, over 4700797.65 frames. ], batch size: 46, lr: 4.15e-02, grad_scale: 32.0 2023-09-28 13:24:14,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:24:18,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:18,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:24:19,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:24:19,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:24:21,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:24:24,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 13:24:27,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:24:28,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 13:24:28,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:24:30,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 13:24:30,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:24:32,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 13:24:32,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 13:24:32,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:33,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:24:38,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:24:39,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:39,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 13:24:44,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:24:45,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 13:24:48,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:24:48,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:24:50,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 13:24:51,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:24:52,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=31133.333333333332, ans=10.0 2023-09-28 13:24:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:24:58,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:03,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:06,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:07,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:25:09,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:25:11,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 13:25:12,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 13:25:14,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 13:25:14,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 13:25:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:22,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=31266.666666666668, ans=0.125 2023-09-28 13:25:23,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:25:23,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:23,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 13:25:23,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:24,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:24,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:25:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:25:30,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:25:30,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:25:30,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:25:33,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:34,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:25:34,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:25:34,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 13:25:35,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:25:36,400 INFO [train.py:1039] (3/4) Epoch 1, batch 4700, loss[loss=0.3177, simple_loss=0.367, pruned_loss=0.1343, over 24513.00 frames. ], tot_loss[loss=0.3382, simple_loss=0.3685, pruned_loss=0.154, over 4713276.57 frames. ], batch size: 66, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:25:36,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 13:25:36,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=31333.333333333332, ans=0.004057971014492754 2023-09-28 13:25:44,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:25:45,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:25:45,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=31333.333333333332, ans=0.0 2023-09-28 13:25:45,255 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:25:47,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:25:49,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:25:50,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:25:54,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=31400.0, ans=15.0 2023-09-28 13:25:55,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 13:25:56,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.950e+02 3.070e+02 3.636e+02 4.699e+02 2.301e+03, threshold=7.272e+02, percent-clipped=9.0 2023-09-28 13:25:56,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 13:26:00,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:02,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:26:02,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:26:05,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:12,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:26:14,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 13:26:17,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:26:22,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 13:26:25,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:26:26,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:30,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 13:26:31,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:26:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:26:37,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 13:26:37,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=31533.333333333332, ans=0.1 2023-09-28 13:26:39,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:39,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:41,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:26:41,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:26:41,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 13:26:43,255 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 13:26:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:26:46,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:46,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 13:26:49,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:26:50,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.33 vs. limit=22.5 2023-09-28 13:26:52,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 13:26:55,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:26:55,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:26:59,830 INFO [train.py:1039] (3/4) Epoch 1, batch 4750, loss[loss=0.3686, simple_loss=0.393, pruned_loss=0.1721, over 23528.00 frames. ], tot_loss[loss=0.34, simple_loss=0.3703, pruned_loss=0.1549, over 4713965.96 frames. ], batch size: 93, lr: 4.14e-02, grad_scale: 32.0 2023-09-28 13:27:03,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:03,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:27:04,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 13:27:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:08,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 13:27:10,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:27:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:27:11,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:15,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=31733.333333333332, ans=0.125 2023-09-28 13:27:16,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 13:27:19,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:27:22,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 13:27:23,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:27:26,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:26,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:27:27,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:27,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 13:27:27,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 13:27:33,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 13:27:36,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:39,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:27:40,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:27:40,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 13:27:40,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:27:44,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:27:46,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:27:49,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 13:27:49,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 13:27:49,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:27:49,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:27:50,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:27:51,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten.whitening_limit, batch_count=31866.666666666668, ans=15.0 2023-09-28 13:27:52,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:27:52,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 13:27:54,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 13:27:56,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:00,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:28:00,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 13:28:00,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:02,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:04,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:28:04,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:05,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 13:28:10,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:10,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 13:28:11,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 13:28:11,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 13:28:13,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=31933.333333333332, ans=0.2 2023-09-28 13:28:15,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:28:15,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:28:17,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 13:28:21,645 INFO [train.py:1039] (3/4) Epoch 1, batch 4800, loss[loss=0.3249, simple_loss=0.3826, pruned_loss=0.1336, over 24444.00 frames. ], tot_loss[loss=0.3408, simple_loss=0.3714, pruned_loss=0.1551, over 4704363.18 frames. ], batch size: 69, lr: 4.13e-02, grad_scale: 32.0 2023-09-28 13:28:23,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:23,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:29,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:28:30,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:30,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:28:31,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 13:28:32,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:28:32,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:28:35,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=32000.0, ans=0.00391304347826087 2023-09-28 13:28:36,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:28:41,494 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.966e+02 2.727e+02 3.187e+02 3.824e+02 7.207e+02, threshold=6.374e+02, percent-clipped=0.0 2023-09-28 13:28:41,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:28:43,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:43,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:28:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:44,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 13:28:44,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:46,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:28:49,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:28:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:54,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:28:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:28:57,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:28:59,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:00,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 13:29:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 13:29:02,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:02,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:29:03,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:29:03,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:03,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:29:06,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:29:06,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:29:10,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:16,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:18,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:22,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=32200.0, ans=0.1 2023-09-28 13:29:23,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 13:29:23,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:23,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:24,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:29:25,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:27,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:29:29,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:29:29,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:30,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:29:30,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:29:32,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:29:36,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.08 vs. limit=22.5 2023-09-28 13:29:36,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:29:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:36,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:29:37,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 13:29:40,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 13:29:40,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:29:40,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:29:40,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:29:43,144 INFO [train.py:1039] (3/4) Epoch 1, batch 4850, loss[loss=0.3393, simple_loss=0.3476, pruned_loss=0.1655, over 22673.00 frames. ], tot_loss[loss=0.3406, simple_loss=0.3708, pruned_loss=0.1552, over 4701974.81 frames. ], batch size: 322, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:29:43,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:29:50,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=32333.333333333332, ans=0.125 2023-09-28 13:29:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 13:29:55,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:01,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:02,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:30:02,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:02,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=32400.0, ans=0.0 2023-09-28 13:30:06,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:30:07,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:30:07,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:30:07,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 13:30:11,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=32400.0, ans=0.125 2023-09-28 13:30:12,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:30:15,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:30:15,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 13:30:15,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:30:15,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 13:30:18,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:30:18,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:25,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 13:30:26,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 13:30:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:30:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:30:36,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 13:30:37,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:30:37,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:30:39,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:30:39,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 13:30:39,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:30:41,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=32533.333333333332, ans=0.0 2023-09-28 13:30:43,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 13:30:43,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:30:44,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:30:45,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 13:30:49,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=32600.0, ans=0.0 2023-09-28 13:30:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:00,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:31:00,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:05,042 INFO [train.py:1039] (3/4) Epoch 1, batch 4900, loss[loss=0.3471, simple_loss=0.3871, pruned_loss=0.1535, over 24677.00 frames. ], tot_loss[loss=0.3404, simple_loss=0.3697, pruned_loss=0.1555, over 4704525.95 frames. ], batch size: 73, lr: 4.12e-02, grad_scale: 32.0 2023-09-28 13:31:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 13:31:05,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:31:12,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:12,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:12,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:31:17,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 13:31:22,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 13:31:23,892 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.271e+02 3.049e+02 3.510e+02 4.577e+02 9.864e+02, threshold=7.020e+02, percent-clipped=4.0 2023-09-28 13:31:25,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 13:31:27,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 13:31:27,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:29,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:31:29,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:31:29,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:29,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:31:31,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 13:31:35,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 13:31:37,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:31:39,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:31:39,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:31:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:31:42,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:44,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:31:44,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 13:31:44,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=32800.0, ans=0.125 2023-09-28 13:31:45,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.71 vs. limit=10.0 2023-09-28 13:31:47,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:31:47,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=32800.0, ans=0.125 2023-09-28 13:31:48,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:31:50,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 13:31:50,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 13:31:53,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 13:31:55,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:31:55,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:31:57,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:31:57,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:31:57,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 13:31:57,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:31:58,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 13:32:02,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:03,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=32866.666666666664, ans=15.0 2023-09-28 13:32:03,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:32:06,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:32:09,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 13:32:09,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:32:10,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 13:32:10,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 13:32:18,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:20,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:32:21,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 13:32:23,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:23,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:32:23,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:27,209 INFO [train.py:1039] (3/4) Epoch 1, batch 4950, loss[loss=0.2991, simple_loss=0.3472, pruned_loss=0.1256, over 24293.00 frames. ], tot_loss[loss=0.337, simple_loss=0.3672, pruned_loss=0.1534, over 4696011.53 frames. ], batch size: 56, lr: 4.11e-02, grad_scale: 32.0 2023-09-28 13:32:28,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:28,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:32:28,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:32:28,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 13:32:30,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:32:33,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:34,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=33000.0, ans=0.125 2023-09-28 13:32:35,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 13:32:35,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=33000.0, ans=0.1 2023-09-28 13:32:37,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 13:32:38,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 13:32:38,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:32:40,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 13:32:40,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:40,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:32:40,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:32:40,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:32:42,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:32:43,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:32:45,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:32:46,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:32:48,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:32:48,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:32:53,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:32:58,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:00,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:33:02,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:02,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:02,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=33133.333333333336, ans=0.05 2023-09-28 13:33:03,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:33:05,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 13:33:05,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 13:33:09,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:10,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:33:12,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:33:12,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:33:12,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:33:14,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:33:16,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:19,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:33:20,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:33:22,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:33:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:23,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 13:33:25,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:33:25,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:33:30,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:33:32,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:33:32,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:33:32,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:34,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:33:34,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:33:37,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:33:37,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:33:38,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:33:38,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 13:33:42,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:33:49,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 13:33:50,636 INFO [train.py:1039] (3/4) Epoch 1, batch 5000, loss[loss=0.3282, simple_loss=0.3659, pruned_loss=0.1452, over 24284.00 frames. ], tot_loss[loss=0.3357, simple_loss=0.3664, pruned_loss=0.1525, over 4699748.13 frames. ], batch size: 61, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:33:50,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:33:57,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:33:57,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:33:58,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 13:33:59,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=33333.333333333336, ans=0.1 2023-09-28 13:34:02,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 13:34:04,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:04,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 13:34:05,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:34:05,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:34:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 13:34:07,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:07,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:09,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 13:34:09,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:09,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:10,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 13:34:12,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.427e+02 3.057e+02 3.609e+02 4.472e+02 7.216e+02, threshold=7.218e+02, percent-clipped=2.0 2023-09-28 13:34:12,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 13:34:12,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:34:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 13:34:12,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:34:14,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:15,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:34:15,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 13:34:15,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 13:34:17,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 13:34:17,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:18,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:19,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 13:34:19,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:34:20,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:22,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:34:24,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:34:25,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 13:34:26,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:34:27,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:34:32,094 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 13:34:34,316 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:34:36,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:34:37,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:34:37,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:34:42,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 13:34:42,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:34:43,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:34:43,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:34:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 13:34:46,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:49,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:34:50,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:34:55,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 13:35:00,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:02,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=33600.0, ans=0.0 2023-09-28 13:35:11,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:35:12,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:12,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:35:12,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:12,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:35:12,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:35:14,081 INFO [train.py:1039] (3/4) Epoch 1, batch 5050, loss[loss=0.3384, simple_loss=0.3908, pruned_loss=0.143, over 24309.00 frames. ], tot_loss[loss=0.3355, simple_loss=0.3667, pruned_loss=0.1522, over 4690755.97 frames. ], batch size: 74, lr: 4.10e-02, grad_scale: 16.0 2023-09-28 13:35:14,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:17,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=33666.666666666664, ans=0.0 2023-09-28 13:35:19,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:35:19,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 13:35:19,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:35:22,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:35:26,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:35:26,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 13:35:27,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.94 vs. limit=22.5 2023-09-28 13:35:27,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:27,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:35:30,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:35:32,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:35:32,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:35:32,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=33733.333333333336, ans=0.0 2023-09-28 13:35:42,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 13:35:44,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:35:46,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:35:46,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 13:35:47,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:35:48,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:49,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:35:49,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:35:49,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 13:35:51,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 13:35:52,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:54,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:35:57,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:35:57,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 13:36:00,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:01,147 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-09-28 13:36:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 13:36:05,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:36:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:36:07,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:07,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:36:09,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:12,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:36:12,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:12,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:36:14,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:36:14,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 13:36:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:36:18,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:36:23,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:36:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 13:36:23,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:36:24,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:24,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:26,116 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 13:36:29,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:30,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 13:36:30,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:34,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:36:34,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:36:34,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 13:36:36,984 INFO [train.py:1039] (3/4) Epoch 1, batch 5100, loss[loss=0.2517, simple_loss=0.3057, pruned_loss=0.09882, over 24340.00 frames. ], tot_loss[loss=0.3342, simple_loss=0.3667, pruned_loss=0.1508, over 4709206.08 frames. ], batch size: 56, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:36:37,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 13:36:38,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:38,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:36:39,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:36:42,592 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 13:36:46,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:36:47,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 13:36:49,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 13:36:49,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:36:49,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:36:52,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:36:54,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 13:36:54,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 13:36:58,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=34066.666666666664, ans=0.1 2023-09-28 13:36:59,954 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 3.081e+02 3.841e+02 4.720e+02 7.459e+02, threshold=7.682e+02, percent-clipped=1.0 2023-09-28 13:37:00,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:37:00,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=27.18 vs. limit=22.5 2023-09-28 13:37:01,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:37:06,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:37:09,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 13:37:09,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:13,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:37:13,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 13:37:17,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:17,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 13:37:20,084 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 13:37:21,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:21,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 13:37:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 13:37:26,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:37:32,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:37:36,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 13:37:38,285 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 13:37:38,321 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 13:37:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 13:37:41,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:37:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 13:37:46,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 13:37:49,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 13:37:51,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:37:53,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 13:37:55,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=34266.666666666664, ans=0.125 2023-09-28 13:37:56,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:37:56,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 13:37:59,165 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.95 vs. limit=12.0 2023-09-28 13:38:01,418 INFO [train.py:1039] (3/4) Epoch 1, batch 5150, loss[loss=0.3499, simple_loss=0.3672, pruned_loss=0.1663, over 23792.00 frames. ], tot_loss[loss=0.3364, simple_loss=0.3679, pruned_loss=0.1524, over 4708824.87 frames. ], batch size: 179, lr: 4.09e-02, grad_scale: 16.0 2023-09-28 13:38:03,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:38:03,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:38:03,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:38:04,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:38:04,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:38:06,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:38:06,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 13:38:06,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 13:38:06,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=34333.333333333336, ans=0.125 2023-09-28 13:38:07,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 13:38:07,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:38:07,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 13:38:11,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:11,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:38:13,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:15,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:38:19,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:38:19,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 13:38:19,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:38:19,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:38:22,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=34400.0, ans=0.125 2023-09-28 13:38:23,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 13:38:23,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:23,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:25,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:38:25,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:38:25,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 13:38:26,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:38:26,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:38:29,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:38:29,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.67 vs. limit=15.0 2023-09-28 13:38:32,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 13:38:33,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:38:35,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=34466.666666666664, ans=0.0 2023-09-28 13:38:36,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.min_positive, batch_count=34466.666666666664, ans=0.025 2023-09-28 13:38:37,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.57 vs. limit=6.0 2023-09-28 13:38:39,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:38:43,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 13:38:46,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:38:51,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.82 vs. limit=6.0 2023-09-28 13:38:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:38:55,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:00,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:00,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 13:39:04,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=34533.333333333336, ans=0.0033623188405797096 2023-09-28 13:39:07,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:39:08,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-09-28 13:39:08,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:39:09,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:39:12,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:39:13,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 13:39:14,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.03 vs. limit=15.0 2023-09-28 13:39:20,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:39:21,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:39:24,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:39:24,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:39:24,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:39:25,901 INFO [train.py:1039] (3/4) Epoch 1, batch 5200, loss[loss=0.3133, simple_loss=0.3692, pruned_loss=0.1287, over 24633.00 frames. ], tot_loss[loss=0.3366, simple_loss=0.3693, pruned_loss=0.1519, over 4717363.11 frames. ], batch size: 68, lr: 4.08e-02, grad_scale: 32.0 2023-09-28 13:39:25,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:39:26,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:39:27,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:39:30,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:39:32,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:39:35,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:39,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 13:39:41,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:39:42,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:39:44,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:39:44,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:39:46,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 13:39:47,263 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.249e+02 2.894e+02 3.453e+02 4.172e+02 7.980e+02, threshold=6.907e+02, percent-clipped=1.0 2023-09-28 13:39:47,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:39:49,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:39:50,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=18.26 vs. limit=15.0 2023-09-28 13:39:51,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 13:39:54,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:39:54,474 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:39:55,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:39:55,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 13:39:57,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 13:39:59,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 13:40:00,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:00,639 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 13:40:00,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:40:02,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:02,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:40:03,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 13:40:03,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:07,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:12,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 13:40:12,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 13:40:12,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 13:40:17,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=34866.666666666664, ans=0.125 2023-09-28 13:40:18,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 13:40:18,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:40:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:40:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:28,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 13:40:28,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:40:28,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 13:40:30,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:30,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:40:33,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:33,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:40:34,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=34933.333333333336, ans=0.125 2023-09-28 13:40:37,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:40:39,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:39,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:44,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=34933.333333333336, ans=0.125 2023-09-28 13:40:47,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:40:47,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 13:40:47,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:40:49,066 INFO [train.py:1039] (3/4) Epoch 1, batch 5250, loss[loss=0.3111, simple_loss=0.3715, pruned_loss=0.1254, over 24642.00 frames. ], tot_loss[loss=0.3354, simple_loss=0.3681, pruned_loss=0.1514, over 4722486.95 frames. ], batch size: 73, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:40:49,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:40:49,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:40:49,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:40:50,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:40:53,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:40:55,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=35000.0, ans=10.0 2023-09-28 13:40:56,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:40:56,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:40:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:41:03,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:41:05,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:41:10,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:41:11,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:41:13,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 13:41:13,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:41:14,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:02,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=35333.333333333336, ans=0.1 2023-09-28 13:42:03,701 INFO [train.py:1039] (3/4) Epoch 1, batch 5300, loss[loss=0.3586, simple_loss=0.4004, pruned_loss=0.1584, over 24334.00 frames. ], tot_loss[loss=0.3346, simple_loss=0.3673, pruned_loss=0.1509, over 4715201.03 frames. ], batch size: 77, lr: 4.07e-02, grad_scale: 32.0 2023-09-28 13:42:18,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:42:18,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 13:42:18,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 13:42:18,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:19,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:19,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:19,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:19,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:42:20,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:20,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:42:20,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:42:20,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 13:42:20,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 13:42:20,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 13:42:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 13:42:21,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 13:42:21,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 13:42:21,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:21,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:22,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:22,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:42:23,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:23,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:42:23,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:23,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:42:23,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:42:23,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:42:23,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:23,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:42:24,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 13:42:24,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:42:24,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:42:24,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 13:42:24,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 13:42:25,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:42:25,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:42:25,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 13:42:25,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 13:42:25,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:26,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:42:26,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:42:27,010 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 13:42:27,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 13:42:27,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:42:27,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:42:27,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 13:42:27,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 13:42:27,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 13:42:27,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:42:40,297 INFO [train.py:1039] (3/4) Epoch 2, batch 0, loss[loss=0.332, simple_loss=0.3695, pruned_loss=0.1473, over 24312.00 frames. ], tot_loss[loss=0.332, simple_loss=0.3695, pruned_loss=0.1473, over 24312.00 frames. ], batch size: 61, lr: 3.99e-02, grad_scale: 32.0 2023-09-28 13:42:40,297 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 13:42:56,272 INFO [train.py:1071] (3/4) Epoch 2, validation: loss=0.367, simple_loss=0.3421, pruned_loss=0.196, over 1125622.00 frames. 2023-09-28 13:42:56,273 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 13:42:57,795 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.107e+02 3.100e+02 3.616e+02 4.753e+02 9.571e+02, threshold=7.232e+02, percent-clipped=1.0 2023-09-28 13:43:00,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 13:43:00,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:43:02,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:43:07,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:07,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:43:07,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:08,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 13:43:10,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 13:43:13,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:14,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:43:17,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:19,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:43:19,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:21,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=35480.0, ans=0.5 2023-09-28 13:43:22,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 13:43:24,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:43:33,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:43:33,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:43:33,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=35546.666666666664, ans=0.125 2023-09-28 13:43:35,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 13:43:41,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:43:41,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:43:44,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:49,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:43:49,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=35613.333333333336, ans=0.1 2023-09-28 13:43:49,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=35613.333333333336, ans=0.0031275362318840573 2023-09-28 13:43:53,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:43:53,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=35613.333333333336, ans=0.125 2023-09-28 13:44:00,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 13:44:02,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 13:44:02,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:02,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:03,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:44:05,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:05,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 13:44:08,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:10,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:44:15,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:44:19,258 INFO [train.py:1039] (3/4) Epoch 2, batch 50, loss[loss=0.3393, simple_loss=0.3652, pruned_loss=0.1567, over 23413.00 frames. ], tot_loss[loss=0.3291, simple_loss=0.3632, pruned_loss=0.1475, over 1060682.09 frames. ], batch size: 134, lr: 3.98e-02, grad_scale: 32.0 2023-09-28 13:44:19,320 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 13:44:19,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:44:22,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:24,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:44:24,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 13:44:25,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 13:44:25,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:44:27,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=35746.666666666664, ans=0.125 2023-09-28 13:44:30,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:32,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:44:35,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:44:37,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 13:44:37,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:43,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:44:44,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 13:44:46,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 13:44:49,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:44:49,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=35813.333333333336, ans=0.07 2023-09-28 13:44:52,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:44:52,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:44:52,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:44:53,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.18 vs. limit=15.0 2023-09-28 13:44:54,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:44:54,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 13:44:54,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:45:02,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:04,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=35880.0, ans=0.125 2023-09-28 13:45:05,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:05,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:45:05,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 13:45:07,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 13:45:08,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:45:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 13:45:10,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:12,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 13:45:18,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:45:18,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:45:22,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:22,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:22,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:26,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 13:45:27,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 13:45:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:45:30,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:45:30,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:45:30,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 13:45:30,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 13:45:32,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 13:45:32,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:33,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:45:35,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 13:45:35,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 13:45:35,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:45:37,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:38,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:45:38,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:45:42,079 INFO [train.py:1039] (3/4) Epoch 2, batch 100, loss[loss=0.3446, simple_loss=0.3704, pruned_loss=0.1593, over 23597.00 frames. ], tot_loss[loss=0.3339, simple_loss=0.368, pruned_loss=0.1499, over 1865735.92 frames. ], batch size: 232, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:45:43,574 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.251e+02 2.783e+02 3.462e+02 4.523e+02 1.049e+03, threshold=6.924e+02, percent-clipped=4.0 2023-09-28 13:45:43,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:45:45,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:45:46,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.48 vs. limit=15.0 2023-09-28 13:45:46,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=36080.0, ans=15.0 2023-09-28 13:45:48,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:45:50,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 13:45:50,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:45:50,724 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=2.554e-03 2023-09-28 13:45:56,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:45:56,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:56,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:45:56,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:45:56,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=36080.0, ans=0.125 2023-09-28 13:45:57,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:45:59,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 13:46:00,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:46:00,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:02,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:02,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:46:05,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 13:46:07,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:08,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:10,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:46:12,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:46:15,778 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 13:46:15,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 13:46:16,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:46:16,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:46:20,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:46:22,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:46:23,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:31,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:32,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 13:46:34,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:46:36,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=36280.0, ans=0.125 2023-09-28 13:46:39,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:46:41,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:46:42,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:45,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.45 vs. limit=10.0 2023-09-28 13:46:45,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:48,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:46:49,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:46:52,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:54,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:46:54,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:55,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:46:55,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:46:57,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 13:46:57,384 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 13:46:57,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:46:58,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:46:59,868 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.24 vs. limit=5.0 2023-09-28 13:47:01,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:01,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:01,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 13:47:01,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:47:01,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:47:01,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:04,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:05,405 INFO [train.py:1039] (3/4) Epoch 2, batch 150, loss[loss=0.3508, simple_loss=0.3891, pruned_loss=0.1563, over 24028.00 frames. ], tot_loss[loss=0.3346, simple_loss=0.3687, pruned_loss=0.1503, over 2490754.95 frames. ], batch size: 80, lr: 3.97e-02, grad_scale: 32.0 2023-09-28 13:47:05,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:05,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:47:07,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:47:08,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:47:14,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:47:14,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:14,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:17,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:47:17,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:20,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:47:20,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:20,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=36480.0, ans=0.0 2023-09-28 13:47:25,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 13:47:25,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=36480.0, ans=0.2 2023-09-28 13:47:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 13:47:27,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 13:47:30,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:47:30,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:47:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:47:33,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:47:33,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:33,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:33,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:47:34,970 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 13:47:38,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:47:44,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:47,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 13:47:50,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 13:47:53,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=12.0 2023-09-28 13:47:54,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:47:56,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:47:56,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:47:59,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:48:01,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:48:01,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=36613.333333333336, ans=0.2 2023-09-28 13:48:02,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:48:04,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:06,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 13:48:08,644 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:48:11,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:11,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:11,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:48:11,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:48:13,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=36680.0, ans=0.125 2023-09-28 13:48:14,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:14,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=36680.0, ans=0.125 2023-09-28 13:48:16,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 13:48:19,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 13:48:21,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:48:21,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:24,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:48:24,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 13:48:24,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:48:25,021 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 13:48:30,008 INFO [train.py:1039] (3/4) Epoch 2, batch 200, loss[loss=0.3211, simple_loss=0.3556, pruned_loss=0.1433, over 17229.00 frames. ], tot_loss[loss=0.3358, simple_loss=0.3693, pruned_loss=0.1511, over 2975617.84 frames. ], batch size: 37, lr: 3.96e-02, grad_scale: 32.0 2023-09-28 13:48:30,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:31,380 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.151e+02 2.761e+02 3.224e+02 4.160e+02 8.294e+02, threshold=6.447e+02, percent-clipped=1.0 2023-09-28 13:48:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:48:34,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:48:37,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=36746.666666666664, ans=0.2 2023-09-28 13:48:38,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 13:48:39,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:48:40,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 13:48:44,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 13:48:45,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:48:45,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:48:50,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:48:52,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:48:52,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:10,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:49:10,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:49:12,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 13:49:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:49:13,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 13:49:13,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:49:13,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:15,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:49:15,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:16,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:18,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 13:49:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 13:49:19,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:23,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:49:27,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:49:35,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:35,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:49:44,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:45,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 13:49:47,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:49:47,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:49:47,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:49:47,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 13:49:51,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 13:49:51,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=37080.0, ans=0.125 2023-09-28 13:49:52,437 INFO [train.py:1039] (3/4) Epoch 2, batch 250, loss[loss=0.3757, simple_loss=0.3924, pruned_loss=0.1795, over 23620.00 frames. ], tot_loss[loss=0.3375, simple_loss=0.3711, pruned_loss=0.152, over 3362207.96 frames. ], batch size: 149, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:49:52,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:49:52,572 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 13:49:54,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:55,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:49:57,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:49:57,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:50:00,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:50:01,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:50:03,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:50:09,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:50:10,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=37146.666666666664, ans=0.1 2023-09-28 13:50:21,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:24,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:50:24,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:50:31,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 13:50:33,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 13:50:34,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:50:34,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:34,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=37213.333333333336, ans=0.125 2023-09-28 13:50:35,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.87 vs. limit=6.0 2023-09-28 13:50:36,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:50:36,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:50:36,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:50:37,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:50:40,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 13:50:40,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:50:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:50:43,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 13:50:43,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:50:44,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.57 vs. limit=15.0 2023-09-28 13:50:45,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:50:46,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:50:46,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:50:46,776 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:50:46,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=37280.0, ans=0.0 2023-09-28 13:50:49,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:50:51,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 13:50:52,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:50:57,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 13:51:01,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:02,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:51:08,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:11,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:51:11,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=37346.666666666664, ans=0.00275072463768116 2023-09-28 13:51:14,099 INFO [train.py:1039] (3/4) Epoch 2, batch 300, loss[loss=0.3526, simple_loss=0.3949, pruned_loss=0.1552, over 24516.00 frames. ], tot_loss[loss=0.3344, simple_loss=0.3676, pruned_loss=0.1506, over 3656487.94 frames. ], batch size: 71, lr: 3.95e-02, grad_scale: 32.0 2023-09-28 13:51:14,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 13:51:15,703 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.217e+02 3.009e+02 3.543e+02 4.126e+02 1.008e+03, threshold=7.086e+02, percent-clipped=8.0 2023-09-28 13:51:15,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:51:15,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 13:51:18,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 13:51:18,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 13:51:19,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:51:19,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 13:51:23,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:51:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:51:29,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 13:51:29,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 13:51:31,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:51:31,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 13:51:31,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 13:51:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:38,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=37480.0, ans=0.0 2023-09-28 13:51:39,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 13:51:42,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:51:42,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 13:51:45,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 13:51:47,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:49,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:51:51,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:51:51,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 13:51:51,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 13:51:55,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:51:57,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:51:57,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:51:59,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=37546.666666666664, ans=0.1 2023-09-28 13:52:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 13:52:01,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 13:52:04,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:52:07,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:10,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 13:52:10,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:16,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:52:17,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:52:17,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 13:52:22,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:22,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 13:52:25,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:27,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:52:27,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 13:52:29,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 13:52:29,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:52:30,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 13:52:32,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:52:33,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:33,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:34,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=37680.0, ans=0.05 2023-09-28 13:52:35,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:36,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:37,466 INFO [train.py:1039] (3/4) Epoch 2, batch 350, loss[loss=0.2854, simple_loss=0.3321, pruned_loss=0.1194, over 22365.00 frames. ], tot_loss[loss=0.3309, simple_loss=0.3641, pruned_loss=0.1489, over 3889441.96 frames. ], batch size: 49, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:52:40,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:41,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 13:52:44,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:51,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:52:54,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:52:54,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:52:56,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 13:52:57,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:52:58,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 13:52:59,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:00,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=37813.333333333336, ans=0.0026492753623188394 2023-09-28 13:53:01,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 13:53:01,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=37813.333333333336, ans=0.0 2023-09-28 13:53:03,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:06,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 13:53:08,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:53:10,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:53:12,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:53:13,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:13,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:15,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:15,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:15,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:53:18,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:53:18,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:25,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:53:25,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 13:53:26,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:53:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:28,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=37946.666666666664, ans=0.0 2023-09-28 13:53:31,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 13:53:31,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:53:38,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:53:38,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:38,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:53:39,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 13:53:42,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:42,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 13:53:44,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 13:53:44,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:48,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 13:53:48,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 13:53:49,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:51,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 13:53:55,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:53:56,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:53:56,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:53:58,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:54:01,047 INFO [train.py:1039] (3/4) Epoch 2, batch 400, loss[loss=0.2856, simple_loss=0.3283, pruned_loss=0.1214, over 21419.00 frames. ], tot_loss[loss=0.3286, simple_loss=0.363, pruned_loss=0.1471, over 4074078.73 frames. ], batch size: 47, lr: 3.94e-02, grad_scale: 32.0 2023-09-28 13:54:01,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:54:02,560 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.015e+02 2.905e+02 3.509e+02 4.327e+02 7.986e+02, threshold=7.018e+02, percent-clipped=1.0 2023-09-28 13:54:05,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 13:54:07,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 13:54:07,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:09,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:54:09,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:12,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:14,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:17,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 13:54:18,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 13:54:18,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:20,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 13:54:22,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:24,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:54:24,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 13:54:26,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:54:26,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:54:26,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:54:26,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:54:30,377 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 13:54:30,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 13:54:35,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:54:36,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:54:38,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 13:54:40,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 13:54:42,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:54:45,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:54:51,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 13:54:53,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 13:54:55,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 13:54:57,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=38280.0, ans=0.125 2023-09-28 13:55:00,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:55:01,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:55:01,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 13:55:02,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=38280.0, ans=0.125 2023-09-28 13:55:05,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:55:08,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 13:55:10,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:55:13,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:15,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 13:55:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 13:55:18,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 13:55:18,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 13:55:20,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:55:23,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 13:55:24,945 INFO [train.py:1039] (3/4) Epoch 2, batch 450, loss[loss=0.3615, simple_loss=0.371, pruned_loss=0.176, over 22811.00 frames. ], tot_loss[loss=0.3286, simple_loss=0.3635, pruned_loss=0.1468, over 4222891.40 frames. ], batch size: 322, lr: 3.93e-02, grad_scale: 32.0 2023-09-28 13:55:25,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:55:25,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=38413.333333333336, ans=0.05 2023-09-28 13:55:26,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:55:26,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:55:30,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 13:55:30,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:55:31,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:55:31,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:55:31,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 13:55:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 13:55:33,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 13:55:37,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 13:55:47,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:47,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:55:51,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 13:55:51,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 13:55:55,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:55:56,182 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.74 vs. limit=15.0 2023-09-28 13:55:57,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=2.86 vs. limit=15.0 2023-09-28 13:55:58,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:55:58,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:00,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=38546.666666666664, ans=0.0 2023-09-28 13:56:03,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:04,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:56:08,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 13:56:08,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 13:56:10,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 13:56:10,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:12,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 13:56:13,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.49 vs. limit=15.0 2023-09-28 13:56:15,980 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 13:56:15,994 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 13:56:16,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:56:18,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:56:19,188 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 13:56:20,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 13:56:23,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 13:56:23,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 13:56:25,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 13:56:25,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 13:56:27,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:27,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=38613.333333333336, ans=0.002475362318840579 2023-09-28 13:56:30,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 13:56:30,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 13:56:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 13:56:36,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 13:56:37,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 13:56:38,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 13:56:39,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 13:56:40,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.13 vs. limit=15.0 2023-09-28 13:56:45,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:56:46,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:56:47,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=38746.666666666664, ans=0.0 2023-09-28 13:56:48,206 INFO [train.py:1039] (3/4) Epoch 2, batch 500, loss[loss=0.3468, simple_loss=0.3778, pruned_loss=0.1579, over 23533.00 frames. ], tot_loss[loss=0.3292, simple_loss=0.3639, pruned_loss=0.1473, over 4326304.87 frames. ], batch size: 105, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:56:48,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 13:56:48,402 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 13:56:50,434 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.109e+02 2.855e+02 3.493e+02 4.304e+02 8.305e+02, threshold=6.986e+02, percent-clipped=1.0 2023-09-28 13:56:53,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:56:53,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 13:56:55,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:55,172 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 13:56:56,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 13:56:56,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:56:59,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=38746.666666666664, ans=15.0 2023-09-28 13:57:00,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 13:57:05,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 13:57:06,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 13:57:08,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:57:08,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:57:09,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:12,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.48 vs. limit=6.0 2023-09-28 13:57:13,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=38813.333333333336, ans=0.1 2023-09-28 13:57:15,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=38813.333333333336, ans=0.125 2023-09-28 13:57:18,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:18,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 13:57:19,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 13:57:19,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:21,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 13:57:21,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 13:57:25,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:57:25,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 13:57:27,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 13:57:27,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:57:28,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 13:57:33,687 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 13:57:35,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=38880.0, ans=0.0 2023-09-28 13:57:38,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:57:38,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:39,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:41,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 13:57:43,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 13:57:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 13:57:47,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:57:50,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:57:53,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:57:58,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 13:58:01,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:01,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:06,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 13:58:08,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 13:58:09,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:11,451 INFO [train.py:1039] (3/4) Epoch 2, batch 550, loss[loss=0.3429, simple_loss=0.3951, pruned_loss=0.1453, over 24553.00 frames. ], tot_loss[loss=0.329, simple_loss=0.3645, pruned_loss=0.1468, over 4423264.27 frames. ], batch size: 71, lr: 3.92e-02, grad_scale: 32.0 2023-09-28 13:58:14,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 13:58:16,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 13:58:16,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:16,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 13:58:16,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=39080.0, ans=0.125 2023-09-28 13:58:17,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 13:58:17,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:58:19,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:19,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:19,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=39080.0, ans=0.0 2023-09-28 13:58:20,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 13:58:20,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 13:58:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:58:23,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 13:58:23,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 13:58:25,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=39146.666666666664, ans=0.125 2023-09-28 13:58:29,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:30,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:31,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:58:33,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:35,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 13:58:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 13:58:38,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 13:58:41,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=39146.666666666664, ans=0.2 2023-09-28 13:58:41,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=39146.666666666664, ans=0.125 2023-09-28 13:58:43,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:58:43,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:44,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 13:58:47,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:47,952 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 13:58:49,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 13:58:50,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 13:58:52,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 13:58:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 13:58:54,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 13:58:55,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:58:55,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 13:58:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 13:58:59,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:58:59,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 13:59:01,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:59:01,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 13:59:04,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 13:59:06,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 13:59:11,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 13:59:13,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=39280.0, ans=0.0 2023-09-28 13:59:14,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:14,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 13:59:15,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-09-28 13:59:15,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 13:59:17,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:18,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 13:59:20,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:22,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 13:59:22,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 13:59:29,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 13:59:32,915 INFO [train.py:1039] (3/4) Epoch 2, batch 600, loss[loss=0.3052, simple_loss=0.3376, pruned_loss=0.1364, over 24303.00 frames. ], tot_loss[loss=0.328, simple_loss=0.3633, pruned_loss=0.1464, over 4497381.09 frames. ], batch size: 56, lr: 3.91e-02, grad_scale: 32.0 2023-09-28 13:59:33,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 13:59:34,963 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.924e+02 3.724e+02 4.722e+02 8.175e+02, threshold=7.448e+02, percent-clipped=4.0 2023-09-28 13:59:35,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=39413.333333333336, ans=0.2 2023-09-28 13:59:36,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 13:59:36,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 13:59:36,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 13:59:44,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 13:59:45,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 13:59:47,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 13:59:49,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 13:59:52,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 13:59:54,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 13:59:54,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=39480.0, ans=0.95 2023-09-28 13:59:55,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 13:59:55,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 13:59:59,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.12 vs. limit=22.5 2023-09-28 14:00:04,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 14:00:06,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:00:06,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:08,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:00:12,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=39546.666666666664, ans=0.04949747468305833 2023-09-28 14:00:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:00:13,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:00:13,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:20,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:00:26,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.24 vs. limit=22.5 2023-09-28 14:00:27,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:00:27,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:00:27,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:00:29,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.05 vs. limit=15.0 2023-09-28 14:00:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 14:00:38,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=39680.0, ans=0.125 2023-09-28 14:00:39,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:00:39,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:00:44,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 14:00:46,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:00:46,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=39680.0, ans=0.002243478260869565 2023-09-28 14:00:49,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.39 vs. limit=15.0 2023-09-28 14:00:50,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 14:00:50,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:00:50,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:00:51,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=39680.0, ans=0.2 2023-09-28 14:00:57,132 INFO [train.py:1039] (3/4) Epoch 2, batch 650, loss[loss=0.3475, simple_loss=0.3655, pruned_loss=0.1648, over 23862.00 frames. ], tot_loss[loss=0.3261, simple_loss=0.3616, pruned_loss=0.1453, over 4548228.72 frames. ], batch size: 195, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:00:57,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:00:58,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.71 vs. limit=22.5 2023-09-28 14:01:00,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:01:01,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:03,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:01:04,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:08,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 14:01:09,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:01:09,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=39746.666666666664, ans=0.1 2023-09-28 14:01:14,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:01:14,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:17,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:22,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 14:01:25,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:25,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:01:30,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:01:31,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:01:32,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:33,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:34,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:01:36,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:36,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=39880.0, ans=0.2 2023-09-28 14:01:38,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:01:39,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:01:40,955 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 14:01:40,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:01:40,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:42,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:44,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:44,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:01:45,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:01:47,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 14:01:47,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:01:49,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:01:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:01:49,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:01:51,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:01:52,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 14:01:54,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 14:01:55,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:01:55,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:01:56,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:01:56,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:01:58,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:02:05,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:05,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:02:10,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:10,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:02:11,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:02:19,115 INFO [train.py:1039] (3/4) Epoch 2, batch 700, loss[loss=0.3014, simple_loss=0.3354, pruned_loss=0.1337, over 23672.00 frames. ], tot_loss[loss=0.3246, simple_loss=0.3595, pruned_loss=0.1448, over 4561890.96 frames. ], batch size: 135, lr: 3.90e-02, grad_scale: 32.0 2023-09-28 14:02:19,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:02:19,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:20,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.84 vs. limit=15.0 2023-09-28 14:02:20,672 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.820e+02 3.434e+02 4.210e+02 9.710e+02, threshold=6.868e+02, percent-clipped=2.0 2023-09-28 14:02:20,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:20,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:02:24,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 14:02:26,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 14:02:27,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.67 vs. limit=10.0 2023-09-28 14:02:29,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 14:02:29,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:33,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:02:35,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 14:02:37,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=40146.666666666664, ans=0.1 2023-09-28 14:02:37,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:38,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:02:42,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:02:42,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:43,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:45,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:02:46,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:02:49,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:02:50,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=40146.666666666664, ans=0.125 2023-09-28 14:02:51,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:02:51,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:02:54,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 14:02:54,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=40213.333333333336, ans=0.125 2023-09-28 14:02:57,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 14:03:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:03:02,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:03:03,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:03:08,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:03:08,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 14:03:14,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:15,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:03:15,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 14:03:21,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:03:21,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:24,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:03:29,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:03:29,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 14:03:35,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 14:03:35,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 14:03:38,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:40,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:40,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:03:41,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:41,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 14:03:43,879 INFO [train.py:1039] (3/4) Epoch 2, batch 750, loss[loss=0.2804, simple_loss=0.3226, pruned_loss=0.119, over 24416.00 frames. ], tot_loss[loss=0.3238, simple_loss=0.3589, pruned_loss=0.1443, over 4603375.06 frames. ], batch size: 58, lr: 3.89e-02, grad_scale: 32.0 2023-09-28 14:03:45,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 14:03:47,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 14:03:47,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 14:03:48,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 14:03:48,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 14:03:48,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:03:51,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 14:03:53,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:03:53,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:03:54,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:03:56,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:03:56,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:03:56,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=40413.333333333336, ans=0.1 2023-09-28 14:03:57,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:03:59,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:04:00,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:04:02,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:04:05,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:05,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:07,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 14:04:09,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:04:11,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:04:14,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:04:16,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 14:04:16,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:04:19,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 14:04:19,667 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 14:04:19,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 14:04:21,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:04:21,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:04:22,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:04:28,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:04:28,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:28,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:04:32,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:04:35,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:04:35,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 14:04:35,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:04:36,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 14:04:36,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:04:40,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:04:42,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 14:04:44,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:04:48,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:04:50,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:04:51,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:04:54,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:04:57,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 14:04:57,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:04:59,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:02,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:05,534 INFO [train.py:1039] (3/4) Epoch 2, batch 800, loss[loss=0.3327, simple_loss=0.3636, pruned_loss=0.1509, over 23859.00 frames. ], tot_loss[loss=0.324, simple_loss=0.3598, pruned_loss=0.1441, over 4632207.12 frames. ], batch size: 179, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:05:05,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:05,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:05:07,068 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.053e+02 2.783e+02 3.464e+02 4.160e+02 6.985e+02, threshold=6.929e+02, percent-clipped=3.0 2023-09-28 14:05:14,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:14,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:17,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:05:17,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:18,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:20,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=12.0 2023-09-28 14:05:20,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:22,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:26,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:27,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:05:30,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 14:05:30,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=40813.333333333336, ans=0.2 2023-09-28 14:05:31,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:31,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:05:33,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:05:33,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:34,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 14:05:34,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:34,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 14:05:37,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:39,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:05:41,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:05:41,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:05:43,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=40880.0, ans=0.125 2023-09-28 14:05:45,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:45,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:05:52,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:05:52,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:05:52,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 14:05:53,858 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 14:05:53,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 14:05:53,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:05:53,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:05:57,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:05:57,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:05:57,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=40946.666666666664, ans=0.0019681159420289855 2023-09-28 14:06:03,298 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 14:06:03,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 14:06:05,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=40946.666666666664, ans=0.125 2023-09-28 14:06:06,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:06:07,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:06:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:06:11,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=41013.333333333336, ans=0.125 2023-09-28 14:06:15,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:15,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 14:06:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:06:19,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 14:06:25,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:28,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:06:28,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 14:06:28,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=41080.0, ans=0.125 2023-09-28 14:06:29,436 INFO [train.py:1039] (3/4) Epoch 2, batch 850, loss[loss=0.3291, simple_loss=0.3539, pruned_loss=0.1521, over 23493.00 frames. ], tot_loss[loss=0.3248, simple_loss=0.3607, pruned_loss=0.1445, over 4653508.17 frames. ], batch size: 134, lr: 3.88e-02, grad_scale: 32.0 2023-09-28 14:06:29,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:06:29,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:31,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 14:06:31,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:31,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:06:33,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:37,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:06:37,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:06:38,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 14:06:40,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 14:06:40,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 14:06:41,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:06:41,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:06:44,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:06:44,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:06:46,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:06:50,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:06:50,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:06:50,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 14:06:54,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 14:06:57,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=41146.666666666664, ans=0.04949747468305833 2023-09-28 14:06:58,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=41146.666666666664, ans=0.125 2023-09-28 14:06:59,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:07:01,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 14:07:03,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=41213.333333333336, ans=0.0 2023-09-28 14:07:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 14:07:05,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 14:07:06,059 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.88 vs. limit=15.0 2023-09-28 14:07:08,428 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 14:07:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:08,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:07:08,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:07:11,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:12,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 14:07:17,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:07:19,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:19,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=41280.0, ans=0.125 2023-09-28 14:07:20,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:07:20,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:07:21,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:07:22,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:07:22,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 14:07:23,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=41280.0, ans=0.125 2023-09-28 14:07:28,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:07:28,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:28,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:07:30,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:30,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=41280.0, ans=0.001895652173913043 2023-09-28 14:07:32,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:07:33,231 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.88 vs. limit=15.0 2023-09-28 14:07:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:07:37,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:07:39,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:07:39,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:07:41,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:07:49,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:07:49,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:07:51,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 14:07:51,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:07:51,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:07:53,222 INFO [train.py:1039] (3/4) Epoch 2, batch 900, loss[loss=0.3172, simple_loss=0.3446, pruned_loss=0.1449, over 23325.00 frames. ], tot_loss[loss=0.326, simple_loss=0.3619, pruned_loss=0.145, over 4673594.65 frames. ], batch size: 119, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:07:54,703 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.862e+02 3.366e+02 4.167e+02 7.237e+02, threshold=6.733e+02, percent-clipped=1.0 2023-09-28 14:07:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 14:07:59,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:08:02,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:02,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 14:08:05,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=41413.333333333336, ans=0.0 2023-09-28 14:08:06,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:08:07,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 14:08:09,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:08:09,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:08:09,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:11,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:08:11,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:08:24,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:24,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:08:24,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:08:28,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:08:31,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 14:08:33,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:08:33,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=41546.666666666664, ans=0.125 2023-09-28 14:08:39,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:08:40,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:08:42,596 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 14:08:42,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 14:08:48,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:08:48,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:08:49,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.05 vs. limit=6.0 2023-09-28 14:08:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:08:55,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.92 vs. limit=15.0 2023-09-28 14:08:58,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:08:58,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:08:59,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 14:08:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:09:04,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 14:09:06,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:09:06,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:07,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:09:07,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:09,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 14:09:11,129 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 14:09:14,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:09:14,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 14:09:14,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=41746.666666666664, ans=0.1 2023-09-28 14:09:16,240 INFO [train.py:1039] (3/4) Epoch 2, batch 950, loss[loss=0.3256, simple_loss=0.3392, pruned_loss=0.156, over 22764.00 frames. ], tot_loss[loss=0.3258, simple_loss=0.3619, pruned_loss=0.1449, over 4693475.23 frames. ], batch size: 322, lr: 3.87e-02, grad_scale: 32.0 2023-09-28 14:09:17,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:21,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 14:09:26,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:30,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:30,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:31,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:09:33,472 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 14:09:37,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:09:38,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:39,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:09:40,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:09:40,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 14:09:40,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=41813.333333333336, ans=0.0017797101449275356 2023-09-28 14:09:41,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:09:43,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:43,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=41813.333333333336, ans=0.0017797101449275356 2023-09-28 14:09:44,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 14:09:46,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:49,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:09:49,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:09:51,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:09:51,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 14:09:53,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:09:55,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:09:56,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:10:02,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:02,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:10:06,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 14:10:08,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=41946.666666666664, ans=0.0 2023-09-28 14:10:09,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:10:09,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:10:09,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:09,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:09,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:10:10,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=41946.666666666664, ans=0.2 2023-09-28 14:10:13,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 14:10:16,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:10:18,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:10:19,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:19,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 14:10:19,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:19,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:10:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 14:10:24,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=42013.333333333336, ans=0.2 2023-09-28 14:10:26,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:10:30,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:10:35,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:37,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 14:10:37,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 14:10:39,898 INFO [train.py:1039] (3/4) Epoch 2, batch 1000, loss[loss=0.3143, simple_loss=0.3352, pruned_loss=0.1467, over 22836.00 frames. ], tot_loss[loss=0.3264, simple_loss=0.3615, pruned_loss=0.1457, over 4695560.29 frames. ], batch size: 322, lr: 3.86e-02, grad_scale: 16.0 2023-09-28 14:10:41,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:10:42,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.243e+02 2.891e+02 3.339e+02 3.802e+02 9.955e+02, threshold=6.678e+02, percent-clipped=4.0 2023-09-28 14:10:43,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=42080.0, ans=0.0 2023-09-28 14:10:44,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 14:10:44,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:10:51,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:10:52,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 14:10:52,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 14:10:54,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=42146.666666666664, ans=0.125 2023-09-28 14:10:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:10:57,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:10:59,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:03,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 14:11:06,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 14:11:09,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 14:11:09,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:11,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 14:11:13,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 14:11:13,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 14:11:14,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:16,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:24,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:11:26,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:27,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:27,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 14:11:27,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:11:29,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:11:29,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:11:31,386 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 14:11:34,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 14:11:34,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=42280.0, ans=0.125 2023-09-28 14:11:36,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 14:11:37,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 14:11:39,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:11:41,685 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:11:46,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:46,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:11:46,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:11:48,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:11:48,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=42346.666666666664, ans=0.1 2023-09-28 14:11:49,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 14:11:51,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:11:51,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 14:11:51,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 14:11:52,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:11:52,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:11:54,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:11:54,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=42346.666666666664, ans=0.0 2023-09-28 14:11:58,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:12:00,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:00,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=42346.666666666664, ans=0.125 2023-09-28 14:12:03,239 INFO [train.py:1039] (3/4) Epoch 2, batch 1050, loss[loss=0.3305, simple_loss=0.349, pruned_loss=0.156, over 22745.00 frames. ], tot_loss[loss=0.3234, simple_loss=0.3576, pruned_loss=0.1446, over 4668515.25 frames. ], batch size: 322, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:12:04,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:12:06,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:12:07,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-28 14:12:08,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:12:09,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:10,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=42413.333333333336, ans=0.0016492753623188403 2023-09-28 14:12:11,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:13,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:12:15,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:12:16,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:12:18,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:12:18,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:12:20,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:12:20,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 14:12:21,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:21,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 14:12:22,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=42480.0, ans=0.0016347826086956525 2023-09-28 14:12:24,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:12:24,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 14:12:24,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:12:30,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=42480.0, ans=0.125 2023-09-28 14:12:31,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:12:34,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:12:35,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:12:38,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 14:12:38,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 14:12:39,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:12:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 14:12:45,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 14:12:46,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:12:50,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:12:53,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:12:53,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:12:55,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:12:58,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:12:59,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=42613.333333333336, ans=15.0 2023-09-28 14:12:59,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.59 vs. limit=22.5 2023-09-28 14:13:03,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 14:13:04,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 14:13:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 14:13:06,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:06,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:13:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 14:13:12,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:13:15,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:13:15,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:17,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:17,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:19,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=42680.0, ans=0.1 2023-09-28 14:13:20,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:13:20,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 14:13:22,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:13:22,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 14:13:22,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 14:13:24,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:13:26,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=42746.666666666664, ans=0.2 2023-09-28 14:13:27,022 INFO [train.py:1039] (3/4) Epoch 2, batch 1100, loss[loss=0.3625, simple_loss=0.3561, pruned_loss=0.1845, over 19248.00 frames. ], tot_loss[loss=0.3221, simple_loss=0.3569, pruned_loss=0.1437, over 4680001.49 frames. ], batch size: 388, lr: 3.85e-02, grad_scale: 16.0 2023-09-28 14:13:29,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:13:30,547 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.833e+02 3.199e+02 3.709e+02 7.263e+02, threshold=6.397e+02, percent-clipped=1.0 2023-09-28 14:13:33,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:13:35,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=42746.666666666664, ans=0.05 2023-09-28 14:13:39,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:13:41,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:13:41,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:41,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 14:13:43,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:13:47,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:13:50,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:13:53,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:13:53,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 14:13:55,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:13:56,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:13:56,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:13:59,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:14:00,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:14:06,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:14:06,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=42880.0, ans=0.1 2023-09-28 14:14:08,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 14:14:09,863 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 14:14:09,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:11,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=42880.0, ans=0.125 2023-09-28 14:14:13,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:14,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:14:14,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:14:16,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 14:14:17,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:14:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:14:17,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:14:19,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:19,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 14:14:27,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:14:27,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 14:14:28,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:14:34,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:14:38,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 14:14:38,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:14:38,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:14:42,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:14:42,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:44,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 14:14:44,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:14:45,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:14:47,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 14:14:47,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:14:47,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 14:14:48,584 INFO [train.py:1039] (3/4) Epoch 2, batch 1150, loss[loss=0.3333, simple_loss=0.3787, pruned_loss=0.1439, over 23947.00 frames. ], tot_loss[loss=0.3223, simple_loss=0.3579, pruned_loss=0.1433, over 4698628.84 frames. ], batch size: 80, lr: 3.84e-02, grad_scale: 16.0 2023-09-28 14:14:48,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:14:48,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:14:50,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:14:52,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=43080.0, ans=0.125 2023-09-28 14:14:56,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:14:59,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:15:00,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:00,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:15:00,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 14:15:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:04,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 14:15:05,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:05,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:15:10,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 14:15:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:15,048 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.85 vs. limit=12.0 2023-09-28 14:15:17,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:15:17,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.27 vs. limit=22.5 2023-09-28 14:15:18,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:18,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 14:15:18,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:15:18,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:15:21,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 14:15:23,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:15:25,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:15:35,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:42,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:15:42,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=43280.0, ans=0.1 2023-09-28 14:15:43,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 14:15:45,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:45,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:15:53,317 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 14:15:54,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:01,836 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 14:16:06,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:08,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:16:08,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:16:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:16:11,794 INFO [train.py:1039] (3/4) Epoch 2, batch 1200, loss[loss=0.2792, simple_loss=0.3315, pruned_loss=0.1134, over 24636.00 frames. ], tot_loss[loss=0.3227, simple_loss=0.3592, pruned_loss=0.1431, over 4713802.35 frames. ], batch size: 65, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:16:13,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:14,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.83 vs. limit=15.0 2023-09-28 14:16:15,007 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.963e+02 2.991e+02 3.527e+02 4.351e+02 6.174e+02, threshold=7.053e+02, percent-clipped=0.0 2023-09-28 14:16:18,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:16:18,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:16:19,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:19,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:21,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:16:21,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:16:25,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:16:26,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:16:26,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:16:28,557 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 14:16:32,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 14:16:36,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:16:38,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:16:40,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:16:44,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:16:44,862 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 14:16:44,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:16:54,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:16:54,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:16:54,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 14:16:56,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:16:56,697 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:16:59,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 14:17:04,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 14:17:04,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:17:06,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:17:07,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:07,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:17:09,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:17:09,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:17:12,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:17:12,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 14:17:14,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:17:14,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:14,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:17:17,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:17,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:17:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:17:24,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:17:28,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 14:17:31,171 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 14:17:34,024 INFO [train.py:1039] (3/4) Epoch 2, batch 1250, loss[loss=0.4495, simple_loss=0.4304, pruned_loss=0.2343, over 19128.00 frames. ], tot_loss[loss=0.3232, simple_loss=0.3599, pruned_loss=0.1432, over 4718495.26 frames. ], batch size: 388, lr: 3.83e-02, grad_scale: 32.0 2023-09-28 14:17:34,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:17:35,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:17:37,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:17:39,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:17:42,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 14:17:47,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:17:48,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:48,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 14:17:49,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:17:51,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:17:54,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:17:55,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:17:56,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:17:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:17:58,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:18:02,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:18:02,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:18:02,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:04,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:18:05,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:09,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:10,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:18:18,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 14:18:19,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:18:21,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:21,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 14:18:23,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:18:23,340 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 14:18:23,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:23,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:26,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=43946.666666666664, ans=0.2 2023-09-28 14:18:29,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:18:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:18:35,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=43946.666666666664, ans=10.0 2023-09-28 14:18:36,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 14:18:36,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 14:18:36,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 14:18:39,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:18:41,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 14:18:42,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:18:43,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=44013.333333333336, ans=0.125 2023-09-28 14:18:44,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:18:44,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:18:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 14:18:46,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:18:46,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:18:46,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:18:46,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=44013.333333333336, ans=0.125 2023-09-28 14:18:47,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:18:50,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 14:18:53,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:18:55,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:18:56,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:18:56,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=44080.0, ans=0.125 2023-09-28 14:18:58,472 INFO [train.py:1039] (3/4) Epoch 2, batch 1300, loss[loss=0.3062, simple_loss=0.3617, pruned_loss=0.1253, over 24565.00 frames. ], tot_loss[loss=0.3255, simple_loss=0.3615, pruned_loss=0.1447, over 4710477.20 frames. ], batch size: 71, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:18:58,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:19:01,544 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.943e+02 3.508e+02 4.700e+02 1.321e+03, threshold=7.016e+02, percent-clipped=7.0 2023-09-28 14:19:01,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:19:01,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 14:19:07,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:08,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:19:08,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:10,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:19:11,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:19:13,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 14:19:18,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:19:18,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:19:21,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 14:19:22,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=44146.666666666664, ans=0.5 2023-09-28 14:19:22,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.83 vs. limit=10.0 2023-09-28 14:19:25,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:19:30,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:32,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:32,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:19:34,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:19:35,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:19:35,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 14:19:36,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=44213.333333333336, ans=0.1 2023-09-28 14:19:37,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 14:19:42,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:19:43,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:19:44,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 14:19:45,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:19:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:19:49,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:19:51,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 14:19:51,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:51,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 14:19:54,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:19:58,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:19:58,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:20:03,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 14:20:04,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 14:20:04,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 14:20:09,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.33 vs. limit=22.5 2023-09-28 14:20:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:20:13,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 14:20:15,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:17,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=44346.666666666664, ans=0.125 2023-09-28 14:20:20,914 INFO [train.py:1039] (3/4) Epoch 2, batch 1350, loss[loss=0.3667, simple_loss=0.4078, pruned_loss=0.1628, over 24453.00 frames. ], tot_loss[loss=0.3247, simple_loss=0.3611, pruned_loss=0.1441, over 4710370.16 frames. ], batch size: 66, lr: 3.82e-02, grad_scale: 32.0 2023-09-28 14:20:21,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 14:20:23,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=44413.333333333336, ans=0.125 2023-09-28 14:20:25,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.10 vs. limit=10.0 2023-09-28 14:20:25,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:27,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:31,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:20:31,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:20:33,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:20:34,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:38,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:20:41,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 14:20:43,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:20:43,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:20:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 14:20:48,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:20:48,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:20:48,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 14:20:51,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 14:20:53,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 14:20:54,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:20:54,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 14:20:55,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=44546.666666666664, ans=0.2 2023-09-28 14:21:07,364 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.34 vs. limit=6.0 2023-09-28 14:21:07,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:16,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:21:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:16,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 14:21:22,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:24,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 14:21:24,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:21:24,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:21:27,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:21:27,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=44680.0, ans=0.0 2023-09-28 14:21:30,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 14:21:31,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:21:38,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 14:21:40,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 14:21:43,239 INFO [train.py:1039] (3/4) Epoch 2, batch 1400, loss[loss=0.297, simple_loss=0.3218, pruned_loss=0.1361, over 23460.00 frames. ], tot_loss[loss=0.3226, simple_loss=0.3587, pruned_loss=0.1433, over 4707953.12 frames. ], batch size: 285, lr: 3.81e-02, grad_scale: 32.0 2023-09-28 14:21:45,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 14:21:46,399 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.080e+02 2.861e+02 3.179e+02 3.709e+02 7.568e+02, threshold=6.358e+02, percent-clipped=1.0 2023-09-28 14:21:46,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:21:46,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=44746.666666666664, ans=0.1 2023-09-28 14:21:51,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:21:51,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:21:52,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=44746.666666666664, ans=0.125 2023-09-28 14:21:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 14:21:58,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 14:22:00,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=44813.333333333336, ans=0.125 2023-09-28 14:22:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:22:11,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:13,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:22:13,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:22:18,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:22:18,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 14:22:23,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=44880.0, ans=0.125 2023-09-28 14:22:29,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:29,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:34,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 14:22:36,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:22:37,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:22:37,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:22:39,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:22:40,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:22:40,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:22:40,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:22:42,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=44946.666666666664, ans=0.125 2023-09-28 14:22:43,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 14:22:43,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:22:47,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:22:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:23:00,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 14:23:02,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 14:23:02,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:23:04,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 14:23:05,540 INFO [train.py:1039] (3/4) Epoch 2, batch 1450, loss[loss=0.3552, simple_loss=0.3735, pruned_loss=0.1685, over 23870.00 frames. ], tot_loss[loss=0.3212, simple_loss=0.3576, pruned_loss=0.1424, over 4707321.92 frames. ], batch size: 195, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:23:05,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:05,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:23:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:23:10,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:23:10,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:10,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 14:23:17,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:18,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:23:20,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:23:20,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 14:23:21,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:23:23,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 14:23:24,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:24,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=45146.666666666664, ans=0.2 2023-09-28 14:23:25,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:25,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 14:23:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:28,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:23:28,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 14:23:28,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:30,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:23:30,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=45146.666666666664, ans=0.125 2023-09-28 14:23:31,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:35,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:37,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:23:37,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:23:41,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:23:41,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:43,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=45213.333333333336, ans=0.125 2023-09-28 14:23:44,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:23:44,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:23:44,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:23:45,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:23:48,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 14:23:49,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=45213.333333333336, ans=0.0 2023-09-28 14:23:52,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:23:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 14:23:57,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:23:58,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-09-28 14:23:59,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:24:00,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:02,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 14:24:04,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:04,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=45280.0, ans=0.0 2023-09-28 14:24:06,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 14:24:07,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 14:24:09,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:13,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:13,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:24:14,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 14:24:16,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=45346.666666666664, ans=10.0 2023-09-28 14:24:17,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 14:24:17,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 14:24:18,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=45346.666666666664, ans=0.1 2023-09-28 14:24:19,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:20,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:24:25,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=45346.666666666664, ans=0.2 2023-09-28 14:24:30,456 INFO [train.py:1039] (3/4) Epoch 2, batch 1500, loss[loss=0.3307, simple_loss=0.3756, pruned_loss=0.1429, over 23419.00 frames. ], tot_loss[loss=0.3222, simple_loss=0.3585, pruned_loss=0.143, over 4710576.58 frames. ], batch size: 93, lr: 3.80e-02, grad_scale: 32.0 2023-09-28 14:24:30,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 14:24:30,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:24:30,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:24:33,281 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.694e+02 3.191e+02 3.911e+02 7.189e+02, threshold=6.382e+02, percent-clipped=1.0 2023-09-28 14:24:33,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:24:33,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:33,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=45413.333333333336, ans=0.0 2023-09-28 14:24:35,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:24:36,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.28 vs. limit=15.0 2023-09-28 14:24:37,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 14:24:37,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=45413.333333333336, ans=0.000997101449275362 2023-09-28 14:24:38,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:24:40,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:24:40,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:24:40,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:24:41,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:24:43,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:24:49,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 14:24:50,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:24:50,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=45480.0, ans=0.125 2023-09-28 14:24:50,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=45480.0, ans=0.1 2023-09-28 14:24:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:24:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:24:56,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 14:25:01,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=45546.666666666664, ans=0.2 2023-09-28 14:25:02,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 14:25:04,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:25:06,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 14:25:07,518 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.29 vs. limit=22.5 2023-09-28 14:25:08,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:25:11,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:13,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:25:13,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:25:15,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 14:25:16,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:25:16,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:17,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 14:25:17,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:25:22,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:25:22,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 14:25:22,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=45613.333333333336, ans=0.07 2023-09-28 14:25:30,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:25:31,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:25:35,171 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 14:25:35,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:35,279 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 14:25:36,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=45680.0, ans=15.0 2023-09-28 14:25:37,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=45680.0, ans=0.0 2023-09-28 14:25:37,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=45680.0, ans=0.000939130434782609 2023-09-28 14:25:38,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:25:40,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:25:40,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=45680.0, ans=0.125 2023-09-28 14:25:41,997 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 14:25:42,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:25:42,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=45680.0, ans=0.125 2023-09-28 14:25:45,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 14:25:46,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:50,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:51,694 INFO [train.py:1039] (3/4) Epoch 2, batch 1550, loss[loss=0.3154, simple_loss=0.3569, pruned_loss=0.137, over 23361.00 frames. ], tot_loss[loss=0.3213, simple_loss=0.3586, pruned_loss=0.1419, over 4718845.20 frames. ], batch size: 93, lr: 3.79e-02, grad_scale: 32.0 2023-09-28 14:25:51,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:51,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:25:53,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:25:53,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:25:54,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 14:25:56,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 14:25:56,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:25:57,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 14:25:59,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 14:26:01,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:04,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:26:04,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=45746.666666666664, ans=0.2 2023-09-28 14:26:06,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:06,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:26:08,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=45813.333333333336, ans=0.125 2023-09-28 14:26:09,484 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 14:26:09,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:10,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:26:10,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:26:11,841 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.69 vs. limit=6.0 2023-09-28 14:26:14,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:26:14,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 14:26:16,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:26:18,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 14:26:18,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 14:26:18,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 14:26:18,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:18,742 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:26:19,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:24,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:26:27,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 14:26:27,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 14:26:28,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=45880.0, ans=0.125 2023-09-28 14:26:33,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=2.39 vs. limit=15.0 2023-09-28 14:26:35,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:36,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=45880.0, ans=0.0 2023-09-28 14:26:38,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:26:38,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:26:38,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:26:38,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 14:26:45,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:26:45,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:46,576 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.21 vs. limit=10.0 2023-09-28 14:26:49,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:26:51,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:26:53,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:26:53,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 14:26:54,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:26:56,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:26:56,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:26:58,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:26:58,524 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 14:27:00,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:00,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff3.min_abs, batch_count=46013.333333333336, ans=0.2 2023-09-28 14:27:05,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=46013.333333333336, ans=0.05 2023-09-28 14:27:06,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 14:27:09,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=46013.333333333336, ans=0.1 2023-09-28 14:27:12,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:14,218 INFO [train.py:1039] (3/4) Epoch 2, batch 1600, loss[loss=0.4246, simple_loss=0.413, pruned_loss=0.2181, over 19232.00 frames. ], tot_loss[loss=0.3221, simple_loss=0.3596, pruned_loss=0.1423, over 4728633.20 frames. ], batch size: 388, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:27:15,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:27:15,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 14:27:17,243 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.915e+02 2.937e+02 3.574e+02 4.472e+02 6.126e+02, threshold=7.147e+02, percent-clipped=0.0 2023-09-28 14:27:17,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:27:18,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:27:18,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:27:19,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:27:19,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:27:22,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:24,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 14:27:26,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 14:27:27,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 14:27:29,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:27:29,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 14:27:31,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:27:31,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=46146.666666666664, ans=0.0 2023-09-28 14:27:34,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:27:34,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=46146.666666666664, ans=0.2 2023-09-28 14:27:38,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:27:41,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=46146.666666666664, ans=0.125 2023-09-28 14:27:42,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 14:27:44,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=46146.666666666664, ans=0.125 2023-09-28 14:27:45,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:27:45,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 14:27:47,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:27:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 14:27:55,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 14:27:55,901 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:28:02,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:04,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=46280.0, ans=0.2 2023-09-28 14:28:05,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 14:28:07,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:28:07,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:07,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:28:10,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 14:28:15,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 14:28:17,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:28:18,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:18,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:28:20,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:28:21,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:28:23,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:28:30,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:32,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:28:33,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 14:28:33,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:28:36,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 14:28:37,320 INFO [train.py:1039] (3/4) Epoch 2, batch 1650, loss[loss=0.3175, simple_loss=0.3441, pruned_loss=0.1455, over 23790.00 frames. ], tot_loss[loss=0.3228, simple_loss=0.3604, pruned_loss=0.1426, over 4717405.97 frames. ], batch size: 164, lr: 3.78e-02, grad_scale: 32.0 2023-09-28 14:28:40,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:42,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:28:43,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:28:43,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 14:28:43,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 14:28:43,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 14:28:45,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 14:28:47,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:28:47,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:49,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:28:49,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:28:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:28:54,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 14:28:54,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=46480.0, ans=0.0 2023-09-28 14:28:57,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:28:57,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:28:57,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:28:57,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:28:57,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 14:28:57,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 14:29:04,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:29:07,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=46480.0, ans=0.125 2023-09-28 14:29:08,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:29:16,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 14:29:17,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:18,309 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.91 vs. limit=15.0 2023-09-28 14:29:21,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 14:29:23,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=46546.666666666664, ans=0.0 2023-09-28 14:29:24,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:24,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=46546.666666666664, ans=0.125 2023-09-28 14:29:26,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:29:26,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:29:27,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:29,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:29:29,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:29:32,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:32,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:33,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:34,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:36,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:29:38,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:29:38,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 14:29:41,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:29:41,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 14:29:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 14:29:43,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 14:29:43,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:29:44,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:29:44,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:46,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:29:46,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 14:29:51,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:29:54,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:29:54,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:29:56,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 14:30:00,472 INFO [train.py:1039] (3/4) Epoch 2, batch 1700, loss[loss=0.3514, simple_loss=0.3749, pruned_loss=0.164, over 23811.00 frames. ], tot_loss[loss=0.3221, simple_loss=0.3591, pruned_loss=0.1426, over 4711149.48 frames. ], batch size: 195, lr: 3.77e-02, grad_scale: 16.0 2023-09-28 14:30:00,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:30:00,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:30:00,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 14:30:00,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=46746.666666666664, ans=0.05 2023-09-28 14:30:02,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:02,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:30:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:02,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.19 vs. limit=15.0 2023-09-28 14:30:05,061 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.163e+02 2.907e+02 3.272e+02 3.830e+02 6.451e+02, threshold=6.545e+02, percent-clipped=0.0 2023-09-28 14:30:05,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:30:05,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:30:05,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 14:30:10,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:30:18,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:30:21,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:30:28,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:30:28,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:28,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:30:30,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:30:31,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 14:30:35,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:30:35,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:36,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:30:38,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:30:41,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 14:30:41,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 14:30:43,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:30:43,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 14:30:45,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:30:54,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:30:54,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:30:54,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=46946.666666666664, ans=0.2 2023-09-28 14:30:55,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:30:57,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:30:57,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 14:30:57,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:31:00,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:00,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 14:31:00,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:00,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:01,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:02,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:05,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:05,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:31:05,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:07,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:31:07,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:10,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:11,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 14:31:15,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:31:17,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:31:18,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 14:31:24,029 INFO [train.py:1039] (3/4) Epoch 2, batch 1750, loss[loss=0.3113, simple_loss=0.3616, pruned_loss=0.1305, over 24402.00 frames. ], tot_loss[loss=0.319, simple_loss=0.3566, pruned_loss=0.1406, over 4706836.12 frames. ], batch size: 77, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:31:27,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:27,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=47080.0, ans=0.125 2023-09-28 14:31:30,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:30,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:31:31,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 14:31:31,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:31:35,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:31:35,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:31:40,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 14:31:43,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:31:43,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=47146.666666666664, ans=0.0 2023-09-28 14:31:44,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 14:31:46,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:31:47,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:31:49,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:31:50,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 14:31:53,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:31:53,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 14:31:53,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=47146.666666666664, ans=0.0006202898550724638 2023-09-28 14:32:01,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:32:05,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:05,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:08,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:08,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:32:11,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:32:12,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:14,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:16,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:32:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 14:32:20,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:32:22,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 14:32:24,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:27,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:32:30,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=47346.666666666664, ans=0.125 2023-09-28 14:32:33,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:32:33,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 14:32:34,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:36,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:32:38,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.64 vs. limit=6.0 2023-09-28 14:32:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:32:44,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:32:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:32:44,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 14:32:44,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:32:44,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=47413.333333333336, ans=0.0 2023-09-28 14:32:45,830 INFO [train.py:1039] (3/4) Epoch 2, batch 1800, loss[loss=0.2836, simple_loss=0.3306, pruned_loss=0.1183, over 24582.00 frames. ], tot_loss[loss=0.3179, simple_loss=0.3561, pruned_loss=0.1398, over 4714251.98 frames. ], batch size: 60, lr: 3.76e-02, grad_scale: 16.0 2023-09-28 14:32:47,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:32:47,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:32:47,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:32:47,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:32:49,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:32:50,426 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.289e+02 2.760e+02 3.253e+02 3.974e+02 7.457e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 14:32:52,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:32:52,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:32:53,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:32:56,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:00,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:33:02,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:33:05,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:09,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:09,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:11,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:33:12,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:33:12,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 14:33:14,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=47480.0, ans=0.2 2023-09-28 14:33:16,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:19,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:20,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.18 vs. limit=15.0 2023-09-28 14:33:22,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 14:33:25,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 14:33:25,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 14:33:26,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:26,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=47546.666666666664, ans=0.0005333333333333336 2023-09-28 14:33:27,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:33:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:33:27,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:33:34,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=47613.333333333336, ans=0.0 2023-09-28 14:33:35,894 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 14:33:37,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:33:39,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:33:41,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 14:33:41,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 14:33:41,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=47613.333333333336, ans=0.000518840579710144 2023-09-28 14:33:43,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:33:44,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:33:46,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:33:49,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 14:33:52,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-09-28 14:33:56,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:33:58,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 14:33:59,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:33:59,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:33:59,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:34:01,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 14:34:03,614 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.68 vs. limit=15.0 2023-09-28 14:34:04,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:34:04,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:07,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 14:34:07,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:07,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=47746.666666666664, ans=0.125 2023-09-28 14:34:09,034 INFO [train.py:1039] (3/4) Epoch 2, batch 1850, loss[loss=0.3159, simple_loss=0.341, pruned_loss=0.1454, over 23800.00 frames. ], tot_loss[loss=0.3182, simple_loss=0.3562, pruned_loss=0.1401, over 4709661.32 frames. ], batch size: 179, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:34:09,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten.whitening_limit, batch_count=47746.666666666664, ans=15.0 2023-09-28 14:34:10,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:10,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:34:10,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:12,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:34:13,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:34:16,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:34:16,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:34:17,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.40 vs. limit=22.5 2023-09-28 14:34:19,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:34:21,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:34:29,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:34:29,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 14:34:31,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=47813.333333333336, ans=0.035 2023-09-28 14:34:32,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 14:34:34,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 14:34:37,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:34:37,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 14:34:37,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 14:34:47,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:34:49,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 14:34:54,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:34:54,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:34:58,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 14:34:59,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:34:59,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:34:59,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:35:01,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=47946.666666666664, ans=0.125 2023-09-28 14:35:01,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=47946.666666666664, ans=10.0 2023-09-28 14:35:02,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:35:02,979 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:35:04,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=47946.666666666664, ans=0.125 2023-09-28 14:35:05,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:07,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:35:08,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.99 vs. limit=6.0 2023-09-28 14:35:08,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:08,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:35:08,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:10,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:13,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:35:16,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 14:35:18,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:35:22,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:35:22,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:35:22,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 14:35:22,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 14:35:25,340 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 14:35:25,493 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 14:35:29,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:35:29,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:35:29,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:29,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:31,282 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 14:35:31,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:35:31,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:32,690 INFO [train.py:1039] (3/4) Epoch 2, batch 1900, loss[loss=0.3479, simple_loss=0.3731, pruned_loss=0.1614, over 23677.00 frames. ], tot_loss[loss=0.3179, simple_loss=0.3561, pruned_loss=0.1398, over 4705472.15 frames. ], batch size: 256, lr: 3.75e-02, grad_scale: 16.0 2023-09-28 14:35:32,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:35:32,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:35:34,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:35:34,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 14:35:37,594 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.980e+02 2.945e+02 3.315e+02 3.866e+02 6.379e+02, threshold=6.630e+02, percent-clipped=0.0 2023-09-28 14:35:37,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:35:37,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 14:35:37,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:35:39,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:45,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:35:45,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=48080.0, ans=0.0 2023-09-28 14:35:48,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:35:50,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 14:35:50,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 14:35:51,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:35:53,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:35:53,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 14:35:53,311 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 14:35:58,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 14:36:00,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:36:04,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 14:36:06,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 14:36:17,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 14:36:20,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 14:36:20,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:36:20,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 14:36:20,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 14:36:22,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 14:36:22,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 14:36:22,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:36:25,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=48280.0, ans=0.09899494936611666 2023-09-28 14:36:27,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 14:36:30,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:36:35,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:35,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 14:36:37,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:36:42,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 14:36:43,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:44,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=48346.666666666664, ans=0.0 2023-09-28 14:36:47,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:36:47,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:36:47,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:36:49,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:36:51,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 14:36:51,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:36:52,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:36:55,508 INFO [train.py:1039] (3/4) Epoch 2, batch 1950, loss[loss=0.3405, simple_loss=0.3614, pruned_loss=0.1598, over 23705.00 frames. ], tot_loss[loss=0.3166, simple_loss=0.3557, pruned_loss=0.1388, over 4722395.31 frames. ], batch size: 164, lr: 3.74e-02, grad_scale: 16.0 2023-09-28 14:36:55,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:36:55,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:36:57,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:36:57,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:36:58,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:36:58,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:37:01,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.99 vs. limit=15.0 2023-09-28 14:37:03,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:05,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:37:07,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:07,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:37:09,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 14:37:09,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:37:10,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:11,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=48480.0, ans=0.09899494936611666 2023-09-28 14:37:12,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:14,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:37:14,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:16,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:17,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:21,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:37:21,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:37:21,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:37:21,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:25,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:28,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=48546.666666666664, ans=0.0 2023-09-28 14:37:29,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:37:29,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:37:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:37:30,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 14:37:31,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:37:31,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:37:31,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:37:31,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=48546.666666666664, ans=0.0 2023-09-28 14:37:34,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:37:37,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:37:43,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:37:44,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:37:46,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:37:46,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 14:37:46,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:37:52,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:37:53,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:37:55,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:38:03,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:04,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:07,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:09,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:12,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:38:13,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:38:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 14:38:15,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:38:15,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:38:17,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 14:38:19,178 INFO [train.py:1039] (3/4) Epoch 2, batch 2000, loss[loss=0.3492, simple_loss=0.3692, pruned_loss=0.1646, over 23225.00 frames. ], tot_loss[loss=0.319, simple_loss=0.3575, pruned_loss=0.1402, over 4708314.95 frames. ], batch size: 119, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:38:19,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:22,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:38:23,957 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.784e+02 3.169e+02 3.809e+02 6.996e+02, threshold=6.339e+02, percent-clipped=1.0 2023-09-28 14:38:24,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:38:24,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:38:26,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:38:29,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:38:32,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 14:38:33,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:38:35,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:38:37,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 14:38:38,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 14:38:38,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:38:40,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:38:43,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 14:38:44,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:48,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:50,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 14:38:50,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:38:53,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 14:38:53,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:38:57,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:38:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 14:38:58,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:38:58,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:02,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 14:39:03,244 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:39:04,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 14:39:06,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:39:06,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:07,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=48946.666666666664, ans=0.125 2023-09-28 14:39:09,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.40 vs. limit=6.0 2023-09-28 14:39:10,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.70 vs. limit=15.0 2023-09-28 14:39:10,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:12,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:39:12,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:12,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:39:12,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=48946.666666666664, ans=0.1 2023-09-28 14:39:13,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:15,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:15,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:39:15,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:17,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:20,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:39:20,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 14:39:25,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:39:27,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:31,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:39:36,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:37,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:37,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:39,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:39:40,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:39:42,368 INFO [train.py:1039] (3/4) Epoch 2, batch 2050, loss[loss=0.3295, simple_loss=0.3496, pruned_loss=0.1547, over 23879.00 frames. ], tot_loss[loss=0.3199, simple_loss=0.3572, pruned_loss=0.1413, over 4704568.58 frames. ], batch size: 195, lr: 3.73e-02, grad_scale: 32.0 2023-09-28 14:39:42,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:39:44,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:44,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=49080.0, ans=0.125 2023-09-28 14:39:45,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=49080.0, ans=0.00020000000000000052 2023-09-28 14:39:47,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:39:48,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:52,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:39:55,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:39:56,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:39:56,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:39:59,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 14:39:59,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:39:59,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:39:59,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:40:02,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=49146.666666666664, ans=0.00018550724637681273 2023-09-28 14:40:12,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:12,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:14,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 14:40:14,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=49213.333333333336, ans=0.0 2023-09-28 14:40:15,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:40:17,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 14:40:17,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:40:20,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:23,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:23,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=49213.333333333336, ans=0.025 2023-09-28 14:40:25,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:40:25,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:40:26,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.45 vs. limit=15.0 2023-09-28 14:40:26,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:40:29,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:40:29,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:40:33,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:40:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:40:38,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:40:39,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:40:45,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:40:50,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:40:50,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 14:40:56,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:40:57,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:41:00,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:41:01,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 14:41:05,398 INFO [train.py:1039] (3/4) Epoch 2, batch 2100, loss[loss=0.2874, simple_loss=0.344, pruned_loss=0.1154, over 24010.00 frames. ], tot_loss[loss=0.3175, simple_loss=0.3555, pruned_loss=0.1398, over 4714780.59 frames. ], batch size: 80, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:41:06,964 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 14:41:06,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:07,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:08,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:09,914 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.967e+02 2.956e+02 3.430e+02 4.185e+02 6.974e+02, threshold=6.859e+02, percent-clipped=1.0 2023-09-28 14:41:10,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:41:10,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 14:41:10,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 14:41:12,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:41:16,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:41:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:41:20,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:20,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:41:20,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 14:41:22,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:41:23,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 14:41:23,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 14:41:25,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:26,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:41:26,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 14:41:26,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 14:41:28,994 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:41:32,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 14:41:32,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:41:34,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:41:36,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:41:39,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:41:39,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 14:41:39,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:39,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 14:41:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 14:41:44,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:44,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 14:41:44,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 14:41:44,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 14:41:46,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:41:49,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:41:52,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 14:41:56,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:56,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:56,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 14:41:58,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:41:58,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:41:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:41:58,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 14:41:59,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 14:42:01,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 14:42:05,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:42:10,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:42:10,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 14:42:17,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:18,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:42:18,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:18,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:42:20,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 14:42:20,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:42:22,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:42:22,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:42:24,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:42:24,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:26,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 14:42:27,846 INFO [train.py:1039] (3/4) Epoch 2, batch 2150, loss[loss=0.2975, simple_loss=0.3435, pruned_loss=0.1257, over 24605.00 frames. ], tot_loss[loss=0.3152, simple_loss=0.3535, pruned_loss=0.1384, over 4707086.18 frames. ], batch size: 60, lr: 3.72e-02, grad_scale: 32.0 2023-09-28 14:42:28,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 14:42:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:30,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:42:30,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:42:30,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:42:31,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:42:37,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 14:42:38,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:42:38,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:40,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:42:40,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:40,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:42:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:42:45,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:42:45,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:42:50,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:50,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 14:42:54,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:42:57,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:42:58,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:42:58,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:42:58,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:42:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:00,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:43:00,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:43:00,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 14:43:01,186 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.63 vs. limit=15.0 2023-09-28 14:43:02,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:43:02,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.10 vs. limit=15.0 2023-09-28 14:43:03,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:03,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:05,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:43:06,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:43:06,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=49880.0, ans=0.2 2023-09-28 14:43:09,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:09,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:43:11,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:43:11,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 14:43:11,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 14:43:13,900 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.94 vs. limit=10.0 2023-09-28 14:43:14,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:15,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:17,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:43:19,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:43:19,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:19,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=49946.666666666664, ans=10.0 2023-09-28 14:43:19,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=49946.666666666664, ans=1.1594202898552314e-05 2023-09-28 14:43:21,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:21,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 14:43:22,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 14:43:24,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:43:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 14:43:24,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.49 vs. limit=15.0 2023-09-28 14:43:25,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:28,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:43:28,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 14:43:28,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:43:28,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 14:43:29,681 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 14:43:29,681 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 14:43:31,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 14:43:31,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:32,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:43:32,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:43:34,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:34,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 14:43:37,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:43:37,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:43:46,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 14:43:47,929 INFO [train.py:1039] (3/4) Epoch 2, batch 2200, loss[loss=0.2893, simple_loss=0.3395, pruned_loss=0.1195, over 24637.00 frames. ], tot_loss[loss=0.3155, simple_loss=0.354, pruned_loss=0.1385, over 4714737.73 frames. ], batch size: 65, lr: 3.71e-02, grad_scale: 32.0 2023-09-28 14:43:51,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:43:52,563 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.160e+02 2.946e+02 3.281e+02 3.928e+02 6.005e+02, threshold=6.562e+02, percent-clipped=0.0 2023-09-28 14:43:56,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:43:56,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:43:56,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:43:56,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=50080.0, ans=0.125 2023-09-28 14:43:59,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 14:44:02,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:02,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:44:03,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 14:44:08,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 14:44:08,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=50146.666666666664, ans=0.05 2023-09-28 14:44:09,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:44:16,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 14:44:19,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:19,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:19,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:44:22,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:44:22,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 14:44:27,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:44:29,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:44:29,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 14:44:34,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:44:35,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:38,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:44:40,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:41,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 14:44:42,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.73 vs. limit=6.0 2023-09-28 14:44:43,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:44,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 14:44:46,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 14:44:46,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:44:46,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=50280.0, ans=0.125 2023-09-28 14:44:48,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:44:49,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:44:49,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:49,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:44:51,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:44:51,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:44:52,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:44:56,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 14:44:56,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:44:59,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:45:01,394 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 14:45:03,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:45:04,546 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 14:45:04,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 14:45:06,935 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 14:45:08,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:08,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:45:10,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:45:11,955 INFO [train.py:1039] (3/4) Epoch 2, batch 2250, loss[loss=0.3297, simple_loss=0.3816, pruned_loss=0.1389, over 24363.00 frames. ], tot_loss[loss=0.3174, simple_loss=0.3556, pruned_loss=0.1396, over 4706338.40 frames. ], batch size: 74, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:45:12,204 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 14:45:12,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-28 14:45:13,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:45:14,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=50413.333333333336, ans=0.0 2023-09-28 14:45:15,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:20,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=50413.333333333336, ans=0.125 2023-09-28 14:45:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:45:23,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:45:26,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:27,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:29,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 14:45:29,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=50480.0, ans=0.5 2023-09-28 14:45:32,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 14:45:32,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:45:32,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:45:35,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 14:45:35,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:45:37,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:38,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 14:45:39,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=50480.0, ans=0.0 2023-09-28 14:45:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:45:45,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 14:45:45,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:45:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 14:45:49,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:45:50,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:45:56,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=50546.666666666664, ans=15.0 2023-09-28 14:45:57,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:45:57,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=50546.666666666664, ans=0.125 2023-09-28 14:45:58,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:46:00,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:00,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:46:01,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:46:03,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:46:10,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:46:11,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 14:46:13,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=50613.333333333336, ans=0.125 2023-09-28 14:46:13,880 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:46:16,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=50680.0, ans=0.125 2023-09-28 14:46:18,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:46:18,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:46:20,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:46:24,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:46:28,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=16.25 vs. limit=15.0 2023-09-28 14:46:29,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:46:29,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 14:46:29,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=50680.0, ans=0.1 2023-09-28 14:46:30,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:30,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:46:30,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=50680.0, ans=0.04949747468305833 2023-09-28 14:46:32,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 14:46:33,545 INFO [train.py:1039] (3/4) Epoch 2, batch 2300, loss[loss=0.293, simple_loss=0.3309, pruned_loss=0.1275, over 23308.00 frames. ], tot_loss[loss=0.3162, simple_loss=0.355, pruned_loss=0.1388, over 4711142.13 frames. ], batch size: 119, lr: 3.70e-02, grad_scale: 32.0 2023-09-28 14:46:34,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.85 vs. limit=22.5 2023-09-28 14:46:35,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:46:35,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:38,466 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.122e+02 3.000e+02 3.557e+02 4.160e+02 8.082e+02, threshold=7.115e+02, percent-clipped=2.0 2023-09-28 14:46:41,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:46:41,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:46:45,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 14:46:46,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:47,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.41 vs. limit=15.0 2023-09-28 14:46:48,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=50813.333333333336, ans=0.1 2023-09-28 14:46:53,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:46:53,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:46:54,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:46:54,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:46:54,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 14:46:57,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:47:00,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:00,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:47:03,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:47:06,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:47:10,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:16,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:47:17,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:47:20,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:47:22,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:47:27,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:47:27,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:47:27,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:47:27,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 14:47:30,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=50946.666666666664, ans=0.125 2023-09-28 14:47:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:47:33,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:47:33,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:47:33,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:34,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 14:47:34,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 14:47:34,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=50946.666666666664, ans=0.0 2023-09-28 14:47:36,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 14:47:36,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:47:36,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:47:37,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 14:47:44,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:47:47,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:47:51,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.88 vs. limit=22.5 2023-09-28 14:47:52,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:47:52,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:47:52,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 14:47:52,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:47:54,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:47:55,606 INFO [train.py:1039] (3/4) Epoch 2, batch 2350, loss[loss=0.3263, simple_loss=0.3712, pruned_loss=0.1407, over 23989.00 frames. ], tot_loss[loss=0.3164, simple_loss=0.3556, pruned_loss=0.1386, over 4723813.74 frames. ], batch size: 86, lr: 3.69e-02, grad_scale: 32.0 2023-09-28 14:47:55,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:47:55,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 14:48:01,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 14:48:08,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 14:48:11,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:48:14,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:14,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:15,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:16,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 14:48:19,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:48:19,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=51146.666666666664, ans=0.125 2023-09-28 14:48:24,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 14:48:26,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=51146.666666666664, ans=0.125 2023-09-28 14:48:27,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:48:29,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:48:29,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:48:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:48:32,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 14:48:34,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:48:36,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:48:36,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:48:37,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:48:42,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:48:44,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 14:48:44,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:48:47,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:48:47,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:48:49,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 14:48:50,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:48:55,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 14:48:55,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:49:00,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 14:49:05,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 14:49:05,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:49:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 14:49:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 14:49:07,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 14:49:08,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 14:49:12,249 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.97 vs. limit=10.0 2023-09-28 14:49:13,368 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.28 vs. limit=10.0 2023-09-28 14:49:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:49:18,681 INFO [train.py:1039] (3/4) Epoch 2, batch 2400, loss[loss=0.2829, simple_loss=0.3408, pruned_loss=0.1125, over 24461.00 frames. ], tot_loss[loss=0.3162, simple_loss=0.3552, pruned_loss=0.1386, over 4716146.19 frames. ], batch size: 63, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:49:18,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:49:23,878 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.787e+02 3.340e+02 4.281e+02 7.222e+02, threshold=6.680e+02, percent-clipped=0.0 2023-09-28 14:49:24,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:49:24,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:49:25,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 14:49:25,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 14:49:33,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:49:33,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:49:36,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 14:49:36,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:49:37,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:37,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 14:49:41,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=51480.0, ans=0.025 2023-09-28 14:49:41,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=51480.0, ans=0.1 2023-09-28 14:49:43,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=51480.0, ans=0.0 2023-09-28 14:49:45,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:49:46,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 14:49:51,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:49:57,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 14:50:00,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:00,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=51546.666666666664, ans=0.125 2023-09-28 14:50:03,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:06,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:07,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 14:50:08,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 14:50:09,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=51613.333333333336, ans=0.0 2023-09-28 14:50:17,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:18,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:19,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=51613.333333333336, ans=0.125 2023-09-28 14:50:22,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:24,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:50:24,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 14:50:24,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:50:24,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 14:50:24,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=51680.0, ans=0.1 2023-09-28 14:50:30,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:50:30,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 14:50:32,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 14:50:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 14:50:35,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:50:35,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:50:35,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 14:50:35,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 14:50:37,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 14:50:37,025 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 14:50:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 14:50:37,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:50:38,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:38,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:39,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=51680.0, ans=0.125 2023-09-28 14:50:40,488 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 14:50:41,855 INFO [train.py:1039] (3/4) Epoch 2, batch 2450, loss[loss=0.2618, simple_loss=0.3158, pruned_loss=0.1039, over 24316.00 frames. ], tot_loss[loss=0.3143, simple_loss=0.3537, pruned_loss=0.1374, over 4727620.03 frames. ], batch size: 61, lr: 3.68e-02, grad_scale: 32.0 2023-09-28 14:50:42,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:50:44,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:50:47,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:50:47,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:50:52,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:50:52,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:50:53,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 14:50:58,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=51813.333333333336, ans=0.125 2023-09-28 14:50:59,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:50:59,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:02,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:51:02,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:51:02,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:51:02,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 14:51:07,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:09,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:51:09,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:51:10,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 14:51:12,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:12,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=51813.333333333336, ans=0.1 2023-09-28 14:51:14,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:14,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:51:17,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 14:51:17,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:51:29,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:30,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:51:30,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:32,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 14:51:32,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:51:34,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:51:35,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 14:51:37,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 14:51:37,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:51:42,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:51:42,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:51:49,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 14:51:49,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 14:51:49,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:51:51,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:51:51,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 14:51:51,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:51:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:51:57,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:52:01,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:01,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:52:04,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 14:52:05,507 INFO [train.py:1039] (3/4) Epoch 2, batch 2500, loss[loss=0.3177, simple_loss=0.3553, pruned_loss=0.1401, over 23633.00 frames. ], tot_loss[loss=0.3129, simple_loss=0.3526, pruned_loss=0.1365, over 4732795.40 frames. ], batch size: 149, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:52:06,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 14:52:10,766 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.999e+02 2.754e+02 3.242e+02 3.766e+02 6.714e+02, threshold=6.484e+02, percent-clipped=2.0 2023-09-28 14:52:12,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:21,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=52146.666666666664, ans=0.125 2023-09-28 14:52:22,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 14:52:22,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:52:25,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:52:25,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 14:52:32,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 14:52:33,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:52:34,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=52146.666666666664, ans=0.125 2023-09-28 14:52:35,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 14:52:35,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 14:52:36,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 14:52:38,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:38,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:40,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 14:52:40,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:42,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 14:52:42,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:52:45,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:52:47,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:52:50,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 14:52:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 14:52:50,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:52:52,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:52:57,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:01,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:05,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:10,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 14:53:12,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 14:53:12,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:12,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:16,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 14:53:16,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 14:53:17,923 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 14:53:17,923 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 14:53:17,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 14:53:19,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:22,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 14:53:22,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 14:53:22,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:53:24,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 14:53:28,140 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:53:29,223 INFO [train.py:1039] (3/4) Epoch 2, batch 2550, loss[loss=0.3236, simple_loss=0.3703, pruned_loss=0.1384, over 24641.00 frames. ], tot_loss[loss=0.3133, simple_loss=0.3533, pruned_loss=0.1367, over 4735327.89 frames. ], batch size: 73, lr: 3.67e-02, grad_scale: 32.0 2023-09-28 14:53:29,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 14:53:32,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:32,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=52413.333333333336, ans=0.125 2023-09-28 14:53:33,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:53:36,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:53:36,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:53:36,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=52413.333333333336, ans=0.125 2023-09-28 14:53:37,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 14:53:37,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:53:41,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 14:53:42,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:53:46,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:49,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:53:49,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 14:53:50,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:53:50,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:53:50,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:53:53,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:53:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 14:53:54,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 14:53:54,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:53:54,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 14:53:54,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=52480.0, ans=0.125 2023-09-28 14:54:00,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=52480.0, ans=0.0 2023-09-28 14:54:03,380 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 14:54:06,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 14:54:07,079 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.85 vs. limit=15.0 2023-09-28 14:54:11,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:11,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:11,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:54:13,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 14:54:21,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:54:24,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 14:54:24,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:54:24,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 14:54:26,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 14:54:26,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 14:54:31,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:54:31,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:35,658 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=24.11 vs. limit=22.5 2023-09-28 14:54:37,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:54:39,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 14:54:39,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:54:39,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:54:40,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 14:54:43,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 14:54:44,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:52,179 INFO [train.py:1039] (3/4) Epoch 2, batch 2600, loss[loss=0.3288, simple_loss=0.3689, pruned_loss=0.1444, over 23397.00 frames. ], tot_loss[loss=0.313, simple_loss=0.3536, pruned_loss=0.1362, over 4732540.62 frames. ], batch size: 105, lr: 3.66e-02, grad_scale: 32.0 2023-09-28 14:54:52,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:54:53,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:54:57,273 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.968e+02 2.901e+02 3.329e+02 4.085e+02 7.147e+02, threshold=6.657e+02, percent-clipped=2.0 2023-09-28 14:54:57,495 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 14:54:59,246 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 14:54:59,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 14:54:59,321 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 14:55:00,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 14:55:00,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 14:55:03,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:55:03,309 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 14:55:06,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 14:55:06,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=52746.666666666664, ans=0.125 2023-09-28 14:55:07,624 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 14:55:11,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:55:11,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=52813.333333333336, ans=0.2 2023-09-28 14:55:12,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 14:55:14,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 14:55:16,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 14:55:16,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 14:55:19,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 14:55:19,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 14:55:21,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=52813.333333333336, ans=0.0 2023-09-28 14:55:25,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:27,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:27,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:27,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 14:55:28,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=52880.0, ans=0.04949747468305833 2023-09-28 14:55:29,845 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=2.335e-02 2023-09-28 14:55:30,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 14:55:31,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.93 vs. limit=22.5 2023-09-28 14:55:35,710 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 14:55:42,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:55:42,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:55:42,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 14:55:42,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:55:42,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:55:43,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=52946.666666666664, ans=0.0 2023-09-28 14:55:44,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 14:55:47,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:55:47,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:55:49,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=52946.666666666664, ans=0.125 2023-09-28 14:55:50,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:51,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=52946.666666666664, ans=0.95 2023-09-28 14:55:53,044 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 14:55:54,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:55:54,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 14:56:02,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:56:02,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 14:56:04,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 14:56:04,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:56:07,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:07,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:11,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=53013.333333333336, ans=0.125 2023-09-28 14:56:13,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 14:56:13,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:15,311 INFO [train.py:1039] (3/4) Epoch 2, batch 2650, loss[loss=0.3243, simple_loss=0.3532, pruned_loss=0.1477, over 23798.00 frames. ], tot_loss[loss=0.3143, simple_loss=0.3545, pruned_loss=0.1371, over 4725289.02 frames. ], batch size: 164, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:56:15,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:56:15,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=53080.0, ans=0.125 2023-09-28 14:56:19,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 14:56:19,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:20,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 14:56:22,226 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 14:56:22,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:56:22,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten.whitening_limit, batch_count=53080.0, ans=15.0 2023-09-28 14:56:24,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=53080.0, ans=0.2 2023-09-28 14:56:24,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=53080.0, ans=0.0 2023-09-28 14:56:25,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:56:27,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 14:56:29,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:56:30,352 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.97 vs. limit=15.0 2023-09-28 14:56:30,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:56:32,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 14:56:32,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 14:56:32,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:56:35,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 14:56:38,996 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 14:56:42,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:56:45,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 14:56:45,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:56:47,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 14:56:50,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:50,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 14:56:50,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:56:52,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:56:57,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 14:56:57,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 14:56:59,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:03,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 14:57:04,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:05,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:06,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.94 vs. limit=15.0 2023-09-28 14:57:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:06,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:06,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:08,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:57:10,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=53280.0, ans=0.0 2023-09-28 14:57:11,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:12,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:57:13,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:57:13,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 14:57:13,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=53280.0, ans=0.125 2023-09-28 14:57:15,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:16,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:57:16,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:22,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:57:23,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 14:57:27,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:28,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 14:57:28,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:29,664 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.66 vs. limit=15.0 2023-09-28 14:57:30,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 14:57:35,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:57:36,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:37,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=53346.666666666664, ans=0.125 2023-09-28 14:57:38,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:39,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:40,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 14:57:40,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:42,114 INFO [train.py:1039] (3/4) Epoch 2, batch 2700, loss[loss=0.2833, simple_loss=0.3308, pruned_loss=0.1179, over 20133.00 frames. ], tot_loss[loss=0.315, simple_loss=0.3552, pruned_loss=0.1374, over 4727700.28 frames. ], batch size: 44, lr: 3.65e-02, grad_scale: 32.0 2023-09-28 14:57:42,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:57:42,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 14:57:45,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:57:46,776 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.066e+02 2.772e+02 3.228e+02 4.080e+02 7.773e+02, threshold=6.457e+02, percent-clipped=3.0 2023-09-28 14:57:47,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 14:57:48,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:57:50,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:57:50,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 14:57:52,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:57:52,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 14:57:52,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 14:57:52,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 14:57:53,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 14:57:53,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:57:55,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:57:55,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:57:59,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 14:58:00,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 14:58:00,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:09,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 14:58:09,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:14,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 14:58:14,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 14:58:14,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 14:58:14,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 14:58:19,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:22,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:22,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 14:58:22,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:58:29,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:29,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 14:58:37,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=53613.333333333336, ans=22.5 2023-09-28 14:58:38,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 14:58:38,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:58:41,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 14:58:42,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:58:43,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.42 vs. limit=10.0 2023-09-28 14:58:47,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:49,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:58:50,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 14:58:50,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:58:52,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 14:58:52,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:58:52,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=53680.0, ans=0.125 2023-09-28 14:58:56,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 14:58:58,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:58:58,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 14:59:00,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 14:59:00,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=53680.0, ans=0.1 2023-09-28 14:59:02,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:03,497 INFO [train.py:1039] (3/4) Epoch 2, batch 2750, loss[loss=0.3361, simple_loss=0.3536, pruned_loss=0.1594, over 23871.00 frames. ], tot_loss[loss=0.316, simple_loss=0.3558, pruned_loss=0.1382, over 4723928.75 frames. ], batch size: 150, lr: 3.64e-02, grad_scale: 16.0 2023-09-28 14:59:05,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 14:59:05,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 14:59:07,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 14:59:07,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:08,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:08,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:11,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:12,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 14:59:12,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:17,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:18,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 14:59:18,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 14:59:18,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:18,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 14:59:18,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 14:59:18,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 14:59:25,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 14:59:27,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 14:59:27,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:28,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 14:59:29,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 14:59:30,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 14:59:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 14:59:31,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:33,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:39,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 14:59:39,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 14:59:39,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 14:59:40,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:42,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 14:59:48,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 14:59:50,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 14:59:50,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 14:59:55,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 14:59:55,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 14:59:57,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:00:02,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:00:02,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:00:02,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 15:00:08,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:10,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 15:00:16,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:00:18,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:00:18,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 15:00:20,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:00:23,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:00:23,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 15:00:24,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:00:26,434 INFO [train.py:1039] (3/4) Epoch 2, batch 2800, loss[loss=0.3103, simple_loss=0.359, pruned_loss=0.1308, over 24661.00 frames. ], tot_loss[loss=0.3151, simple_loss=0.3547, pruned_loss=0.1378, over 4730915.45 frames. ], batch size: 65, lr: 3.64e-02, grad_scale: 32.0 2023-09-28 15:00:28,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:00:28,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:28,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:00:29,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 15:00:29,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:29,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:31,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:00:33,353 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.006e+02 2.948e+02 3.600e+02 4.282e+02 6.554e+02, threshold=7.201e+02, percent-clipped=1.0 2023-09-28 15:00:33,489 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 15:00:33,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 15:00:34,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.23 vs. limit=15.0 2023-09-28 15:00:36,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:00:38,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:00:38,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:00:42,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:00:43,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 15:00:45,901 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:00:47,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:00:48,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 15:00:50,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:50,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:00:50,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:00:53,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:00:53,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:00:53,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:00:56,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:03,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:01:05,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:10,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:10,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:01:12,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:15,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:15,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 15:01:16,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:17,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=54280.0, ans=0.125 2023-09-28 15:01:18,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:18,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:01:19,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=54280.0, ans=0.1 2023-09-28 15:01:22,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:01:23,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:26,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:01:28,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:01:28,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:01:28,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:01:28,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:01:28,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:01:29,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.75 vs. limit=15.0 2023-09-28 15:01:30,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:01:30,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 15:01:32,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:34,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:01:34,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:01:35,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 15:01:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:01:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:01:38,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:01:39,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 15:01:42,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=54346.666666666664, ans=0.125 2023-09-28 15:01:45,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=54346.666666666664, ans=0.0 2023-09-28 15:01:46,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:01:46,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:01:46,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:01:47,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=54346.666666666664, ans=0.125 2023-09-28 15:01:49,651 INFO [train.py:1039] (3/4) Epoch 2, batch 2850, loss[loss=0.2924, simple_loss=0.3566, pruned_loss=0.1141, over 24650.00 frames. ], tot_loss[loss=0.3144, simple_loss=0.3534, pruned_loss=0.1376, over 4706551.62 frames. ], batch size: 73, lr: 3.63e-02, grad_scale: 32.0 2023-09-28 15:01:49,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:01:55,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=54413.333333333336, ans=0.1 2023-09-28 15:01:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:01:56,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:01:56,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:02:01,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:01,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:02:02,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:02:02,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 15:02:06,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=54480.0, ans=0.125 2023-09-28 15:02:09,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 15:02:09,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:12,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 15:02:13,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:15,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 15:02:17,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 15:02:17,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:17,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=54480.0, ans=0.2 2023-09-28 15:02:21,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=54546.666666666664, ans=0.125 2023-09-28 15:02:31,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:31,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=54546.666666666664, ans=0.125 2023-09-28 15:02:32,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:32,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:02:34,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:02:34,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:02:34,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:02:36,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:02:36,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 15:02:39,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:02:39,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:02:39,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:02:41,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:44,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:44,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:02:45,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:48,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:02:50,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:02:53,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:02:53,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:02:56,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:03:02,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:03:04,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 15:03:04,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 15:03:06,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:03:08,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:08,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 15:03:09,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:03:09,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:10,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:10,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:03:10,953 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 15:03:11,008 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 15:03:11,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:11,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=54746.666666666664, ans=0.95 2023-09-28 15:03:12,360 INFO [train.py:1039] (3/4) Epoch 2, batch 2900, loss[loss=0.3413, simple_loss=0.3652, pruned_loss=0.1587, over 23315.00 frames. ], tot_loss[loss=0.3132, simple_loss=0.3532, pruned_loss=0.1367, over 4708470.07 frames. ], batch size: 119, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:03:12,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:15,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:03:16,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=54746.666666666664, ans=0.09899494936611666 2023-09-28 15:03:18,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:03:19,517 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.086e+02 2.913e+02 3.691e+02 4.538e+02 7.186e+02, threshold=7.382e+02, percent-clipped=0.0 2023-09-28 15:03:19,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 15:03:21,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=54746.666666666664, ans=0.0 2023-09-28 15:03:22,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:22,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 15:03:24,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 15:03:26,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:03:26,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:03:29,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:30,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:03:34,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:03:34,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:03:39,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:03:39,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 15:03:39,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:03:42,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:44,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 15:03:44,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 15:03:47,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:03:47,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 15:03:47,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:03:47,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=54880.0, ans=0.125 2023-09-28 15:03:50,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:03:50,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 15:03:55,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:03:57,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:03:57,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=54880.0, ans=0.1 2023-09-28 15:04:00,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:04:01,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:05,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 15:04:06,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 15:04:06,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:04:09,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:04:12,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 15:04:14,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:04:16,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.22 vs. limit=22.5 2023-09-28 15:04:20,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:04:29,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:04:29,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:04:31,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 15:04:34,434 INFO [train.py:1039] (3/4) Epoch 2, batch 2950, loss[loss=0.33, simple_loss=0.3629, pruned_loss=0.1485, over 23583.00 frames. ], tot_loss[loss=0.3146, simple_loss=0.3543, pruned_loss=0.1375, over 4707189.32 frames. ], batch size: 120, lr: 3.62e-02, grad_scale: 32.0 2023-09-28 15:04:34,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:34,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 15:04:34,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:04:41,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:04:42,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 15:04:44,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:44,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:04:46,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:04:47,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:04:48,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 15:04:49,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 15:04:50,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:04:50,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:04:54,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=55146.666666666664, ans=0.1 2023-09-28 15:04:58,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:04:59,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:01,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:03,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:06,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:06,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:05:08,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:05:08,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:05:10,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=55213.333333333336, ans=0.125 2023-09-28 15:05:11,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 15:05:11,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=55213.333333333336, ans=0.0 2023-09-28 15:05:15,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=55213.333333333336, ans=0.5 2023-09-28 15:05:16,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 15:05:16,839 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 15:05:18,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:05:20,816 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 15:05:22,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 15:05:22,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:05:24,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:05:24,560 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 15:05:24,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:05:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 15:05:27,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:05:27,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:05:30,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:32,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:05:33,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:33,894 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 15:05:33,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:05:35,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 15:05:40,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:42,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:05:42,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 15:05:42,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:05:45,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 15:05:46,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:48,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:05:50,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:05:50,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:05:52,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:05:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:05:54,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:54,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:05:55,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:05:56,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:05:56,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:05:57,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:05:58,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 15:05:59,270 INFO [train.py:1039] (3/4) Epoch 2, batch 3000, loss[loss=0.2969, simple_loss=0.3595, pruned_loss=0.1171, over 24321.00 frames. ], tot_loss[loss=0.3149, simple_loss=0.3548, pruned_loss=0.1375, over 4693910.41 frames. ], batch size: 74, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:05:59,270 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 15:06:14,696 INFO [train.py:1071] (3/4) Epoch 2, validation: loss=0.3279, simple_loss=0.3383, pruned_loss=0.1588, over 1125622.00 frames. 2023-09-28 15:06:14,697 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 15:06:14,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:06:17,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:06:17,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:06:20,855 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.039e+02 2.946e+02 3.548e+02 4.220e+02 7.965e+02, threshold=7.096e+02, percent-clipped=1.0 2023-09-28 15:06:20,954 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 15:06:21,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 15:06:23,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:06:23,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:06:25,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 15:06:25,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:33,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:06:34,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=55480.0, ans=0.125 2023-09-28 15:06:43,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:06:50,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 15:06:52,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:06:55,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:06:55,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:06:55,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:06:58,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:06:58,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 15:07:01,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 15:07:02,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:07:02,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:07:04,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:07:04,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:04,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=55613.333333333336, ans=0.125 2023-09-28 15:07:06,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:06,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:07:09,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:07:10,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:07:10,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:07:13,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:07:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 15:07:16,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:07:16,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:17,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:07:23,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:23,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:24,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:07:24,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 15:07:24,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:07:24,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 15:07:26,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:07:28,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 15:07:31,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:07:32,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.94 vs. limit=15.0 2023-09-28 15:07:34,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:07:34,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 15:07:35,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 15:07:35,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:07:36,379 INFO [train.py:1039] (3/4) Epoch 2, batch 3050, loss[loss=0.3174, simple_loss=0.3685, pruned_loss=0.1331, over 24651.00 frames. ], tot_loss[loss=0.3159, simple_loss=0.3559, pruned_loss=0.138, over 4688943.62 frames. ], batch size: 65, lr: 3.61e-02, grad_scale: 32.0 2023-09-28 15:07:37,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:07:38,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=55746.666666666664, ans=0.1 2023-09-28 15:07:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:07:39,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:07:39,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:39,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:07:41,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 15:07:41,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=55746.666666666664, ans=0.125 2023-09-28 15:07:42,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:07:45,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:07:46,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:07:51,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:07:54,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 15:08:00,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 15:08:02,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 15:08:02,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:04,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:08:04,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=55813.333333333336, ans=0.2 2023-09-28 15:08:07,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:08,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:09,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:13,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:13,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:08:15,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:15,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:08:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:16,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:17,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=55880.0, ans=0.0 2023-09-28 15:08:18,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:20,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=55880.0, ans=0.0 2023-09-28 15:08:21,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:21,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 15:08:21,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:08:21,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:08:24,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:08:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:08:27,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:08:27,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:31,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:08:33,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:36,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.59 vs. limit=22.5 2023-09-28 15:08:40,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:40,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:08:40,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:08:43,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:44,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:08:44,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:08:47,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 15:08:47,321 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=1.134e-02 2023-09-28 15:08:48,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=56013.333333333336, ans=0.2 2023-09-28 15:08:50,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:08:50,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:08:51,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 15:08:53,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:08:57,730 INFO [train.py:1039] (3/4) Epoch 2, batch 3100, loss[loss=0.3345, simple_loss=0.3464, pruned_loss=0.1613, over 19848.00 frames. ], tot_loss[loss=0.315, simple_loss=0.3549, pruned_loss=0.1375, over 4703092.74 frames. ], batch size: 389, lr: 3.60e-02, grad_scale: 32.0 2023-09-28 15:08:59,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:00,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:09:03,851 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.205e+02 2.748e+02 3.065e+02 3.838e+02 6.915e+02, threshold=6.130e+02, percent-clipped=0.0 2023-09-28 15:09:03,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:09:06,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 15:09:09,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 15:09:11,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 15:09:11,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:09:14,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:09:14,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:16,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:09:21,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:27,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 15:09:30,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:09:30,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:32,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:09:32,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:09:33,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:09:35,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:09:35,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 15:09:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:09:38,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:38,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 15:09:40,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:09:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:09:44,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 15:09:44,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 15:09:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:47,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:09:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:09:49,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:49,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:09:52,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:09:52,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:09:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:09:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:09:56,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:09:56,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:09:56,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=56280.0, ans=0.0 2023-09-28 15:10:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:10:04,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 15:10:05,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:10:07,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 15:10:07,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:07,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:07,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 15:10:16,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 15:10:17,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.37 vs. limit=10.0 2023-09-28 15:10:19,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:19,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:10:20,596 INFO [train.py:1039] (3/4) Epoch 2, batch 3150, loss[loss=0.2857, simple_loss=0.2906, pruned_loss=0.1404, over 19221.00 frames. ], tot_loss[loss=0.3134, simple_loss=0.3536, pruned_loss=0.1366, over 4698345.15 frames. ], batch size: 388, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:10:22,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:10:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:10:22,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=56413.333333333336, ans=0.125 2023-09-28 15:10:25,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 15:10:26,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:27,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:10:29,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 15:10:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:30,791 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 15:10:35,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 15:10:35,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:10:36,719 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 15:10:38,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 15:10:39,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 15:10:40,067 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:10:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 15:10:41,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 15:10:41,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:41,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:10:42,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:10:44,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 15:10:46,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:46,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:10:48,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:10:50,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:10:53,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 15:10:54,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=56546.666666666664, ans=0.1 2023-09-28 15:10:55,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:10:58,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:10:59,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:11:00,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 15:11:03,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 15:11:05,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:11:05,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:11:05,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:11:05,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:05,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:11:08,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:11:08,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:11:08,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 15:11:10,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:11:10,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:11,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:11:11,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:11:13,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 15:11:13,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:16,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 15:11:16,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:16,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 15:11:18,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 15:11:18,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:11:20,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:11:22,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 15:11:22,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 15:11:24,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:11:27,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:11:28,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:28,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:11:31,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.17 vs. limit=10.0 2023-09-28 15:11:32,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=56680.0, ans=10.0 2023-09-28 15:11:34,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:11:36,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:40,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 15:11:43,297 INFO [train.py:1039] (3/4) Epoch 2, batch 3200, loss[loss=0.3265, simple_loss=0.353, pruned_loss=0.15, over 23813.00 frames. ], tot_loss[loss=0.3112, simple_loss=0.3515, pruned_loss=0.1355, over 4705761.89 frames. ], batch size: 164, lr: 3.59e-02, grad_scale: 32.0 2023-09-28 15:11:45,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:11:45,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 15:11:49,720 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.102e+02 2.897e+02 3.504e+02 4.245e+02 7.793e+02, threshold=7.007e+02, percent-clipped=2.0 2023-09-28 15:11:49,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:11:51,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:11:51,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 15:11:54,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:12:01,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:12:04,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:12:13,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:12:13,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=56813.333333333336, ans=0.2 2023-09-28 15:12:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 15:12:25,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:12:28,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 15:12:29,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:12:34,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:12:34,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:12:36,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:12:39,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=56946.666666666664, ans=0.125 2023-09-28 15:12:41,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 15:12:42,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 15:12:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 15:12:46,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=56946.666666666664, ans=0.1 2023-09-28 15:12:47,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=15.0 2023-09-28 15:12:47,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 15:12:49,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:12:49,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=57013.333333333336, ans=0.125 2023-09-28 15:12:54,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:55,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.37 vs. limit=15.0 2023-09-28 15:12:56,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:12:56,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:12:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 15:12:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:13:00,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:01,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 15:13:03,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 15:13:03,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 15:13:04,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 15:13:06,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:13:07,642 INFO [train.py:1039] (3/4) Epoch 2, batch 3250, loss[loss=0.3573, simple_loss=0.3533, pruned_loss=0.1807, over 19283.00 frames. ], tot_loss[loss=0.3108, simple_loss=0.3508, pruned_loss=0.1355, over 4708173.72 frames. ], batch size: 388, lr: 3.58e-02, grad_scale: 32.0 2023-09-28 15:13:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:13:10,106 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 15:13:10,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:10,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:11,626 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 15:13:14,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:13:18,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:23,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=21.06 vs. limit=15.0 2023-09-28 15:13:26,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:13:26,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 15:13:26,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=57146.666666666664, ans=0.1 2023-09-28 15:13:28,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:28,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:13:28,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:29,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:29,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:13:34,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:34,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:13:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:36,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:36,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:13:39,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=57213.333333333336, ans=0.125 2023-09-28 15:13:40,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:13:42,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:13:42,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:44,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:13:46,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:13:46,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:13:46,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:13:49,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 15:13:50,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:13:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:13:53,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:13:53,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:13:59,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:14:01,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.33 vs. limit=15.0 2023-09-28 15:14:09,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:11,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:11,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 15:14:11,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:14:11,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:14:11,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:13,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=57346.666666666664, ans=0.125 2023-09-28 15:14:15,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 15:14:16,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 15:14:16,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:14:17,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:19,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:19,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 15:14:21,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:14:24,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:24,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:27,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 15:14:27,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:31,065 INFO [train.py:1039] (3/4) Epoch 2, batch 3300, loss[loss=0.2928, simple_loss=0.3529, pruned_loss=0.1164, over 24542.00 frames. ], tot_loss[loss=0.3114, simple_loss=0.3516, pruned_loss=0.1356, over 4712432.12 frames. ], batch size: 71, lr: 3.58e-02, grad_scale: 16.0 2023-09-28 15:14:31,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:14:31,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 15:14:34,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:14:34,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 15:14:36,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 15:14:37,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 15:14:37,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:14:38,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=57413.333333333336, ans=0.0 2023-09-28 15:14:38,884 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.772e+02 3.522e+02 4.271e+02 9.362e+02, threshold=7.044e+02, percent-clipped=2.0 2023-09-28 15:14:41,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:14:42,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:14:42,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=57413.333333333336, ans=0.125 2023-09-28 15:14:44,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:46,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:14:46,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:14:49,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:51,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:14:54,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 15:14:54,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:14:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:14:56,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:14:56,552 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 15:14:58,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:14:58,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:14:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:14:59,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:14:59,726 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 15:15:03,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:03,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:15:05,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:05,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 15:15:06,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 15:15:06,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:08,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:15:09,885 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 15:15:12,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 15:15:14,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:15:16,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 15:15:19,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=57546.666666666664, ans=0.125 2023-09-28 15:15:20,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:22,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:15:22,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:15:25,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:25,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:25,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:15:27,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:15:27,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=57613.333333333336, ans=0.0 2023-09-28 15:15:29,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:15:29,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:29,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:15:30,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 15:15:32,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 15:15:34,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:15:34,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=57613.333333333336, ans=0.125 2023-09-28 15:15:35,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:15:35,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:39,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:15:39,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:40,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:15:40,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:42,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:15:42,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:15:45,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:15:46,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=57680.0, ans=0.0 2023-09-28 15:15:48,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 15:15:48,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:49,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:15:53,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:15:53,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:15:54,719 INFO [train.py:1039] (3/4) Epoch 2, batch 3350, loss[loss=0.3339, simple_loss=0.3666, pruned_loss=0.1506, over 23451.00 frames. ], tot_loss[loss=0.3128, simple_loss=0.3532, pruned_loss=0.1362, over 4713415.53 frames. ], batch size: 93, lr: 3.57e-02, grad_scale: 16.0 2023-09-28 15:15:54,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:15:56,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:15:56,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:15:59,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:16:01,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:03,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:16:06,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:07,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:16:08,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=57746.666666666664, ans=0.0 2023-09-28 15:16:09,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:09,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:16:11,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 15:16:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 15:16:13,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:16:15,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 15:16:15,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 15:16:16,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:16:18,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:16:18,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:19,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 15:16:19,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:19,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:16:21,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:24,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:25,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:28,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:16:28,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=57880.0, ans=0.125 2023-09-28 15:16:30,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.98 vs. limit=15.0 2023-09-28 15:16:31,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:34,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:34,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:38,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=57880.0, ans=0.2 2023-09-28 15:16:39,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:16:40,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:16:43,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:16:43,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:46,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:16:48,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 15:16:48,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:16:48,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 15:16:48,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:16:50,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 15:16:51,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:16:53,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:17:00,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:01,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 15:17:03,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:03,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=58013.333333333336, ans=0.2 2023-09-28 15:17:04,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:17:04,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=58013.333333333336, ans=0.1 2023-09-28 15:17:06,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:17:12,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:12,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 15:17:12,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:17:12,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:17:14,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:15,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 15:17:16,535 INFO [train.py:1039] (3/4) Epoch 2, batch 3400, loss[loss=0.2994, simple_loss=0.3599, pruned_loss=0.1195, over 24641.00 frames. ], tot_loss[loss=0.3124, simple_loss=0.3532, pruned_loss=0.1358, over 4717095.36 frames. ], batch size: 68, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:17:16,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:17:16,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 15:17:18,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:18,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:17:20,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:17:21,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:17:21,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 15:17:24,561 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.046e+02 2.787e+02 3.091e+02 3.869e+02 5.571e+02, threshold=6.183e+02, percent-clipped=0.0 2023-09-28 15:17:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 15:17:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 15:17:26,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:17:30,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:17:30,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:17:31,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:17:38,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:17:41,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 15:17:47,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:17:50,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:17:50,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:17:51,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 15:17:59,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:18:03,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 15:18:05,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=58280.0, ans=0.125 2023-09-28 15:18:08,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=58280.0, ans=0.125 2023-09-28 15:18:10,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:18:10,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 15:18:10,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:14,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:18:14,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:18:17,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:18:20,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:18:20,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:18:25,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:25,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 15:18:32,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:18:37,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 15:18:38,595 INFO [train.py:1039] (3/4) Epoch 2, batch 3450, loss[loss=0.3417, simple_loss=0.3705, pruned_loss=0.1565, over 23597.00 frames. ], tot_loss[loss=0.3115, simple_loss=0.3517, pruned_loss=0.1357, over 4725270.72 frames. ], batch size: 134, lr: 3.56e-02, grad_scale: 16.0 2023-09-28 15:18:41,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 15:18:41,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:18:43,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:18:43,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 15:18:46,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:18:46,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=58413.333333333336, ans=0.125 2023-09-28 15:18:49,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:18:55,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:18:55,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:18:55,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=58480.0, ans=0.2 2023-09-28 15:18:57,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:18:57,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:18:59,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:06,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 15:19:10,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 15:19:10,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:19:10,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:19:13,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 15:19:21,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:19:25,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:25,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:19:27,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:19:28,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:19:30,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 15:19:30,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:19:32,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:19:35,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:19:35,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=58613.333333333336, ans=10.0 2023-09-28 15:19:37,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 15:19:42,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:19:47,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:19:47,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:52,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:19:57,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:19:57,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:19:57,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:19:58,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:20:01,757 INFO [train.py:1039] (3/4) Epoch 2, batch 3500, loss[loss=0.2787, simple_loss=0.3333, pruned_loss=0.1121, over 24460.00 frames. ], tot_loss[loss=0.3097, simple_loss=0.3503, pruned_loss=0.1346, over 4731251.93 frames. ], batch size: 63, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:20:01,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:06,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:20:06,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 15:20:08,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:20:09,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.839e+02 3.369e+02 4.173e+02 9.194e+02, threshold=6.738e+02, percent-clipped=6.0 2023-09-28 15:20:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:20:13,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:20:13,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 15:20:14,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=58746.666666666664, ans=0.1 2023-09-28 15:20:14,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=58746.666666666664, ans=0.1 2023-09-28 15:20:18,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:20:20,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:20:22,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:20:22,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:24,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:20:24,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:24,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:25,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 15:20:30,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:32,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:20:33,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:38,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:38,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 15:20:38,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:20:43,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:20:45,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:20:46,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:48,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:20:48,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:52,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 15:20:52,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 15:20:52,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 15:20:53,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:20:55,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:20:57,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:20:57,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:21:01,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:21:01,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:21:01,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=58946.666666666664, ans=0.2 2023-09-28 15:21:06,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:07,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 15:21:07,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 15:21:07,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:11,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:12,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:14,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:16,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 15:21:16,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:21:16,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=59013.333333333336, ans=0.125 2023-09-28 15:21:18,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=59013.333333333336, ans=0.125 2023-09-28 15:21:19,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:21:19,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 15:21:23,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 15:21:24,849 INFO [train.py:1039] (3/4) Epoch 2, batch 3550, loss[loss=0.3085, simple_loss=0.344, pruned_loss=0.1365, over 23443.00 frames. ], tot_loss[loss=0.3084, simple_loss=0.349, pruned_loss=0.1339, over 4721316.27 frames. ], batch size: 285, lr: 3.55e-02, grad_scale: 16.0 2023-09-28 15:21:25,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:21:27,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:21:28,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:28,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:31,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:21:34,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=59080.0, ans=0.1 2023-09-28 15:21:41,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:43,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 15:21:44,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.48 vs. limit=15.0 2023-09-28 15:21:45,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=15.0 2023-09-28 15:21:46,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:21:46,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=59146.666666666664, ans=0.09899494936611666 2023-09-28 15:21:47,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:21:49,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:21:51,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:21:51,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:21:54,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:21:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:21:55,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:21:55,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:21:56,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=59213.333333333336, ans=0.125 2023-09-28 15:21:57,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:22:02,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:22:02,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:22:04,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=59213.333333333336, ans=0.125 2023-09-28 15:22:05,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:05,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:05,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:22:05,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 15:22:05,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:07,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 15:22:14,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:14,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:22:15,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=59280.0, ans=10.0 2023-09-28 15:22:15,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=59280.0, ans=0.125 2023-09-28 15:22:16,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:17,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 15:22:18,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:22:19,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 15:22:19,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:22:21,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:22:22,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=59280.0, ans=0.125 2023-09-28 15:22:23,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:22:25,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 15:22:26,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:28,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=59280.0, ans=0.2 2023-09-28 15:22:33,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:22:33,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 15:22:34,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:38,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:22:39,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 15:22:48,090 INFO [train.py:1039] (3/4) Epoch 2, batch 3600, loss[loss=0.3262, simple_loss=0.3562, pruned_loss=0.1481, over 23628.00 frames. ], tot_loss[loss=0.3086, simple_loss=0.3496, pruned_loss=0.1338, over 4728931.74 frames. ], batch size: 256, lr: 3.54e-02, grad_scale: 32.0 2023-09-28 15:22:48,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 15:22:48,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:22:49,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:22:51,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:51,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:22:53,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:22:56,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.024e+02 2.598e+02 2.903e+02 3.548e+02 6.359e+02, threshold=5.806e+02, percent-clipped=0.0 2023-09-28 15:22:57,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:22:59,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:22:59,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=59413.333333333336, ans=0.125 2023-09-28 15:23:01,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:23:01,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:23:02,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:02,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 15:23:08,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:23:09,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:09,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=59480.0, ans=0.125 2023-09-28 15:23:12,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:13,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:15,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:23:15,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:23:15,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 15:23:16,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:23:18,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:23:20,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:23:23,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:25,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:23:25,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:23:26,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 15:23:32,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=59546.666666666664, ans=0.2 2023-09-28 15:23:35,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:23:36,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:23:36,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 15:23:43,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:23:48,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:51,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:23:58,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:23:58,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:23:58,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 15:24:00,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 15:24:01,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 15:24:05,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:24:05,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:24:06,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 15:24:06,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:24:08,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:08,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 15:24:09,692 INFO [train.py:1039] (3/4) Epoch 2, batch 3650, loss[loss=0.2841, simple_loss=0.3468, pruned_loss=0.1107, over 24647.00 frames. ], tot_loss[loss=0.308, simple_loss=0.3495, pruned_loss=0.1333, over 4727981.47 frames. ], batch size: 68, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:24:09,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 15:24:13,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.27 vs. limit=22.5 2023-09-28 15:24:14,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:24:14,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 15:24:19,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 15:24:21,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:24:23,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=59746.666666666664, ans=0.125 2023-09-28 15:24:24,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 15:24:26,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 15:24:31,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:24:31,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:24:33,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:24:36,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 15:24:36,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:24:36,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 15:24:38,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:24:39,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:24:39,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 15:24:40,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:24:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:24:41,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:43,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:24:46,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 15:24:47,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.14 vs. limit=15.0 2023-09-28 15:24:47,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 15:24:47,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:24:49,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 15:24:50,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:24:51,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:24:57,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:24:59,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:24:59,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:25:02,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:25:04,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:25:08,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:25:08,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=59946.666666666664, ans=0.125 2023-09-28 15:25:08,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=59946.666666666664, ans=0.1 2023-09-28 15:25:09,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:11,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:11,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:25:15,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:25:15,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=60013.333333333336, ans=0.125 2023-09-28 15:25:16,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:25:16,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:24,116 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 15:25:27,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:27,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:25:27,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:25:29,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:31,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:25:32,606 INFO [train.py:1039] (3/4) Epoch 2, batch 3700, loss[loss=0.3204, simple_loss=0.3686, pruned_loss=0.1361, over 24686.00 frames. ], tot_loss[loss=0.3075, simple_loss=0.3496, pruned_loss=0.1327, over 4734890.92 frames. ], batch size: 73, lr: 3.53e-02, grad_scale: 32.0 2023-09-28 15:25:32,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:34,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 15:25:34,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:34,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=60080.0, ans=0.125 2023-09-28 15:25:37,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:25:39,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:25:41,572 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.121e+02 2.788e+02 3.403e+02 4.126e+02 8.216e+02, threshold=6.806e+02, percent-clipped=7.0 2023-09-28 15:25:41,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:25:43,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:43,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 15:25:43,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:25:43,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:25:45,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:25:46,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:25:50,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:25:51,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:51,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:25:53,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:25:53,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:25:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:25:58,189 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 15:26:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:26:08,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:26:09,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:26:09,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 15:26:09,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:14,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:15,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 15:26:17,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:18,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:26:23,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:23,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:26:24,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:26:29,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:26:29,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 15:26:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:26:31,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 15:26:36,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:26:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:26:41,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:41,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 15:26:44,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:26:44,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:26:44,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:44,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:26:44,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=60346.666666666664, ans=0.2 2023-09-28 15:26:46,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=60346.666666666664, ans=0.0 2023-09-28 15:26:47,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:26:47,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 15:26:49,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 15:26:50,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:26:50,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:26:52,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:26:53,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=60346.666666666664, ans=0.125 2023-09-28 15:26:54,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:26:55,358 INFO [train.py:1039] (3/4) Epoch 2, batch 3750, loss[loss=0.3255, simple_loss=0.3781, pruned_loss=0.1365, over 24284.00 frames. ], tot_loss[loss=0.3092, simple_loss=0.3514, pruned_loss=0.1335, over 4737558.73 frames. ], batch size: 74, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:26:57,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:26:58,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:27:00,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:02,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 15:27:02,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=60413.333333333336, ans=0.2 2023-09-28 15:27:03,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:27:06,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:27:06,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 15:27:08,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:27:08,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:10,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:27:11,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:27:15,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:17,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:27:20,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:27:24,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:27:27,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:28,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 15:27:30,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:31,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:27:34,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 15:27:40,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 15:27:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:27:41,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:27:43,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:27:48,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:27:50,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:27:52,085 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 15:27:53,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 15:27:57,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:00,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:28:00,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:28:03,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:28:08,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 15:28:10,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:28:13,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:28:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:28:18,195 INFO [train.py:1039] (3/4) Epoch 2, batch 3800, loss[loss=0.3252, simple_loss=0.3608, pruned_loss=0.1448, over 23673.00 frames. ], tot_loss[loss=0.3097, simple_loss=0.3518, pruned_loss=0.1338, over 4716128.63 frames. ], batch size: 149, lr: 3.52e-02, grad_scale: 32.0 2023-09-28 15:28:18,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:28:22,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=60746.666666666664, ans=0.07 2023-09-28 15:28:25,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:28:26,486 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.661e+02 3.070e+02 3.841e+02 5.617e+02, threshold=6.140e+02, percent-clipped=0.0 2023-09-28 15:28:30,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:30,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 15:28:32,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 15:28:35,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:36,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:38,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:28:40,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:28:40,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:40,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:28:40,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=60813.333333333336, ans=0.0 2023-09-28 15:28:41,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:28:41,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:28:43,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:44,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 15:28:47,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=60813.333333333336, ans=0.125 2023-09-28 15:28:49,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 15:28:49,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:28:49,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:28:53,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=60880.0, ans=0.2 2023-09-28 15:28:55,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:28:55,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:28:56,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:28:56,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:28:58,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:28:59,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:29:05,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:29:06,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 15:29:08,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:08,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=60946.666666666664, ans=0.1 2023-09-28 15:29:16,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:22,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:29:24,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 15:29:25,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 15:29:27,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:29:29,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:29:29,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:32,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 15:29:34,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 15:29:34,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 15:29:34,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:36,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:29:41,184 INFO [train.py:1039] (3/4) Epoch 2, batch 3850, loss[loss=0.2845, simple_loss=0.3082, pruned_loss=0.1304, over 22790.00 frames. ], tot_loss[loss=0.3081, simple_loss=0.3492, pruned_loss=0.1335, over 4712276.10 frames. ], batch size: 322, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:29:41,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:29:41,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:29:41,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=61080.0, ans=0.125 2023-09-28 15:29:49,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:29:49,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 15:29:51,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:29:52,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:29:55,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:29:57,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:01,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 15:30:02,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 15:30:09,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:10,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:30:14,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:14,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:30:17,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:19,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:30:19,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:19,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:30:21,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:23,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:30:24,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:24,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:30:24,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 15:30:24,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 15:30:25,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:30:25,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:28,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:28,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:30,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 15:30:31,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 15:30:34,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:37,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 15:30:39,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 15:30:44,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:46,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:30:46,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=61346.666666666664, ans=0.125 2023-09-28 15:30:50,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:30:50,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 15:30:54,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 15:30:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:57,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:30:59,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:31:00,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:31:01,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=22.5 2023-09-28 15:31:01,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:02,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:31:02,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 15:31:03,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:31:03,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 15:31:05,403 INFO [train.py:1039] (3/4) Epoch 2, batch 3900, loss[loss=0.2767, simple_loss=0.3329, pruned_loss=0.1103, over 14199.00 frames. ], tot_loss[loss=0.3071, simple_loss=0.3481, pruned_loss=0.1331, over 4691868.29 frames. ], batch size: 30, lr: 3.51e-02, grad_scale: 32.0 2023-09-28 15:31:06,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:06,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:09,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:31:09,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:10,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:31:12,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:31:12,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:31:13,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:13,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 15:31:14,947 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.111e+02 3.017e+02 3.758e+02 4.866e+02 8.103e+02, threshold=7.517e+02, percent-clipped=9.0 2023-09-28 15:31:15,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:19,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:19,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:31:21,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:31:23,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:31:23,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:23,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=61480.0, ans=0.2 2023-09-28 15:31:25,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:31:25,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 15:31:25,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:29,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 15:31:29,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:31:30,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 15:31:32,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 15:31:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:37,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:31:37,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:31:37,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:31:44,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:31:45,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:31:47,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:31:47,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:31:48,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:31:54,192 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.02 vs. limit=10.0 2023-09-28 15:31:54,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:31:56,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:32:03,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 15:32:06,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:32:07,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=61613.333333333336, ans=0.0 2023-09-28 15:32:15,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:17,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.28 vs. limit=15.0 2023-09-28 15:32:18,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:18,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 15:32:18,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 15:32:18,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:32:21,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 15:32:22,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:32:23,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 15:32:27,656 INFO [train.py:1039] (3/4) Epoch 2, batch 3950, loss[loss=0.3086, simple_loss=0.3436, pruned_loss=0.1368, over 23238.00 frames. ], tot_loss[loss=0.307, simple_loss=0.3484, pruned_loss=0.1328, over 4686674.93 frames. ], batch size: 105, lr: 3.50e-02, grad_scale: 16.0 2023-09-28 15:32:30,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:32:32,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 15:32:33,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:32:36,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:32:36,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:32:42,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-09-28 15:32:42,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 15:32:42,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:42,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 15:32:44,405 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 15:32:44,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:32:47,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:32:47,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:32:48,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=61813.333333333336, ans=0.125 2023-09-28 15:32:51,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 15:32:54,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:32:54,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:32:54,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:32:55,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:32:55,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:33:10,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:33:10,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:33:10,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=61880.0, ans=0.125 2023-09-28 15:33:15,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 15:33:21,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 15:33:21,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 15:33:21,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:33:21,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:33:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:33:31,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:33:31,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:33:31,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:33:32,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 15:33:35,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=62013.333333333336, ans=10.0 2023-09-28 15:33:37,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:33:37,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:33:43,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 15:33:48,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=62013.333333333336, ans=0.125 2023-09-28 15:33:50,910 INFO [train.py:1039] (3/4) Epoch 2, batch 4000, loss[loss=0.3283, simple_loss=0.3551, pruned_loss=0.1508, over 22831.00 frames. ], tot_loss[loss=0.3067, simple_loss=0.3487, pruned_loss=0.1324, over 4700827.07 frames. ], batch size: 322, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:33:53,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:00,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.115e+02 2.667e+02 3.102e+02 3.739e+02 5.797e+02, threshold=6.204e+02, percent-clipped=0.0 2023-09-28 15:34:02,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:05,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:05,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:34:06,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:34:06,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 15:34:08,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:34:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 15:34:08,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:34:08,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 15:34:10,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:34:11,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=62146.666666666664, ans=0.2 2023-09-28 15:34:15,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:34:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:34:15,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:34:15,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:15,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:34:17,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:34:19,424 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 15:34:20,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:34:21,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:21,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=62146.666666666664, ans=0.1 2023-09-28 15:34:24,118 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 15:34:24,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:34:24,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:30,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.08 vs. limit=15.0 2023-09-28 15:34:33,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 15:34:33,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:34:35,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:34:37,104 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 15:34:38,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:34:38,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 15:34:38,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:34:41,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:41,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:34:43,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:34:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:34:45,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:34:47,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 15:34:47,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:34:50,108 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 15:34:53,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:34:56,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 15:34:59,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:34:59,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:01,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:35:01,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:07,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:35:11,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:35:11,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 15:35:12,607 INFO [train.py:1039] (3/4) Epoch 2, batch 4050, loss[loss=0.3043, simple_loss=0.3438, pruned_loss=0.1325, over 23573.00 frames. ], tot_loss[loss=0.3064, simple_loss=0.349, pruned_loss=0.1319, over 4713203.41 frames. ], batch size: 135, lr: 3.49e-02, grad_scale: 32.0 2023-09-28 15:35:14,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:35:14,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:15,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:35:17,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:17,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:22,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:35:24,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:35:25,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 15:35:27,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:35:27,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:35:32,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:34,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:35:37,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 15:35:37,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=62480.0, ans=0.125 2023-09-28 15:35:38,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 15:35:39,575 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 15:35:41,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:35:47,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=62546.666666666664, ans=0.07 2023-09-28 15:35:49,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 15:35:49,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=62546.666666666664, ans=0.0 2023-09-28 15:35:50,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:35:54,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:35:57,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:35:57,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:35:57,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:36:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:36:03,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 15:36:04,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:36:07,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:09,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 15:36:13,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:36:23,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 15:36:25,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:36:25,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:36:25,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 15:36:25,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 15:36:25,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:29,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:36:31,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:31,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:36:35,779 INFO [train.py:1039] (3/4) Epoch 2, batch 4100, loss[loss=0.3145, simple_loss=0.3639, pruned_loss=0.1326, over 24340.00 frames. ], tot_loss[loss=0.3089, simple_loss=0.3512, pruned_loss=0.1333, over 4699641.24 frames. ], batch size: 77, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:36:39,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 15:36:39,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 15:36:42,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 15:36:44,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 15:36:44,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:44,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:44,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:36:45,963 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 15:36:46,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=62746.666666666664, ans=0.125 2023-09-28 15:36:47,352 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.099e+02 2.677e+02 3.262e+02 4.112e+02 6.784e+02, threshold=6.525e+02, percent-clipped=2.0 2023-09-28 15:36:47,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=62746.666666666664, ans=0.125 2023-09-28 15:36:49,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:49,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:36:49,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:36:51,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:36:56,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:36:58,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:36:58,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:36:58,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 15:36:59,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:36:59,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:36:59,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:37:01,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 15:37:03,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=62813.333333333336, ans=0.125 2023-09-28 15:37:06,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:07,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 15:37:07,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:37:12,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:37:12,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 15:37:13,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:37:13,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:37:13,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:37:15,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 15:37:15,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=62880.0, ans=0.125 2023-09-28 15:37:19,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:37:19,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:37:20,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=62880.0, ans=0.125 2023-09-28 15:37:22,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 15:37:23,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:37:23,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:27,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:37:35,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:35,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:37:46,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:37:46,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:37:47,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=63013.333333333336, ans=0.125 2023-09-28 15:37:48,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:37:49,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=63013.333333333336, ans=0.025 2023-09-28 15:37:51,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:37:51,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=63013.333333333336, ans=0.1 2023-09-28 15:37:53,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=63013.333333333336, ans=0.125 2023-09-28 15:37:53,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=63013.333333333336, ans=0.125 2023-09-28 15:37:53,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.91 vs. limit=15.0 2023-09-28 15:37:58,101 INFO [train.py:1039] (3/4) Epoch 2, batch 4150, loss[loss=0.29, simple_loss=0.3256, pruned_loss=0.1272, over 23507.00 frames. ], tot_loss[loss=0.3087, simple_loss=0.3514, pruned_loss=0.133, over 4712227.38 frames. ], batch size: 134, lr: 3.48e-02, grad_scale: 16.0 2023-09-28 15:37:58,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:37:58,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:37:59,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:37:59,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:03,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 15:38:03,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:03,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 15:38:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 15:38:05,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 15:38:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:38:11,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:38:11,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:15,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:17,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:18,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:38:20,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:38:20,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:38:21,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 15:38:25,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:38:25,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=63146.666666666664, ans=0.0 2023-09-28 15:38:30,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:31,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 15:38:35,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 15:38:35,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:38:36,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 15:38:36,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:38:36,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:37,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=63213.333333333336, ans=0.09899494936611666 2023-09-28 15:38:39,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:39,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:38:45,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 15:38:47,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=63280.0, ans=0.0 2023-09-28 15:38:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:38:49,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=63280.0, ans=0.0 2023-09-28 15:38:50,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=63280.0, ans=0.0 2023-09-28 15:38:51,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:38:52,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 15:38:53,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:38:55,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 15:38:55,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:38:55,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=63280.0, ans=0.1 2023-09-28 15:38:58,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:38:59,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:38:59,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 15:38:59,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:38:59,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 15:39:03,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 15:39:06,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 15:39:07,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:07,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:39:07,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:39:07,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 15:39:09,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:39:09,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 15:39:10,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:39:14,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:39:14,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 15:39:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 15:39:19,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:39:21,073 INFO [train.py:1039] (3/4) Epoch 2, batch 4200, loss[loss=0.2833, simple_loss=0.3333, pruned_loss=0.1167, over 24304.00 frames. ], tot_loss[loss=0.3062, simple_loss=0.3491, pruned_loss=0.1317, over 4709525.54 frames. ], batch size: 61, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:39:21,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 15:39:24,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:39:26,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:27,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:39:27,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:27,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:39:30,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 15:39:32,112 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.187e+02 2.868e+02 3.365e+02 4.143e+02 5.998e+02, threshold=6.730e+02, percent-clipped=0.0 2023-09-28 15:39:33,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 15:39:33,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:34,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=63413.333333333336, ans=0.0 2023-09-28 15:39:37,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:39,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:39:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 15:39:43,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.62 vs. limit=22.5 2023-09-28 15:39:46,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:39:46,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:47,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 15:39:47,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:39:49,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:49,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:39:49,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:39:52,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:39:55,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 15:39:55,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:39:59,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 15:40:01,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:40:02,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:40:03,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=63546.666666666664, ans=0.2 2023-09-28 15:40:04,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:07,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:40:07,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 15:40:07,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:08,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:40:14,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 15:40:17,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:22,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:40:25,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 15:40:29,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:40:32,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.32 vs. limit=15.0 2023-09-28 15:40:33,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=63680.0, ans=0.125 2023-09-28 15:40:34,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:40:34,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:37,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 15:40:42,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 15:40:43,863 INFO [train.py:1039] (3/4) Epoch 2, batch 4250, loss[loss=0.3279, simple_loss=0.3578, pruned_loss=0.1489, over 23763.00 frames. ], tot_loss[loss=0.3055, simple_loss=0.3472, pruned_loss=0.1319, over 4713292.86 frames. ], batch size: 179, lr: 3.47e-02, grad_scale: 16.0 2023-09-28 15:40:45,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 15:40:45,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 15:40:47,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:40:55,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:40:55,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 15:40:55,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:40:58,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:01,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.94 vs. limit=15.0 2023-09-28 15:41:01,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:05,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:07,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:08,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:41:08,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:10,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:10,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:11,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:13,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:41:15,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:16,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 15:41:20,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 15:41:21,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:22,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:41:22,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:41:23,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:41:23,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:23,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:41:26,443 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.29 vs. limit=22.5 2023-09-28 15:41:28,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:41:30,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:41:30,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=63880.0, ans=0.125 2023-09-28 15:41:34,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:41:35,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:35,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 15:41:35,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:41:36,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=63946.666666666664, ans=0.125 2023-09-28 15:41:37,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 15:41:39,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:41:40,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:41:43,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:43,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:41:45,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 15:41:46,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:41:48,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:41:51,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:41:55,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:41:56,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:41:59,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:00,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:02,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:42:02,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:02,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 15:42:05,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:09,186 INFO [train.py:1039] (3/4) Epoch 2, batch 4300, loss[loss=0.3045, simple_loss=0.3396, pruned_loss=0.1347, over 23444.00 frames. ], tot_loss[loss=0.3062, simple_loss=0.3479, pruned_loss=0.1322, over 4718236.16 frames. ], batch size: 134, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:42:12,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:42:12,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:15,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:42:19,725 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.044e+02 2.736e+02 3.234e+02 3.981e+02 6.423e+02, threshold=6.467e+02, percent-clipped=0.0 2023-09-28 15:42:21,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=64080.0, ans=0.0 2023-09-28 15:42:22,065 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.88 vs. limit=22.5 2023-09-28 15:42:23,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:42:23,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 15:42:24,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:42:26,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:42:26,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:42:26,289 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 15:42:29,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:42:33,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:42:36,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 15:42:36,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:42:36,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 15:42:40,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:42:42,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:42:45,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:42:45,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:42:47,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:42:48,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:48,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:42:48,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 15:42:49,718 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-09-28 15:42:50,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 15:42:53,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:42:55,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=64213.333333333336, ans=0.2 2023-09-28 15:42:56,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:42:56,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:42:56,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:42:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 15:42:56,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 15:42:57,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 15:42:59,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:00,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 15:43:00,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 15:43:06,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:08,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 15:43:10,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:43:12,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:12,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:43:14,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.59 vs. limit=15.0 2023-09-28 15:43:15,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 15:43:16,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 15:43:16,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:17,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:19,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:19,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:43:22,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:43:22,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=64346.666666666664, ans=0.125 2023-09-28 15:43:23,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=64346.666666666664, ans=0.2 2023-09-28 15:43:25,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:25,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.00 vs. limit=15.0 2023-09-28 15:43:26,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:43:26,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:43:29,646 INFO [train.py:1039] (3/4) Epoch 2, batch 4350, loss[loss=0.2644, simple_loss=0.3137, pruned_loss=0.1076, over 24431.00 frames. ], tot_loss[loss=0.3071, simple_loss=0.349, pruned_loss=0.1326, over 4721496.29 frames. ], batch size: 58, lr: 3.46e-02, grad_scale: 16.0 2023-09-28 15:43:32,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 15:43:34,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 15:43:37,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=64413.333333333336, ans=0.125 2023-09-28 15:43:38,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:43:40,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:44,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:43:44,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:43:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:43:53,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:43:55,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:43:55,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:43:59,077 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.82 vs. limit=22.5 2023-09-28 15:43:59,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:44:02,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:44:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:44:09,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 15:44:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:12,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:16,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:20,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 15:44:22,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:24,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:44:31,238 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 15:44:32,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:32,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:44:32,884 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 15:44:34,365 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 15:44:34,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:34,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:44:35,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:44:37,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:44:37,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:44:38,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:44:40,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 15:44:40,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:40,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:44:40,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:40,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=64680.0, ans=0.1 2023-09-28 15:44:42,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 15:44:42,381 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 15:44:42,389 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 15:44:42,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 15:44:46,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:44:46,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:44:46,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:44:48,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:44:49,768 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-09-28 15:44:50,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 15:44:51,888 INFO [train.py:1039] (3/4) Epoch 2, batch 4400, loss[loss=0.285, simple_loss=0.3333, pruned_loss=0.1183, over 24660.00 frames. ], tot_loss[loss=0.3089, simple_loss=0.3502, pruned_loss=0.1338, over 4721267.28 frames. ], batch size: 65, lr: 3.45e-02, grad_scale: 32.0 2023-09-28 15:44:52,103 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 15:44:52,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:55,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=64746.666666666664, ans=0.125 2023-09-28 15:44:56,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:44:56,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:44:58,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:45:00,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 15:45:02,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 15:45:02,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 15:45:02,424 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 15:45:03,800 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.169e+02 2.849e+02 3.157e+02 3.871e+02 7.582e+02, threshold=6.315e+02, percent-clipped=2.0 2023-09-28 15:45:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 15:45:03,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:45:05,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 15:45:08,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:10,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:10,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 15:45:13,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:13,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 15:45:13,323 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 15:45:15,598 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.87 vs. limit=15.0 2023-09-28 15:45:17,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 15:45:17,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 15:45:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 15:45:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:19,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:45:20,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:22,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 15:45:22,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 15:45:22,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:26,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:45:26,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:45:27,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:29,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:45:29,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 15:45:29,435 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 15:45:33,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:45:40,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:45:43,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 15:45:48,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:45:51,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:45:54,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:45:54,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 15:45:54,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:45:54,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:45:54,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:45:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:46:01,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 15:46:04,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 15:46:05,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 15:46:05,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:05,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 15:46:07,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:46:12,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:46:14,482 INFO [train.py:1039] (3/4) Epoch 2, batch 4450, loss[loss=0.3054, simple_loss=0.3659, pruned_loss=0.1225, over 24468.00 frames. ], tot_loss[loss=0.3094, simple_loss=0.3511, pruned_loss=0.1338, over 4717071.38 frames. ], batch size: 69, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:46:14,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 15:46:17,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:46:20,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:22,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:46:23,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.44 vs. limit=15.0 2023-09-28 15:46:24,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=65080.0, ans=0.125 2023-09-28 15:46:29,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:46:29,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:46:34,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:37,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:46:38,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=65146.666666666664, ans=0.07 2023-09-28 15:46:39,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.71 vs. limit=15.0 2023-09-28 15:46:40,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:46:41,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:42,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 15:46:42,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:42,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:46:43,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:46:43,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 15:46:46,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:46:50,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:51,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:46:53,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:46:53,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:46:55,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:46:58,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 15:46:59,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 15:46:59,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 15:46:59,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:47:01,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:02,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 15:47:07,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:47:07,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=65280.0, ans=0.0 2023-09-28 15:47:07,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=65280.0, ans=0.125 2023-09-28 15:47:09,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:10,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=65280.0, ans=0.125 2023-09-28 15:47:11,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 15:47:11,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:11,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:11,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:47:11,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:47:14,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:47:19,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 15:47:21,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 15:47:23,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:47:26,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:47:26,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:47:28,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:28,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:47:31,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:47:35,443 INFO [train.py:1039] (3/4) Epoch 2, batch 4500, loss[loss=0.307, simple_loss=0.3404, pruned_loss=0.1368, over 23829.00 frames. ], tot_loss[loss=0.309, simple_loss=0.3511, pruned_loss=0.1334, over 4721651.44 frames. ], batch size: 195, lr: 3.44e-02, grad_scale: 32.0 2023-09-28 15:47:35,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 15:47:37,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:47:41,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:43,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 15:47:43,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 15:47:45,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:47:46,457 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.128e+02 2.918e+02 3.364e+02 4.065e+02 7.320e+02, threshold=6.729e+02, percent-clipped=3.0 2023-09-28 15:47:50,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:47:50,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:47:51,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 15:47:53,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:47:53,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:47:53,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:06,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:48:06,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:48:09,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:11,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:48:11,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:48:16,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:48:22,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:48:26,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:48:28,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:48:29,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 15:48:30,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:31,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:48:33,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:48:33,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=65613.33333333333, ans=0.125 2023-09-28 15:48:36,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:48:36,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 15:48:36,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 15:48:36,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:40,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.01 vs. limit=22.5 2023-09-28 15:48:42,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:48:42,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 15:48:43,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=65680.0, ans=0.5 2023-09-28 15:48:43,627 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.18 vs. limit=15.0 2023-09-28 15:48:44,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:48:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:48:46,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:48:47,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 15:48:49,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 15:48:51,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 15:48:53,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=65680.0, ans=0.125 2023-09-28 15:48:56,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 15:48:57,852 INFO [train.py:1039] (3/4) Epoch 2, batch 4550, loss[loss=0.2969, simple_loss=0.3138, pruned_loss=0.14, over 22815.00 frames. ], tot_loss[loss=0.3073, simple_loss=0.3489, pruned_loss=0.1328, over 4718440.68 frames. ], batch size: 322, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:48:57,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 15:48:58,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:01,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:01,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:49:06,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:11,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:49:13,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:49:16,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:16,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:49:16,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:19,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:19,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:49:21,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=65813.33333333333, ans=0.2 2023-09-28 15:49:23,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:49:26,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 15:49:27,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 15:49:27,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 15:49:29,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 15:49:33,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 15:49:33,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:37,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 15:49:38,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:49:41,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:42,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 15:49:44,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 15:49:48,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:49:48,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=65946.66666666667, ans=0.125 2023-09-28 15:49:49,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:49:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:49:51,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:52,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 15:49:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 15:49:54,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:49:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 15:49:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 15:49:58,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:49:58,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:49:58,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:00,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:01,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:50:01,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 15:50:03,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 15:50:05,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=66013.33333333333, ans=0.125 2023-09-28 15:50:06,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:50:06,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:50:06,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 15:50:06,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:50:06,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 15:50:10,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:50:11,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:50:13,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:50:13,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:50:14,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 15:50:17,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:50:18,873 INFO [train.py:1039] (3/4) Epoch 2, batch 4600, loss[loss=0.2878, simple_loss=0.2881, pruned_loss=0.1437, over 19284.00 frames. ], tot_loss[loss=0.3053, simple_loss=0.3467, pruned_loss=0.1319, over 4700013.44 frames. ], batch size: 388, lr: 3.43e-02, grad_scale: 32.0 2023-09-28 15:50:19,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 15:50:20,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:22,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:50:25,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:50:25,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:50:26,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:28,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 15:50:28,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=66080.0, ans=0.05 2023-09-28 15:50:29,711 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.215e+02 2.622e+02 3.070e+02 3.813e+02 6.355e+02, threshold=6.141e+02, percent-clipped=0.0 2023-09-28 15:50:30,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:50:33,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:50:33,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:34,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=66146.66666666667, ans=0.0 2023-09-28 15:50:35,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:43,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 15:50:45,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:48,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:50:48,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_na.min_abs, batch_count=66146.66666666667, ans=0.02 2023-09-28 15:50:51,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:50:51,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:50:51,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=66213.33333333333, ans=0.1 2023-09-28 15:50:53,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=66213.33333333333, ans=0.2 2023-09-28 15:50:56,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=66213.33333333333, ans=0.125 2023-09-28 15:50:57,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 15:50:57,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:50:57,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:04,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:05,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:51:06,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:51:11,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 15:51:12,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 15:51:16,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:16,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:51:20,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:20,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 15:51:20,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:22,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 15:51:22,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:23,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:23,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=66346.66666666667, ans=0.1 2023-09-28 15:51:24,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:51:26,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:51:27,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:27,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 15:51:29,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 15:51:29,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 15:51:29,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:32,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:32,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:33,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:51:34,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=66346.66666666667, ans=0.125 2023-09-28 15:51:41,406 INFO [train.py:1039] (3/4) Epoch 2, batch 4650, loss[loss=0.3075, simple_loss=0.3428, pruned_loss=0.1361, over 23848.00 frames. ], tot_loss[loss=0.304, simple_loss=0.3458, pruned_loss=0.1311, over 4710761.92 frames. ], batch size: 164, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:51:44,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:51:45,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=66413.33333333333, ans=0.2 2023-09-28 15:51:47,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:51:49,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:49,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:51:50,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:51:50,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:51:50,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:51:56,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 15:51:58,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:51:59,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 15:51:59,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:52:01,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 15:52:01,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:52:01,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=66480.0, ans=0.125 2023-09-28 15:52:02,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 15:52:02,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 15:52:02,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:04,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:52:07,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 15:52:08,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:09,719 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 15:52:14,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:16,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 15:52:18,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:18,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:52:19,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 15:52:21,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:52:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:52:29,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:33,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:36,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:52:36,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:52:39,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 15:52:39,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 15:52:39,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 15:52:39,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 15:52:43,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:52:48,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:52:48,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:52:48,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 15:52:50,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:52:51,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:51,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:52:53,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 15:52:55,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 15:52:55,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:52:55,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:52:58,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=66680.0, ans=0.0 2023-09-28 15:53:01,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:01,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:53:01,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 15:53:03,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 15:53:03,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 15:53:05,189 INFO [train.py:1039] (3/4) Epoch 2, batch 4700, loss[loss=0.3254, simple_loss=0.376, pruned_loss=0.1374, over 24564.00 frames. ], tot_loss[loss=0.3052, simple_loss=0.3473, pruned_loss=0.1315, over 4717481.47 frames. ], batch size: 71, lr: 3.42e-02, grad_scale: 32.0 2023-09-28 15:53:06,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 15:53:10,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.39 vs. limit=12.0 2023-09-28 15:53:14,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:15,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.037e+02 2.802e+02 3.291e+02 3.873e+02 6.346e+02, threshold=6.582e+02, percent-clipped=1.0 2023-09-28 15:53:15,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:53:16,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:53:18,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:19,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 15:53:26,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 15:53:26,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 15:53:29,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:30,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:53:30,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:53:34,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:53:40,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:53:41,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 15:53:44,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:53:52,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 15:53:54,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 15:53:55,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:53:59,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 15:54:00,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:02,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=66946.66666666667, ans=0.0 2023-09-28 15:54:05,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:54:06,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.95 vs. limit=6.0 2023-09-28 15:54:06,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 15:54:07,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:08,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:10,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:54:10,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:54:10,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 15:54:12,395 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 15:54:13,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:14,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.17 vs. limit=15.0 2023-09-28 15:54:17,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:17,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 15:54:18,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:54:20,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=67013.33333333333, ans=0.125 2023-09-28 15:54:22,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 15:54:23,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:54:24,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:27,487 INFO [train.py:1039] (3/4) Epoch 2, batch 4750, loss[loss=0.3256, simple_loss=0.3686, pruned_loss=0.1413, over 23438.00 frames. ], tot_loss[loss=0.3052, simple_loss=0.3476, pruned_loss=0.1314, over 4717741.23 frames. ], batch size: 93, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:54:31,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:31,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:54:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 15:54:34,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:54:38,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 15:54:40,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:54:40,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:54:41,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:44,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=67146.66666666667, ans=0.125 2023-09-28 15:54:47,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 15:54:51,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 15:54:52,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 15:54:54,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:54:55,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:55,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:54:57,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:54:59,133 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 15:54:59,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 15:54:59,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=67213.33333333333, ans=0.0 2023-09-28 15:55:04,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 15:55:06,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:07,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:10,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:55:10,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 15:55:10,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:12,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 15:55:17,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 15:55:18,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 15:55:18,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 15:55:20,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:55:20,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:55:20,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:55:22,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=67280.0, ans=15.0 2023-09-28 15:55:23,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 15:55:23,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 15:55:26,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 15:55:28,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:55:31,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:33,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:55:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 15:55:33,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:55:35,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:55:36,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 15:55:38,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:38,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 15:55:40,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=67346.66666666667, ans=0.125 2023-09-28 15:55:43,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:43,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 15:55:44,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 15:55:46,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 15:55:48,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 15:55:48,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:55:49,536 INFO [train.py:1039] (3/4) Epoch 2, batch 4800, loss[loss=0.2629, simple_loss=0.3146, pruned_loss=0.1056, over 24308.00 frames. ], tot_loss[loss=0.3058, simple_loss=0.3484, pruned_loss=0.1316, over 4715977.37 frames. ], batch size: 61, lr: 3.41e-02, grad_scale: 32.0 2023-09-28 15:55:51,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 15:55:56,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:55:57,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:01,841 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.159e+02 2.813e+02 3.481e+02 4.018e+02 6.093e+02, threshold=6.961e+02, percent-clipped=0.0 2023-09-28 15:56:03,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 15:56:05,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:05,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 15:56:07,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=67480.0, ans=0.125 2023-09-28 15:56:08,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:56:08,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 15:56:11,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 15:56:15,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:15,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.63 vs. limit=15.0 2023-09-28 15:56:18,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:18,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 15:56:20,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:20,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 15:56:20,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:21,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:24,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:56:27,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 15:56:29,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:56:30,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=67546.66666666667, ans=0.2 2023-09-28 15:56:32,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 15:56:33,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:33,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=67546.66666666667, ans=0.1 2023-09-28 15:56:34,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 15:56:36,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 15:56:37,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:56:37,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:56:37,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:56:37,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:37,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:56:39,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:56:41,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:56:46,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:56:48,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:50,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:56:55,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 15:56:55,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:56:57,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:56:57,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:56:57,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=67680.0, ans=0.0 2023-09-28 15:56:58,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:02,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:57:04,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 15:57:04,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:04,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 15:57:05,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 15:57:05,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 15:57:10,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:10,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:10,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:57:11,585 INFO [train.py:1039] (3/4) Epoch 2, batch 4850, loss[loss=0.2936, simple_loss=0.3283, pruned_loss=0.1294, over 23762.00 frames. ], tot_loss[loss=0.3058, simple_loss=0.3481, pruned_loss=0.1317, over 4714031.24 frames. ], batch size: 164, lr: 3.40e-02, grad_scale: 32.0 2023-09-28 15:57:11,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 15:57:14,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 15:57:14,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:14,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:57:15,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:15,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:18,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:57:24,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=67746.66666666667, ans=0.125 2023-09-28 15:57:26,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 15:57:26,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=67746.66666666667, ans=0.0 2023-09-28 15:57:27,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:33,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:33,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 15:57:33,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:57:38,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=67813.33333333333, ans=0.125 2023-09-28 15:57:40,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 15:57:40,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 15:57:41,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 15:57:41,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 15:57:46,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:57:47,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 15:57:47,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 15:57:48,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=67880.0, ans=0.125 2023-09-28 15:57:49,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 15:57:49,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 15:57:49,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=67880.0, ans=0.125 2023-09-28 15:57:52,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 15:57:52,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:57:56,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 15:57:56,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 15:57:58,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 15:57:59,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=24.90 vs. limit=15.0 2023-09-28 15:58:06,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:58:06,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 15:58:08,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 15:58:08,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 15:58:11,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 15:58:14,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 15:58:14,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:16,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 15:58:16,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:16,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:18,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 15:58:25,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:58:31,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 15:58:31,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:34,400 INFO [train.py:1039] (3/4) Epoch 2, batch 4900, loss[loss=0.3105, simple_loss=0.3198, pruned_loss=0.1506, over 19363.00 frames. ], tot_loss[loss=0.3057, simple_loss=0.3476, pruned_loss=0.1319, over 4691551.89 frames. ], batch size: 388, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:58:37,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 15:58:37,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 15:58:37,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=68080.0, ans=0.125 2023-09-28 15:58:42,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:58:42,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:42,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:58:46,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 15:58:47,477 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.052e+02 2.694e+02 3.057e+02 3.718e+02 7.972e+02, threshold=6.114e+02, percent-clipped=1.0 2023-09-28 15:58:50,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 15:58:54,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 15:58:55,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 15:58:57,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:58:57,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 15:58:57,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:58:57,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:58:57,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 15:58:59,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 15:59:03,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 15:59:03,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 15:59:06,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 15:59:06,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 15:59:09,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 15:59:09,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:10,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:10,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 15:59:12,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 15:59:14,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 15:59:14,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 15:59:14,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 15:59:17,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 15:59:20,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 15:59:22,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 15:59:22,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 15:59:22,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=68280.0, ans=0.1 2023-09-28 15:59:23,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 15:59:23,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 15:59:25,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 15:59:25,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 15:59:26,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:28,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 15:59:31,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 15:59:35,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 15:59:35,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 15:59:37,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 15:59:38,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 15:59:40,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=68346.66666666667, ans=0.125 2023-09-28 15:59:40,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=68346.66666666667, ans=0.125 2023-09-28 15:59:44,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:46,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 15:59:47,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 15:59:47,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 15:59:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 15:59:50,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 15:59:53,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=68346.66666666667, ans=0.0 2023-09-28 15:59:54,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 15:59:54,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 15:59:54,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 15:59:54,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 15:59:55,940 INFO [train.py:1039] (3/4) Epoch 2, batch 4950, loss[loss=0.3051, simple_loss=0.3398, pruned_loss=0.1352, over 23863.00 frames. ], tot_loss[loss=0.3042, simple_loss=0.3461, pruned_loss=0.1311, over 4695478.03 frames. ], batch size: 212, lr: 3.39e-02, grad_scale: 32.0 2023-09-28 15:59:57,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:00:00,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:00,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:00:02,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=68413.33333333333, ans=0.125 2023-09-28 16:00:03,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 16:00:03,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 16:00:03,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:00:05,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 16:00:05,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:05,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:00:06,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:00:08,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:10,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:00:13,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:00:15,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:00:16,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:18,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:00:19,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:00:25,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:26,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:00:28,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:29,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:31,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:00:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 16:00:33,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 16:00:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:36,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=68546.66666666667, ans=0.0 2023-09-28 16:00:37,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:00:37,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:00:38,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:00:38,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:00:38,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=68546.66666666667, ans=0.1 2023-09-28 16:00:39,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:00:41,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:00:45,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:00:47,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:00:48,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:00:50,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:00:50,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 16:00:51,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:00:53,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:00:58,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:00:58,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=68613.33333333333, ans=0.1 2023-09-28 16:01:00,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:01:01,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:01:01,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:01,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:01:02,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.69 vs. limit=15.0 2023-09-28 16:01:03,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:01:04,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:01:04,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=68680.0, ans=0.0 2023-09-28 16:01:05,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:01:06,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:01:06,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=68680.0, ans=10.0 2023-09-28 16:01:07,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 16:01:10,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:15,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 16:01:15,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:01:16,658 INFO [train.py:1039] (3/4) Epoch 2, batch 5000, loss[loss=0.3012, simple_loss=0.358, pruned_loss=0.1222, over 24575.00 frames. ], tot_loss[loss=0.3027, simple_loss=0.3457, pruned_loss=0.1299, over 4709062.52 frames. ], batch size: 71, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:01:22,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:01:22,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:24,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 16:01:25,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 16:01:27,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:01:30,660 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.879e+02 2.855e+02 3.346e+02 4.050e+02 6.399e+02, threshold=6.691e+02, percent-clipped=1.0 2023-09-28 16:01:30,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 16:01:30,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:01:31,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:01:33,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 16:01:33,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:34,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:01:34,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.73 vs. limit=15.0 2023-09-28 16:01:35,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 16:01:35,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:36,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:01:38,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 16:01:38,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 16:01:38,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:01:40,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 16:01:40,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:01:40,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:41,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:01:41,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 16:01:41,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 16:01:42,003 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:01:44,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 16:01:44,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:01:45,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 16:01:48,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:01:51,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:01:52,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:01:54,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:01:56,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 16:01:58,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:01:58,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:02:03,472 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 16:02:08,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:02:10,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:02:10,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:13,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 16:02:13,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:02:13,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:15,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:16,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 16:02:17,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:19,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:02:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:27,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 16:02:27,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=69013.33333333333, ans=0.035 2023-09-28 16:02:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:39,041 INFO [train.py:1039] (3/4) Epoch 2, batch 5050, loss[loss=0.3109, simple_loss=0.3488, pruned_loss=0.1365, over 23468.00 frames. ], tot_loss[loss=0.303, simple_loss=0.3465, pruned_loss=0.1298, over 4717904.36 frames. ], batch size: 134, lr: 3.38e-02, grad_scale: 32.0 2023-09-28 16:02:41,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:02:41,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:42,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:02:42,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:42,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:02:42,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:02:43,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:02:47,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 16:02:47,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:02:48,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.90 vs. limit=22.5 2023-09-28 16:02:51,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:02:52,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=69080.0, ans=0.1 2023-09-28 16:02:53,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:02:53,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 16:02:53,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:02:55,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:02:55,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=69146.66666666667, ans=0.0 2023-09-28 16:02:56,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:02:58,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:02:59,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:03:01,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.88 vs. limit=22.5 2023-09-28 16:03:09,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 16:03:09,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:03:09,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:11,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 16:03:11,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:13,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:14,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:03:14,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:03:14,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 16:03:16,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 16:03:17,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:19,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:23,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:03:23,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 16:03:24,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:28,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 16:03:28,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:03:28,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:03:30,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:31,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:03:32,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:03:34,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=69280.0, ans=0.0 2023-09-28 16:03:34,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.21 vs. limit=6.0 2023-09-28 16:03:35,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:03:35,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:35,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:03:35,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:03:35,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 16:03:37,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:03:39,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:03:43,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:03:43,381 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 16:03:43,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:03:44,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:03:46,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:46,332 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 16:03:49,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.44 vs. limit=6.0 2023-09-28 16:03:50,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:03:50,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 16:03:50,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:53,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:03:55,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:03:55,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 16:03:56,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 16:03:57,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=69346.66666666667, ans=0.0 2023-09-28 16:04:00,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:00,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:00,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:04:01,412 INFO [train.py:1039] (3/4) Epoch 2, batch 5100, loss[loss=0.334, simple_loss=0.3636, pruned_loss=0.1522, over 23585.00 frames. ], tot_loss[loss=0.3053, simple_loss=0.3479, pruned_loss=0.1313, over 4704627.95 frames. ], batch size: 149, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:04:03,204 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 16:04:04,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:04:10,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 16:04:10,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 16:04:10,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:13,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:04:15,245 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.986e+02 2.824e+02 3.084e+02 3.697e+02 6.472e+02, threshold=6.168e+02, percent-clipped=0.0 2023-09-28 16:04:17,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:04:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 16:04:17,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 16:04:24,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:04:24,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:04:26,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.03 vs. limit=15.0 2023-09-28 16:04:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:04:32,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 16:04:32,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:32,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:04:32,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 16:04:35,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:36,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 16:04:39,816 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 16:04:39,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:04:41,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 16:04:41,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 16:04:46,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:04:54,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-28 16:04:55,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:04:59,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 16:04:59,779 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 16:04:59,801 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 16:05:01,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 16:05:01,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:05:04,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 16:05:07,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 16:05:07,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=69680.0, ans=0.0 2023-09-28 16:05:08,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.45 vs. limit=6.0 2023-09-28 16:05:10,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 16:05:10,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=69680.0, ans=0.125 2023-09-28 16:05:12,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:05:15,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 16:05:17,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:05:18,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 16:05:23,788 INFO [train.py:1039] (3/4) Epoch 2, batch 5150, loss[loss=0.2953, simple_loss=0.3511, pruned_loss=0.1198, over 24639.00 frames. ], tot_loss[loss=0.3036, simple_loss=0.3477, pruned_loss=0.1298, over 4730436.10 frames. ], batch size: 68, lr: 3.37e-02, grad_scale: 32.0 2023-09-28 16:05:25,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:05:25,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:05:25,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:05:25,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:05:25,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:05:27,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:05:30,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 16:05:30,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 16:05:30,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 16:05:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:05:30,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 16:05:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:32,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=69746.66666666667, ans=0.125 2023-09-28 16:05:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:05:35,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:37,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:05:41,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:05:41,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 16:05:44,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:05:45,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:05:47,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:05:47,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:05:47,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:05:48,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.60 vs. limit=15.0 2023-09-28 16:05:48,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:05:48,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:05:48,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 16:05:50,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:05:50,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=69813.33333333333, ans=0.95 2023-09-28 16:05:52,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:05:53,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:05:54,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 16:05:56,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:06:02,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:06:02,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 16:06:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:13,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:16,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:18,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:20,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 16:06:25,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:06:27,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:06:27,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:06:27,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=15.46 vs. limit=22.5 2023-09-28 16:06:30,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:06:30,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:06:32,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 16:06:37,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:06:39,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:06:42,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:06:42,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:06:43,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:06:43,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:06:43,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:06:43,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:06:45,229 INFO [train.py:1039] (3/4) Epoch 2, batch 5200, loss[loss=0.3037, simple_loss=0.3343, pruned_loss=0.1365, over 23686.00 frames. ], tot_loss[loss=0.3056, simple_loss=0.3489, pruned_loss=0.1311, over 4716687.71 frames. ], batch size: 232, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:06:48,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:06:48,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:06:53,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:06:55,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 16:06:56,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.04 vs. limit=10.0 2023-09-28 16:06:57,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:06:58,450 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.056e+02 2.942e+02 3.378e+02 4.176e+02 6.037e+02, threshold=6.756e+02, percent-clipped=0.0 2023-09-28 16:06:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:00,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:02,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:07:02,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:04,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 16:07:07,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:07:08,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:12,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 16:07:14,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:07:14,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:07:16,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 16:07:17,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 16:07:19,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=70213.33333333333, ans=0.1 2023-09-28 16:07:20,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 16:07:22,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:22,289 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 16:07:22,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:07:23,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:23,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:07:25,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 16:07:25,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:07:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:32,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 16:07:32,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 16:07:32,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 16:07:36,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 16:07:37,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:07:39,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=70280.0, ans=0.1 2023-09-28 16:07:42,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:07:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:07:44,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=70280.0, ans=0.125 2023-09-28 16:07:45,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 16:07:47,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:07:47,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:07:47,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:07:47,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:07:52,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:07:53,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:07:55,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:07:58,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:07:58,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:00,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=70346.66666666667, ans=0.0 2023-09-28 16:08:03,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:04,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 16:08:06,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:08:06,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:08:08,683 INFO [train.py:1039] (3/4) Epoch 2, batch 5250, loss[loss=0.2935, simple_loss=0.354, pruned_loss=0.1165, over 24665.00 frames. ], tot_loss[loss=0.3046, simple_loss=0.3471, pruned_loss=0.131, over 4702606.13 frames. ], batch size: 73, lr: 3.36e-02, grad_scale: 32.0 2023-09-28 16:08:08,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:08,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:08:10,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:08:12,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:08:16,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:16,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:08:18,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:08:25,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:08:26,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:08:28,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:08:30,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:08:33,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 16:08:33,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:08:34,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:08:48,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=70546.66666666667, ans=0.125 2023-09-28 16:09:16,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=70680.0, ans=0.07 2023-09-28 16:09:19,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=70680.0, ans=0.1 2023-09-28 16:09:21,770 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:09:22,854 INFO [train.py:1039] (3/4) Epoch 2, batch 5300, loss[loss=0.2649, simple_loss=0.3195, pruned_loss=0.1052, over 24581.00 frames. ], tot_loss[loss=0.3034, simple_loss=0.3457, pruned_loss=0.1305, over 4701863.98 frames. ], batch size: 60, lr: 3.35e-02, grad_scale: 32.0 2023-09-28 16:09:32,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=70746.66666666667, ans=0.125 2023-09-28 16:09:34,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.707e+02 3.072e+02 3.599e+02 7.324e+02, threshold=6.143e+02, percent-clipped=3.0 2023-09-28 16:09:37,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:09:37,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 16:09:37,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 16:09:38,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:38,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:38,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:38,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:38,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:09:38,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:38,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:09:39,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:09:39,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 16:09:39,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 16:09:40,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 16:09:40,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:09:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 16:09:40,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 16:09:40,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:41,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:41,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:41,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:41,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:09:41,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:41,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:09:41,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:42,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:09:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:09:42,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:09:42,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:42,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:09:43,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 16:09:43,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:09:44,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:09:44,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 16:09:44,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 16:09:44,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:09:44,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:09:44,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 16:09:44,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 16:09:44,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:45,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:09:45,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:09:45,888 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 16:09:46,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 16:09:46,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:09:46,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:09:46,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 16:09:46,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 16:09:46,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 16:09:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:09:56,584 INFO [train.py:1039] (3/4) Epoch 3, batch 0, loss[loss=0.3244, simple_loss=0.3573, pruned_loss=0.1458, over 23403.00 frames. ], tot_loss[loss=0.3244, simple_loss=0.3573, pruned_loss=0.1458, over 23403.00 frames. ], batch size: 285, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:09:56,585 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 16:10:11,595 INFO [train.py:1071] (3/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3654, pruned_loss=0.2147, over 1125622.00 frames. 2023-09-28 16:10:11,596 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 16:10:14,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 16:10:15,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=70826.66666666667, ans=0.0 2023-09-28 16:10:16,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:10:17,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:10:19,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=70826.66666666667, ans=0.2 2023-09-28 16:10:23,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:23,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:10:23,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:23,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 16:10:26,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 16:10:29,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:31,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:35,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:10:36,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:37,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=70893.33333333333, ans=0.1 2023-09-28 16:10:38,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:10:38,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:39,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 16:10:40,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.13 vs. limit=10.0 2023-09-28 16:10:42,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-09-28 16:10:44,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:10:51,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:10:51,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:10:53,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 16:10:55,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=70960.0, ans=0.125 2023-09-28 16:10:58,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:10:58,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:10:58,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:11:04,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=71026.66666666667, ans=0.0 2023-09-28 16:11:04,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.49 vs. limit=12.0 2023-09-28 16:11:05,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:05,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=71026.66666666667, ans=0.1 2023-09-28 16:11:09,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 16:11:12,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 16:11:12,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:12,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:15,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:11:15,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:11:16,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=71093.33333333333, ans=0.125 2023-09-28 16:11:17,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 16:11:19,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:21,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:11:24,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:11:30,038 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 16:11:31,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:11:32,938 INFO [train.py:1039] (3/4) Epoch 3, batch 50, loss[loss=0.2915, simple_loss=0.3537, pruned_loss=0.1146, over 24311.00 frames. ], tot_loss[loss=0.3061, simple_loss=0.35, pruned_loss=0.1311, over 1067574.70 frames. ], batch size: 74, lr: 3.18e-02, grad_scale: 32.0 2023-09-28 16:11:33,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:36,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:11:36,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 16:11:37,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:11:37,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:11:38,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.08 vs. limit=22.5 2023-09-28 16:11:39,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:39,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:11:43,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:11:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 16:11:45,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:51,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:11:52,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 16:11:54,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 16:11:56,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=71226.66666666667, ans=0.1 2023-09-28 16:11:57,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:11:57,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=71226.66666666667, ans=0.125 2023-09-28 16:11:59,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:11:59,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:11:59,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=71226.66666666667, ans=0.2 2023-09-28 16:12:01,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:02,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=71226.66666666667, ans=0.09899494936611666 2023-09-28 16:12:03,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:12:03,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:12:03,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:12:11,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:12,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:12,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:12:14,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 16:12:16,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:12:17,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:12:17,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 16:12:17,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:19,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 16:12:21,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=5.34 vs. limit=15.0 2023-09-28 16:12:27,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:12:27,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:12:27,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:29,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:12:31,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:33,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 16:12:33,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 16:12:35,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:12:36,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:12:38,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:12:40,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:12:40,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 16:12:42,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 16:12:43,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 16:12:44,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:46,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:12:47,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 16:12:47,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 16:12:49,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.184e+02 2.852e+02 3.312e+02 4.404e+02 9.515e+02, threshold=6.623e+02, percent-clipped=7.0 2023-09-28 16:12:49,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:12:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:12:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:12:51,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:12:54,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:12:55,770 INFO [train.py:1039] (3/4) Epoch 3, batch 100, loss[loss=0.3203, simple_loss=0.3502, pruned_loss=0.1452, over 23576.00 frames. ], tot_loss[loss=0.3031, simple_loss=0.3484, pruned_loss=0.1289, over 1879017.99 frames. ], batch size: 256, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:12:57,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:13:01,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:04,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 16:13:04,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:13:08,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:13:08,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:08,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:13:08,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:13:08,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:13:10,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 16:13:15,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:13:15,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:16,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:16,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:13:21,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 16:13:22,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:23,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:13:23,749 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.61 vs. limit=15.0 2023-09-28 16:13:24,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:13:26,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:13:30,746 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 16:13:30,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 16:13:32,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:13:32,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:13:32,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=71626.66666666667, ans=0.0 2023-09-28 16:13:36,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:13:38,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:13:40,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:45,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:13:47,225 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 16:13:49,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:13:54,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:13:54,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:13:56,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:00,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:03,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:05,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:14:08,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:10,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:10,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:11,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:14:11,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 16:14:11,894 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 16:14:11,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:13,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:14:15,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:15,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:15,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 16:14:15,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:14:15,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:14:16,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:16,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:18,167 INFO [train.py:1039] (3/4) Epoch 3, batch 150, loss[loss=0.3213, simple_loss=0.3502, pruned_loss=0.1462, over 23797.00 frames. ], tot_loss[loss=0.3047, simple_loss=0.3485, pruned_loss=0.1305, over 2502775.36 frames. ], batch size: 164, lr: 3.17e-02, grad_scale: 32.0 2023-09-28 16:14:18,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:19,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:14:20,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:14:23,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:14:27,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:14:27,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:14:27,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:32,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=71826.66666666667, ans=0.125 2023-09-28 16:14:33,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:14:33,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:38,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:14:38,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:41,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 16:14:41,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 16:14:41,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 16:14:44,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:14:44,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:14:46,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:14:48,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:14:48,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:49,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:49,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:14:49,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=71960.0, ans=0.2 2023-09-28 16:14:50,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.32 vs. limit=15.0 2023-09-28 16:14:52,719 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 16:14:54,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:14:57,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=71960.0, ans=0.125 2023-09-28 16:14:59,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:06,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:15:07,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 16:15:11,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:15:11,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:15:11,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:13,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:15:15,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:15:16,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:15:16,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:18,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 16:15:22,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:22,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:22,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:15:23,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=72093.33333333333, ans=0.0 2023-09-28 16:15:24,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:15:25,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:15:27,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 16:15:29,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:15:31,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:15:33,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:34,797 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.108e+02 2.675e+02 3.139e+02 3.901e+02 5.670e+02, threshold=6.278e+02, percent-clipped=0.0 2023-09-28 16:15:37,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:15:37,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 16:15:37,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:15:38,555 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 16:15:42,126 INFO [train.py:1039] (3/4) Epoch 3, batch 200, loss[loss=0.2985, simple_loss=0.3402, pruned_loss=0.1284, over 23331.00 frames. ], tot_loss[loss=0.3024, simple_loss=0.3471, pruned_loss=0.1289, over 3003009.60 frames. ], batch size: 134, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:15:42,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:15:46,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:15:46,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:15:48,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=72160.0, ans=0.0 2023-09-28 16:15:49,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 16:15:51,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:15:51,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:54,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 16:15:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:15:56,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:15:57,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:02,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:16:02,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:16:02,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:03,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=72226.66666666667, ans=0.0 2023-09-28 16:16:13,060 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:16:25,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:16:26,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:16:26,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:16:26,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:16:27,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:16:27,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:16:28,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:31,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:16:31,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:31,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:16:31,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=72360.0, ans=0.125 2023-09-28 16:16:33,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 16:16:34,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 16:16:34,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:16:37,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:16:47,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:16:55,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:16:55,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:16:56,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=3.29 vs. limit=12.0 2023-09-28 16:17:00,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:02,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=72493.33333333333, ans=0.0 2023-09-28 16:17:03,673 INFO [train.py:1039] (3/4) Epoch 3, batch 250, loss[loss=0.2941, simple_loss=0.3469, pruned_loss=0.1207, over 24645.00 frames. ], tot_loss[loss=0.3018, simple_loss=0.3469, pruned_loss=0.1283, over 3392647.54 frames. ], batch size: 65, lr: 3.16e-02, grad_scale: 32.0 2023-09-28 16:17:03,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 16:17:05,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:05,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:17:05,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:06,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:17:07,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 16:17:07,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:17:08,527 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 16:17:10,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:17:13,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:13,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:17:15,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:17:16,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:17:18,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:17:19,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.70 vs. limit=15.0 2023-09-28 16:17:24,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:17:24,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=72560.0, ans=0.125 2023-09-28 16:17:34,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:36,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:17:36,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:17:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:17:42,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:17:42,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=72626.66666666667, ans=0.0 2023-09-28 16:17:44,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:17:44,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=72626.66666666667, ans=0.1 2023-09-28 16:17:45,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:47,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:17:47,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:17:48,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:17:50,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:17:52,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=72693.33333333333, ans=0.0 2023-09-28 16:17:55,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 16:17:55,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:17:57,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:17:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:17:57,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:17:57,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:00,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:18:01,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:18:04,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:05,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:18:06,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:18:11,522 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.09 vs. limit=15.0 2023-09-28 16:18:13,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:14,071 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:18:14,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.45 vs. limit=10.0 2023-09-28 16:18:16,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:18:19,837 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.635e+02 3.105e+02 3.716e+02 7.443e+02, threshold=6.210e+02, percent-clipped=1.0 2023-09-28 16:18:20,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-09-28 16:18:22,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:23,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=72760.0, ans=0.0 2023-09-28 16:18:25,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:18:26,508 INFO [train.py:1039] (3/4) Epoch 3, batch 300, loss[loss=0.3086, simple_loss=0.3497, pruned_loss=0.1338, over 23939.00 frames. ], tot_loss[loss=0.2996, simple_loss=0.3448, pruned_loss=0.1272, over 3684739.49 frames. ], batch size: 80, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:18:27,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.24 vs. limit=22.5 2023-09-28 16:18:28,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 16:18:29,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:18:29,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:18:31,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 16:18:32,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:18:32,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=72826.66666666667, ans=0.125 2023-09-28 16:18:34,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:18:34,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 16:18:37,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:18:40,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:18:42,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:18:42,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 16:18:43,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:18:45,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:18:45,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 16:18:45,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:18:47,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=72893.33333333333, ans=0.0 2023-09-28 16:18:48,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=72893.33333333333, ans=0.125 2023-09-28 16:18:50,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:18:54,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:18:54,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 16:18:59,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 16:19:00,614 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.14 vs. limit=15.0 2023-09-28 16:19:01,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:01,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=72960.0, ans=0.2 2023-09-28 16:19:03,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:07,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:07,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 16:19:07,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:19:08,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:19:10,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:19:12,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:15,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:19:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 16:19:16,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:19:18,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:20,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 16:19:20,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:24,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:19:28,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:19:28,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 16:19:32,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:32,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:19:34,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=73093.33333333333, ans=0.125 2023-09-28 16:19:36,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:36,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=73093.33333333333, ans=0.125 2023-09-28 16:19:38,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:19:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 16:19:38,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:19:38,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:19:39,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 16:19:41,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:19:42,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:44,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:19:44,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:19:44,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:49,015 INFO [train.py:1039] (3/4) Epoch 3, batch 350, loss[loss=0.2845, simple_loss=0.3393, pruned_loss=0.1149, over 24662.00 frames. ], tot_loss[loss=0.296, simple_loss=0.3418, pruned_loss=0.1251, over 3922915.86 frames. ], batch size: 65, lr: 3.15e-02, grad_scale: 32.0 2023-09-28 16:19:49,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:19:49,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 16:19:51,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:19:58,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:20:01,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:01,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:04,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 16:20:06,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:06,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 16:20:10,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:11,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 16:20:13,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:14,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 16:20:16,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:20:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:20:19,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:20:21,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:21,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:21,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:22,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:20:24,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:20:24,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:31,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:20:31,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:20:32,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:20:34,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:40,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 16:20:40,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:20:45,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:20:45,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:20:45,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:20:47,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 16:20:51,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:20:52,707 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 16:20:52,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 16:20:52,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:20:57,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:20:57,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 16:21:00,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:02,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:21:03,443 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.805e+02 2.765e+02 3.239e+02 3.985e+02 6.243e+02, threshold=6.477e+02, percent-clipped=2.0 2023-09-28 16:21:03,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:05,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:05,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:08,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:21:10,296 INFO [train.py:1039] (3/4) Epoch 3, batch 400, loss[loss=0.3106, simple_loss=0.3475, pruned_loss=0.1368, over 23419.00 frames. ], tot_loss[loss=0.2944, simple_loss=0.3407, pruned_loss=0.1241, over 4113364.29 frames. ], batch size: 285, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:21:10,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:21:10,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=73493.33333333333, ans=0.1 2023-09-28 16:21:13,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:21:15,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 16:21:15,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:15,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:17,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:21:18,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:20,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:23,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:23,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 16:21:25,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 16:21:25,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:26,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 16:21:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:21:30,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:30,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 16:21:30,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:21:30,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:21:30,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:21:31,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:21:33,277 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 16:21:34,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 16:21:35,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.97 vs. limit=15.0 2023-09-28 16:21:35,470 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.28 vs. limit=22.5 2023-09-28 16:21:39,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:21:41,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:21:42,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 16:21:43,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 16:21:46,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:21:49,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:21:56,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=73626.66666666667, ans=0.0 2023-09-28 16:21:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 16:21:59,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=73626.66666666667, ans=0.07 2023-09-28 16:22:00,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:22:03,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 16:22:06,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:22:07,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:22:07,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 16:22:11,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:22:11,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=73693.33333333333, ans=0.0 2023-09-28 16:22:14,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:22:15,572 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-09-28 16:22:16,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:22:19,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:19,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 16:22:19,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:22:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 16:22:22,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:22:22,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:22:25,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 16:22:26,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:22:26,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:22:28,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:22:30,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 16:22:30,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:22:32,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:22:32,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:22:32,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 16:22:32,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:22:33,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:22:35,085 INFO [train.py:1039] (3/4) Epoch 3, batch 450, loss[loss=0.2729, simple_loss=0.3297, pruned_loss=0.108, over 24467.00 frames. ], tot_loss[loss=0.2949, simple_loss=0.3408, pruned_loss=0.1245, over 4247672.32 frames. ], batch size: 66, lr: 3.14e-02, grad_scale: 32.0 2023-09-28 16:22:36,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:22:43,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=73826.66666666667, ans=0.125 2023-09-28 16:22:46,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:46,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:22:48,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 16:22:49,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 16:22:52,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:22:56,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:22:59,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:05,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:06,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:23:08,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=73960.0, ans=0.1 2023-09-28 16:23:09,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 16:23:09,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 16:23:11,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 16:23:11,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:11,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=73960.0, ans=0.125 2023-09-28 16:23:13,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:14,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:23:16,045 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 16:23:16,060 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 16:23:16,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:23:16,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=73960.0, ans=0.125 2023-09-28 16:23:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:23:19,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:23:22,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:23:22,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:23:24,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:23:24,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 16:23:24,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=74026.66666666667, ans=0.125 2023-09-28 16:23:26,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:29,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:23:29,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:23:32,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 16:23:36,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:23:38,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 16:23:38,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=74026.66666666667, ans=0.2 2023-09-28 16:23:40,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 16:23:41,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:23:46,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:23:47,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:23:49,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:23:51,042 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 16:23:52,499 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.112e+02 2.606e+02 2.993e+02 3.540e+02 4.868e+02, threshold=5.986e+02, percent-clipped=0.0 2023-09-28 16:23:53,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=74093.33333333333, ans=0.0 2023-09-28 16:23:54,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:23:54,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:23:54,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:23:54,791 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 16:23:57,544 INFO [train.py:1039] (3/4) Epoch 3, batch 500, loss[loss=0.2677, simple_loss=0.3148, pruned_loss=0.1103, over 24433.00 frames. ], tot_loss[loss=0.2942, simple_loss=0.3407, pruned_loss=0.1238, over 4358936.99 frames. ], batch size: 58, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:23:57,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 16:23:57,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:00,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:24:05,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 16:24:07,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:24:10,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:24:10,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:24:11,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:21,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:22,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:24:22,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:24:22,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:24,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 16:24:24,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:24:27,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:24:28,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:24:28,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:24:30,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:24:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 16:24:35,589 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 16:24:37,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:24:38,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:38,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:40,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:24:44,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 16:24:47,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:24:48,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:24:53,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:24:54,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:24:56,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=74360.0, ans=0.1 2023-09-28 16:24:59,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:04,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 16:25:04,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:04,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:25:10,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 16:25:10,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:25:12,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:16,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 16:25:18,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 16:25:18,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:18,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 16:25:20,333 INFO [train.py:1039] (3/4) Epoch 3, batch 550, loss[loss=0.2628, simple_loss=0.3315, pruned_loss=0.09707, over 24481.00 frames. ], tot_loss[loss=0.2966, simple_loss=0.3428, pruned_loss=0.1252, over 4435025.01 frames. ], batch size: 66, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:25:20,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:25:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:25:22,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:23,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:25:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:25:28,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:25:30,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 16:25:30,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:25:34,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:36,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:39,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:25:40,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:25:44,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 16:25:44,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 16:25:47,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:25:52,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:25:52,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:25:54,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:25:58,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:25:58,232 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 16:26:00,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:26:01,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:26:05,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:26:06,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:26:06,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:26:08,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:09,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 16:26:11,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 16:26:12,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:12,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:26:14,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:26:14,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:26:14,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=74693.33333333333, ans=0.125 2023-09-28 16:26:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:26:18,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:26:22,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:26:22,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:24,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 16:26:24,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:26:27,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:28,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:26:28,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=74760.0, ans=0.125 2023-09-28 16:26:30,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:26:32,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:26:32,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 16:26:32,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.41 vs. limit=15.0 2023-09-28 16:26:36,871 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.890e+02 2.622e+02 3.187e+02 4.101e+02 6.995e+02, threshold=6.373e+02, percent-clipped=4.0 2023-09-28 16:26:39,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 16:26:41,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 16:26:42,505 INFO [train.py:1039] (3/4) Epoch 3, batch 600, loss[loss=0.316, simple_loss=0.3484, pruned_loss=0.1418, over 22725.00 frames. ], tot_loss[loss=0.2964, simple_loss=0.3431, pruned_loss=0.1248, over 4506424.35 frames. ], batch size: 322, lr: 3.13e-02, grad_scale: 16.0 2023-09-28 16:26:42,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:26:44,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:26:44,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:26:50,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:26:51,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:26:53,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 16:26:56,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:26:58,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:26:58,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 16:27:02,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:27:08,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=74893.33333333333, ans=0.1 2023-09-28 16:27:08,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=74893.33333333333, ans=0.0 2023-09-28 16:27:11,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 16:27:14,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:27:14,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:14,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:27:19,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:27:21,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:27:21,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:24,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=74960.0, ans=0.125 2023-09-28 16:27:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:27:33,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:27:33,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:27:33,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:27:42,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 16:27:48,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:27:48,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:27:52,513 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.83 vs. limit=15.0 2023-09-28 16:27:53,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 16:27:53,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:27:55,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=24.38 vs. limit=22.5 2023-09-28 16:27:56,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 16:27:56,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:27:58,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:28:03,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 16:28:04,669 INFO [train.py:1039] (3/4) Epoch 3, batch 650, loss[loss=0.2939, simple_loss=0.317, pruned_loss=0.1354, over 23570.00 frames. ], tot_loss[loss=0.2953, simple_loss=0.3413, pruned_loss=0.1247, over 4545473.65 frames. ], batch size: 256, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:28:04,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:28:06,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:09,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:28:11,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:12,364 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.18 vs. limit=15.0 2023-09-28 16:28:13,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 16:28:15,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:28:18,707 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.85 vs. limit=6.0 2023-09-28 16:28:20,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:28:20,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:24,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:27,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 16:28:30,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:28:30,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=75226.66666666667, ans=0.1 2023-09-28 16:28:32,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:28:35,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:28:36,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:28:38,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:38,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=75293.33333333333, ans=0.2 2023-09-28 16:28:39,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:39,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:28:41,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:43,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:28:45,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:28:45,341 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 16:28:45,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:28:45,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:28:50,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:28:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:53,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:28:53,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:28:54,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=75360.0, ans=0.125 2023-09-28 16:28:55,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 16:28:56,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:28:56,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:28:58,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:28:58,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:28:59,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:29:01,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 16:29:03,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 16:29:03,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:03,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:29:03,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:29:04,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:29:04,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:29:11,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:11,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:11,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=75426.66666666667, ans=0.0 2023-09-28 16:29:12,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:29:15,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:29:15,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:29:23,200 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.199e+02 2.685e+02 3.190e+02 3.569e+02 4.758e+02, threshold=6.380e+02, percent-clipped=0.0 2023-09-28 16:29:23,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:29:23,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:24,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:24,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:29:27,709 INFO [train.py:1039] (3/4) Epoch 3, batch 700, loss[loss=0.2797, simple_loss=0.3199, pruned_loss=0.1198, over 23761.00 frames. ], tot_loss[loss=0.293, simple_loss=0.3396, pruned_loss=0.1232, over 4594387.83 frames. ], batch size: 212, lr: 3.12e-02, grad_scale: 16.0 2023-09-28 16:29:29,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 16:29:31,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 16:29:34,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 16:29:34,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:36,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:29:36,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=75493.33333333333, ans=0.1 2023-09-28 16:29:39,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 16:29:43,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.53 vs. limit=15.0 2023-09-28 16:29:44,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:29:46,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:29:47,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:49,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:29:50,045 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=23.21 vs. limit=22.5 2023-09-28 16:29:50,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:29:54,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:29:56,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:29:57,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:29:59,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 16:30:03,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 16:30:06,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:30:06,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:30:08,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:30:11,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:30:11,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 16:30:16,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:17,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:30:17,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 16:30:22,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:30:22,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=75693.33333333333, ans=0.0 2023-09-28 16:30:24,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:30:27,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:30:31,553 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.96 vs. limit=10.0 2023-09-28 16:30:33,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:30:34,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 16:30:36,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 16:30:36,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 16:30:40,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:42,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:30:44,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:30:44,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:44,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 16:30:49,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=75826.66666666667, ans=0.0 2023-09-28 16:30:50,479 INFO [train.py:1039] (3/4) Epoch 3, batch 750, loss[loss=0.2953, simple_loss=0.3325, pruned_loss=0.1291, over 23454.00 frames. ], tot_loss[loss=0.2931, simple_loss=0.3392, pruned_loss=0.1235, over 4620648.91 frames. ], batch size: 285, lr: 3.11e-02, grad_scale: 16.0 2023-09-28 16:30:50,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 16:30:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 16:30:50,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 16:30:52,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 16:30:52,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 16:30:53,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:30:55,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 16:30:55,522 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:30:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:30:58,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:30:58,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=75826.66666666667, ans=0.125 2023-09-28 16:31:00,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:01,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:01,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:31:02,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:05,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:31:07,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:31:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:31:12,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:12,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:31:14,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 16:31:15,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:31:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:16,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=75893.33333333333, ans=0.0 2023-09-28 16:31:16,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=75893.33333333333, ans=0.125 2023-09-28 16:31:19,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:31:20,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:31:21,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=75893.33333333333, ans=0.125 2023-09-28 16:31:22,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 16:31:22,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:31:24,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=15.0 2023-09-28 16:31:26,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 16:31:26,652 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 16:31:28,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 16:31:28,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:31:28,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 16:31:29,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=75960.0, ans=0.2 2023-09-28 16:31:31,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:31:33,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=75960.0, ans=0.125 2023-09-28 16:31:38,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:31:38,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:38,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:31:40,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=76026.66666666667, ans=0.0 2023-09-28 16:31:41,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:31:42,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:31:44,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 16:31:45,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:31:47,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 16:31:49,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:31:52,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:31:53,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 16:31:53,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:31:56,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:31:58,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:31:59,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:01,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:32:04,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 16:32:04,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:06,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:06,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=76093.33333333333, ans=0.0 2023-09-28 16:32:07,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.054e+02 2.659e+02 2.971e+02 3.538e+02 5.180e+02, threshold=5.942e+02, percent-clipped=0.0 2023-09-28 16:32:07,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:07,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:10,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:10,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:32:12,844 INFO [train.py:1039] (3/4) Epoch 3, batch 800, loss[loss=0.3159, simple_loss=0.3501, pruned_loss=0.1408, over 23792.00 frames. ], tot_loss[loss=0.293, simple_loss=0.3392, pruned_loss=0.1234, over 4650900.63 frames. ], batch size: 212, lr: 3.11e-02, grad_scale: 32.0 2023-09-28 16:32:23,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:32:23,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:25,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:32:25,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:26,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:26,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:28,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:29,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=76226.66666666667, ans=0.125 2023-09-28 16:32:31,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:33,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:32:36,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 16:32:37,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:32:39,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:32:39,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 16:32:41,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:41,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 16:32:45,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:32:48,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:32:50,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:32:50,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:32:56,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:56,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:32:58,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=76293.33333333333, ans=0.0 2023-09-28 16:33:00,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:00,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:33:00,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 16:33:01,388 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.26 vs. limit=12.0 2023-09-28 16:33:02,926 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 16:33:02,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 16:33:04,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:33:04,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:06,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:06,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:06,947 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:33:12,680 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 16:33:12,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 16:33:14,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:33:14,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=76360.0, ans=0.1 2023-09-28 16:33:15,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:33:20,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:33:23,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:33:25,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 16:33:25,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:33:30,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 16:33:33,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:35,172 INFO [train.py:1039] (3/4) Epoch 3, batch 850, loss[loss=0.2962, simple_loss=0.3344, pruned_loss=0.129, over 23297.00 frames. ], tot_loss[loss=0.2934, simple_loss=0.3393, pruned_loss=0.1238, over 4667808.61 frames. ], batch size: 105, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:33:37,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:33:38,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 16:33:38,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:33:40,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:42,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 16:33:43,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:43,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:33:45,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:46,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:33:48,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:33:50,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 16:33:50,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 16:33:50,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 16:33:51,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:33:53,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:33:54,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:33:54,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:33:54,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:33:58,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:33:58,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:00,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 16:34:04,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 16:34:05,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:34:09,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 16:34:12,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 16:34:13,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 16:34:15,954 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 16:34:15,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:15,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:34:15,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 16:34:18,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:20,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 16:34:23,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:34:25,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:25,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:34:25,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:34:28,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:34:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 16:34:29,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 16:34:35,306 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.06 vs. limit=15.0 2023-09-28 16:34:35,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:34:35,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:35,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:34:35,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:34:37,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:34:40,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:34:43,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:34:44,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:34:46,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:34:47,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:34:47,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=76760.0, ans=0.125 2023-09-28 16:34:53,429 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.514e+02 2.970e+02 3.562e+02 5.095e+02, threshold=5.941e+02, percent-clipped=0.0 2023-09-28 16:34:55,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:34:57,913 INFO [train.py:1039] (3/4) Epoch 3, batch 900, loss[loss=0.2757, simple_loss=0.338, pruned_loss=0.1067, over 24539.00 frames. ], tot_loss[loss=0.2938, simple_loss=0.3398, pruned_loss=0.1239, over 4693405.72 frames. ], batch size: 71, lr: 3.10e-02, grad_scale: 32.0 2023-09-28 16:34:57,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:34:58,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 16:34:58,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:34:59,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:35:01,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 16:35:03,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.14 vs. limit=6.0 2023-09-28 16:35:07,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:35:10,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:12,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 16:35:17,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:35:17,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 16:35:18,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 16:35:19,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:35:19,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:19,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 16:35:19,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:35:31,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:35:31,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:35:31,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:35:34,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:35:39,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 16:35:42,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:35:45,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=76960.0, ans=0.125 2023-09-28 16:35:46,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:35:46,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:35:48,189 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 16:35:48,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 16:35:55,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:35:55,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:35:55,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:36:01,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:01,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:03,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 16:36:03,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:36:03,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=77026.66666666667, ans=0.125 2023-09-28 16:36:07,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 16:36:09,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:36:09,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:11,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:36:11,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:16,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 16:36:16,491 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 16:36:19,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:36:19,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 16:36:19,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=77093.33333333333, ans=0.125 2023-09-28 16:36:22,406 INFO [train.py:1039] (3/4) Epoch 3, batch 950, loss[loss=0.2966, simple_loss=0.3248, pruned_loss=0.1342, over 23859.00 frames. ], tot_loss[loss=0.2956, simple_loss=0.3411, pruned_loss=0.1251, over 4696206.38 frames. ], batch size: 212, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:36:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:27,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 16:36:31,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:32,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:33,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:35,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:36:36,683 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 16:36:39,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:36:41,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:36:42,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:36:42,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:36:42,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 16:36:44,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:36:47,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:47,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 16:36:49,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:36:52,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:36:52,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:36:54,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 16:36:57,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 16:37:00,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:37:03,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:37:03,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=77293.33333333333, ans=0.125 2023-09-28 16:37:07,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:07,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:37:08,231 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.54 vs. limit=15.0 2023-09-28 16:37:10,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 16:37:15,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:37:15,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:37:15,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:15,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:15,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:37:17,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=77360.0, ans=0.0 2023-09-28 16:37:20,361 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.65 vs. limit=12.0 2023-09-28 16:37:21,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 16:37:21,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:37:25,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:37:26,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:26,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 16:37:26,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:26,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:37:26,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 16:37:33,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:37:35,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:37:40,046 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.993e+02 2.741e+02 3.253e+02 3.972e+02 7.741e+02, threshold=6.506e+02, percent-clipped=1.0 2023-09-28 16:37:40,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:37:43,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 16:37:43,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 16:37:45,294 INFO [train.py:1039] (3/4) Epoch 3, batch 1000, loss[loss=0.2934, simple_loss=0.3565, pruned_loss=0.1151, over 24494.00 frames. ], tot_loss[loss=0.2945, simple_loss=0.3404, pruned_loss=0.1243, over 4715192.91 frames. ], batch size: 69, lr: 3.09e-02, grad_scale: 32.0 2023-09-28 16:37:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:37:50,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 16:37:50,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:37:53,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:37:56,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 16:37:56,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 16:38:01,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:02,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:38:02,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:04,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=77560.0, ans=0.125 2023-09-28 16:38:07,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 16:38:12,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 16:38:13,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 16:38:13,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:15,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 16:38:18,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 16:38:19,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 16:38:20,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:20,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:28,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:29,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:38:29,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:29,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:38:29,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 16:38:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:38:32,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:38:33,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:38:33,606 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 16:38:36,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 16:38:38,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 16:38:39,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 16:38:43,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:38:43,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=77693.33333333333, ans=0.0 2023-09-28 16:38:50,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:50,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:38:50,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:38:51,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:38:52,725 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.87 vs. limit=22.5 2023-09-28 16:38:53,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 16:38:54,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:38:54,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 16:38:55,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 16:38:57,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:38:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:39:00,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:39:02,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:39:04,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:08,074 INFO [train.py:1039] (3/4) Epoch 3, batch 1050, loss[loss=0.2739, simple_loss=0.3077, pruned_loss=0.1201, over 23776.00 frames. ], tot_loss[loss=0.2925, simple_loss=0.3386, pruned_loss=0.1232, over 4717957.90 frames. ], batch size: 232, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:39:09,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:39:09,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:39:11,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:39:13,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:14,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:16,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 16:39:18,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=77826.66666666667, ans=0.1 2023-09-28 16:39:19,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 16:39:22,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:39:22,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:39:22,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:39:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:39:25,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 16:39:26,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:26,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 16:39:27,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=77893.33333333333, ans=0.1 2023-09-28 16:39:29,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:39:29,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 16:39:29,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:39:35,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:39:36,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:39:36,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:39:40,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 16:39:40,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 16:39:40,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:39:45,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 16:39:47,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 16:39:48,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:39:52,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 16:39:53,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:39:53,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:39:55,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:39:58,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:40:03,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 16:40:03,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 16:40:03,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 16:40:05,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:05,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:40:08,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 16:40:14,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:40:14,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.11 vs. limit=22.5 2023-09-28 16:40:15,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:40:15,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:15,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:15,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:16,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=78093.33333333333, ans=0.0 2023-09-28 16:40:19,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:40:19,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 16:40:20,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 16:40:20,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 16:40:22,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 16:40:22,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:40:24,888 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=17.85 vs. limit=15.0 2023-09-28 16:40:25,633 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.910e+02 2.729e+02 3.108e+02 3.500e+02 5.269e+02, threshold=6.215e+02, percent-clipped=0.0 2023-09-28 16:40:25,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:40:31,232 INFO [train.py:1039] (3/4) Epoch 3, batch 1100, loss[loss=0.2636, simple_loss=0.3177, pruned_loss=0.1047, over 24637.00 frames. ], tot_loss[loss=0.2927, simple_loss=0.3382, pruned_loss=0.1236, over 4710124.58 frames. ], batch size: 60, lr: 3.08e-02, grad_scale: 32.0 2023-09-28 16:40:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:40:36,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=78160.0, ans=0.04949747468305833 2023-09-28 16:40:37,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:40:38,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:40:38,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:40,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 16:40:41,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:40:45,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 16:40:49,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:40:50,109 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.17 vs. limit=22.5 2023-09-28 16:40:53,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:40:54,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 16:40:55,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:40:57,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:40:57,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:40:59,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:41:01,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 16:41:02,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=78293.33333333333, ans=0.125 2023-09-28 16:41:03,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.63 vs. limit=15.0 2023-09-28 16:41:05,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:41:07,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=78293.33333333333, ans=0.125 2023-09-28 16:41:09,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 16:41:09,293 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 16:41:10,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:12,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:13,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:41:13,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:41:16,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 16:41:16,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:41:16,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:41:16,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:41:17,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:19,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 16:41:23,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:41:23,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 16:41:27,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:41:32,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:41:35,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 16:41:36,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 16:41:38,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:41:40,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:41:40,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:42,631 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:41:43,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 16:41:43,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:41:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:41:45,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 16:41:45,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:41:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 16:41:47,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:41:47,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:41:48,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=78426.66666666667, ans=0.0 2023-09-28 16:41:48,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=78426.66666666667, ans=0.1 2023-09-28 16:41:49,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:41:53,550 INFO [train.py:1039] (3/4) Epoch 3, batch 1150, loss[loss=0.3345, simple_loss=0.3583, pruned_loss=0.1553, over 22791.00 frames. ], tot_loss[loss=0.2939, simple_loss=0.339, pruned_loss=0.1244, over 4712721.74 frames. ], batch size: 322, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:41:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:41:58,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:42:00,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:00,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:42:01,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 16:42:01,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:03,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=78493.33333333333, ans=0.125 2023-09-28 16:42:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 16:42:05,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:05,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:42:13,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 16:42:16,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:19,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:42:21,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:21,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 16:42:21,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:42:21,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:42:24,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 16:42:26,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:42:28,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:42:31,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=78626.66666666667, ans=0.125 2023-09-28 16:42:38,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:42:46,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 16:42:48,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:48,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:42:48,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=78693.33333333333, ans=0.2 2023-09-28 16:42:54,778 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 16:42:55,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.56 vs. limit=15.0 2023-09-28 16:42:56,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:01,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=78760.0, ans=0.125 2023-09-28 16:43:04,624 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 16:43:07,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:09,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:43:09,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:43:11,194 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.090e+02 2.632e+02 2.933e+02 3.650e+02 8.073e+02, threshold=5.867e+02, percent-clipped=1.0 2023-09-28 16:43:11,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:43:14,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:15,750 INFO [train.py:1039] (3/4) Epoch 3, batch 1200, loss[loss=0.2775, simple_loss=0.3157, pruned_loss=0.1197, over 23686.00 frames. ], tot_loss[loss=0.2946, simple_loss=0.34, pruned_loss=0.1246, over 4713420.82 frames. ], batch size: 149, lr: 3.07e-02, grad_scale: 32.0 2023-09-28 16:43:21,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:43:21,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:43:21,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=78826.66666666667, ans=0.125 2023-09-28 16:43:22,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:22,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:43:25,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:43:27,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:43:29,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:43:29,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:43:32,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 16:43:34,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=78893.33333333333, ans=0.2 2023-09-28 16:43:36,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 16:43:39,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:43:42,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:43:44,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=78893.33333333333, ans=0.1 2023-09-28 16:43:45,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:43:45,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:43:45,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 16:43:46,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:43:55,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 16:43:55,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:43:55,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 16:43:57,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:44:02,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 16:44:05,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 16:44:05,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:44:07,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:44:08,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:08,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:44:11,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:44:11,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:44:12,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:44:13,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 16:44:13,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:44:13,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:44:18,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:18,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:44:23,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:44:25,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:44:26,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 16:44:30,612 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 16:44:32,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:44:35,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:44:35,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=79093.33333333333, ans=10.0 2023-09-28 16:44:36,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:44:38,703 INFO [train.py:1039] (3/4) Epoch 3, batch 1250, loss[loss=0.2386, simple_loss=0.2931, pruned_loss=0.09202, over 24313.00 frames. ], tot_loss[loss=0.2959, simple_loss=0.3409, pruned_loss=0.1255, over 4701209.77 frames. ], batch size: 56, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:44:38,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:44:41,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 16:44:45,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:44:47,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:47,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 16:44:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:44:50,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:44:55,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 16:44:55,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:44:57,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:44:57,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:02,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 16:45:04,743 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.06 vs. limit=22.5 2023-09-28 16:45:07,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 16:45:07,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:45:07,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:08,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:45:08,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:12,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:13,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:45:20,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 16:45:20,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:45:25,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:26,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 16:45:26,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:45:26,845 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 16:45:26,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:26,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:30,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:45:35,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:45:37,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 16:45:37,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 16:45:37,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 16:45:41,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:45:43,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 16:45:43,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:45:47,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:45:47,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:45:50,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 16:45:51,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 16:45:52,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:45:52,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 16:45:53,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:45:55,646 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.589e+02 2.905e+02 3.561e+02 6.488e+02, threshold=5.810e+02, percent-clipped=2.0 2023-09-28 16:45:55,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 16:45:58,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:45:58,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:45:59,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:46:01,423 INFO [train.py:1039] (3/4) Epoch 3, batch 1300, loss[loss=0.2555, simple_loss=0.3228, pruned_loss=0.09411, over 24641.00 frames. ], tot_loss[loss=0.2943, simple_loss=0.3402, pruned_loss=0.1242, over 4716105.82 frames. ], batch size: 73, lr: 3.06e-02, grad_scale: 32.0 2023-09-28 16:46:02,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 16:46:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:46:06,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 16:46:12,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:14,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 16:46:14,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:16,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=79560.0, ans=0.125 2023-09-28 16:46:17,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:46:18,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:46:18,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 16:46:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:46:24,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:46:25,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 16:46:30,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:46:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:34,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:35,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:46:35,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:46:37,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:46:37,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 16:46:38,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 16:46:45,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:46:45,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 16:46:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 16:46:47,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 16:46:50,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:46:53,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:46:53,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 16:46:54,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:54,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 16:46:56,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:46:59,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:46:59,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:47:03,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 16:47:04,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 16:47:04,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 16:47:09,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:47:11,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 16:47:14,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=79760.0, ans=0.125 2023-09-28 16:47:15,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:22,592 INFO [train.py:1039] (3/4) Epoch 3, batch 1350, loss[loss=0.2464, simple_loss=0.3082, pruned_loss=0.09227, over 24556.00 frames. ], tot_loss[loss=0.2937, simple_loss=0.3396, pruned_loss=0.1239, over 4714688.94 frames. ], batch size: 60, lr: 3.05e-02, grad_scale: 32.0 2023-09-28 16:47:22,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 16:47:28,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:28,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:32,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:47:32,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:47:33,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=79826.66666666667, ans=0.125 2023-09-28 16:47:36,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:47:36,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:47:40,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 16:47:44,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:47:45,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:47:49,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 16:47:50,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:47:51,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:47:51,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 16:47:52,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 16:47:55,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 16:47:57,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:47:57,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 16:48:11,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:14,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=80026.66666666667, ans=0.0 2023-09-28 16:48:20,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:48:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:22,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 16:48:25,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:26,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=80026.66666666667, ans=0.04949747468305833 2023-09-28 16:48:27,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 16:48:27,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 16:48:28,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:48:30,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:48:33,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 16:48:35,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:48:35,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=80093.33333333333, ans=0.2 2023-09-28 16:48:41,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 16:48:42,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 16:48:44,357 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.017e+02 2.667e+02 3.027e+02 3.668e+02 6.120e+02, threshold=6.055e+02, percent-clipped=2.0 2023-09-28 16:48:44,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=80093.33333333333, ans=0.125 2023-09-28 16:48:48,462 INFO [train.py:1039] (3/4) Epoch 3, batch 1400, loss[loss=0.2933, simple_loss=0.3371, pruned_loss=0.1248, over 23368.00 frames. ], tot_loss[loss=0.2923, simple_loss=0.3381, pruned_loss=0.1233, over 4703021.32 frames. ], batch size: 93, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:48:48,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 16:48:48,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=80160.0, ans=0.2 2023-09-28 16:48:50,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:48:54,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=80160.0, ans=0.0 2023-09-28 16:48:55,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:48:55,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:49:00,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 16:49:02,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 16:49:05,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=80226.66666666667, ans=0.125 2023-09-28 16:49:08,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=80226.66666666667, ans=0.125 2023-09-28 16:49:11,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:49:12,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=9.83 vs. limit=10.0 2023-09-28 16:49:12,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:14,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:49:14,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 16:49:18,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:49:19,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 16:49:24,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=22.5 2023-09-28 16:49:30,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:30,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:34,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 16:49:34,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:49:34,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:49:36,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:49:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:49:39,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:49:39,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:49:39,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:49:40,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.60 vs. limit=22.5 2023-09-28 16:49:40,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 16:49:40,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:49:46,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:49:52,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:50:00,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 16:50:02,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 16:50:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:50:06,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 16:50:07,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.14 vs. limit=10.0 2023-09-28 16:50:07,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:07,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:50:10,877 INFO [train.py:1039] (3/4) Epoch 3, batch 1450, loss[loss=0.2688, simple_loss=0.3336, pruned_loss=0.102, over 24437.00 frames. ], tot_loss[loss=0.2919, simple_loss=0.337, pruned_loss=0.1234, over 4696049.70 frames. ], batch size: 69, lr: 3.05e-02, grad_scale: 16.0 2023-09-28 16:50:12,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:50:12,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:50:12,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:14,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 16:50:18,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:19,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 16:50:20,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:50:20,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 16:50:22,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 16:50:23,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 16:50:23,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 16:50:27,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:27,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 16:50:28,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 16:50:28,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:28,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:50:31,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:33,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:37,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=80560.0, ans=0.0 2023-09-28 16:50:38,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:50:38,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:50:40,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:50:40,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:43,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:50:43,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:50:43,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:50:45,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:50:48,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 16:50:51,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:50:53,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=80626.66666666667, ans=0.125 2023-09-28 16:50:54,923 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 16:50:56,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:50:56,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:50:57,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:01,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 16:51:05,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:07,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 16:51:09,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 16:51:11,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:11,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=80693.33333333333, ans=0.1 2023-09-28 16:51:14,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:14,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:51:15,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 16:51:18,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 16:51:18,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 16:51:18,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=80760.0, ans=0.0 2023-09-28 16:51:19,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:21,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 16:51:28,713 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.864e+02 2.628e+02 3.276e+02 3.890e+02 6.376e+02, threshold=6.552e+02, percent-clipped=1.0 2023-09-28 16:51:32,269 INFO [train.py:1039] (3/4) Epoch 3, batch 1500, loss[loss=0.3075, simple_loss=0.3435, pruned_loss=0.1357, over 23496.00 frames. ], tot_loss[loss=0.2923, simple_loss=0.3379, pruned_loss=0.1233, over 4702681.31 frames. ], batch size: 285, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:51:35,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 16:51:35,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 16:51:35,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:51:35,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:51:37,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:39,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 16:51:39,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 16:51:43,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 16:51:43,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:51:43,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:51:44,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:51:46,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:51:46,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:51:53,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 16:51:53,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:51:54,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:51:54,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:51:58,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 16:52:02,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 16:52:04,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:52:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 16:52:07,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 16:52:12,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:12,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:52:13,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.51 vs. limit=15.0 2023-09-28 16:52:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:52:15,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 16:52:15,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:52:15,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:15,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 16:52:16,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:52:19,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=80960.0, ans=0.1 2023-09-28 16:52:22,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 16:52:22,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 16:52:28,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 16:52:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 16:52:35,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=81026.66666666667, ans=0.125 2023-09-28 16:52:36,390 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 16:52:37,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:37,840 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 16:52:39,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:52:42,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:52:44,352 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 16:52:44,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 16:52:47,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 16:52:49,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:52,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:52,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:52:54,272 INFO [train.py:1039] (3/4) Epoch 3, batch 1550, loss[loss=0.2782, simple_loss=0.3262, pruned_loss=0.1151, over 23256.00 frames. ], tot_loss[loss=0.2922, simple_loss=0.3383, pruned_loss=0.1231, over 4716436.85 frames. ], batch size: 105, lr: 3.04e-02, grad_scale: 16.0 2023-09-28 16:52:54,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:52:54,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 16:52:57,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 16:52:57,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 16:52:57,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:52:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 16:53:00,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 16:53:03,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:05,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=17.82 vs. limit=22.5 2023-09-28 16:53:06,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:06,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:53:07,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:08,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-09-28 16:53:09,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:53:12,050 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 16:53:12,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 16:53:13,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 16:53:16,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 16:53:16,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 16:53:16,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=81226.66666666667, ans=0.125 2023-09-28 16:53:18,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:53:18,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 16:53:20,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 16:53:20,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 16:53:21,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:23,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:25,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:53:29,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 16:53:29,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 16:53:38,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:42,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:53:42,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 16:53:42,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:53:42,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=81360.0, ans=0.1 2023-09-28 16:53:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 16:53:47,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=81360.0, ans=0.0 2023-09-28 16:53:48,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 16:53:50,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:53:51,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=81360.0, ans=0.1 2023-09-28 16:53:53,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:53:55,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:53:55,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:53:55,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 16:53:55,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:53:55,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=81360.0, ans=0.95 2023-09-28 16:54:00,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:54:00,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:00,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 16:54:00,582 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 16:54:03,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:08,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 16:54:13,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.933e+02 2.706e+02 3.074e+02 3.869e+02 6.821e+02, threshold=6.147e+02, percent-clipped=1.0 2023-09-28 16:54:14,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:15,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:54:16,852 INFO [train.py:1039] (3/4) Epoch 3, batch 1600, loss[loss=0.3044, simple_loss=0.3534, pruned_loss=0.1277, over 23995.00 frames. ], tot_loss[loss=0.2932, simple_loss=0.3391, pruned_loss=0.1237, over 4698961.29 frames. ], batch size: 80, lr: 3.03e-02, grad_scale: 32.0 2023-09-28 16:54:16,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 16:54:18,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:54:19,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:54:19,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:54:20,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:54:21,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:54:24,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:24,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 16:54:26,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 16:54:28,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 16:54:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:54:32,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 16:54:33,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:54:36,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:54:36,732 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:54:39,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=81560.0, ans=0.125 2023-09-28 16:54:41,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:54:44,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 16:54:47,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:54:48,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 16:54:49,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:54:49,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 16:54:54,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 16:55:03,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:03,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 16:55:05,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:55:05,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:05,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:55:08,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 16:55:11,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 16:55:12,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:55:13,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:14,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 16:55:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 16:55:18,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 16:55:18,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=81693.33333333333, ans=0.0 2023-09-28 16:55:21,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 16:55:25,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=81760.0, ans=0.07 2023-09-28 16:55:27,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:29,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:55:29,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=81760.0, ans=0.1 2023-09-28 16:55:30,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 16:55:30,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:55:33,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 16:55:37,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:39,155 INFO [train.py:1039] (3/4) Epoch 3, batch 1650, loss[loss=0.2768, simple_loss=0.3407, pruned_loss=0.1064, over 24537.00 frames. ], tot_loss[loss=0.2933, simple_loss=0.3399, pruned_loss=0.1234, over 4708207.02 frames. ], batch size: 71, lr: 3.03e-02, grad_scale: 16.0 2023-09-28 16:55:40,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:55:42,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.14 vs. limit=10.0 2023-09-28 16:55:43,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:55:43,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 16:55:43,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 16:55:43,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 16:55:43,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 16:55:46,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:55:48,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:48,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:55:48,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 16:55:51,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:55:53,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 16:55:55,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:55:55,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:55:55,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:55:55,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 16:55:56,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 16:55:58,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 16:56:02,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 16:56:03,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=81893.33333333333, ans=0.1 2023-09-28 16:56:03,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=81893.33333333333, ans=0.0 2023-09-28 16:56:04,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 16:56:14,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 16:56:16,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:17,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 16:56:21,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:24,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:56:24,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:56:24,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:26,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 16:56:26,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:30,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:56:30,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:31,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:31,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:32,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:33,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 16:56:37,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 16:56:39,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 16:56:41,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:56:42,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 16:56:42,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 16:56:42,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 16:56:42,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:56:44,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:56:46,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:48,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:56:48,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 16:56:51,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:56:52,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:56:53,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:56:56,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 16:56:59,514 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.428e+02 2.816e+02 3.293e+02 5.315e+02, threshold=5.632e+02, percent-clipped=0.0 2023-09-28 16:57:01,803 INFO [train.py:1039] (3/4) Epoch 3, batch 1700, loss[loss=0.3171, simple_loss=0.372, pruned_loss=0.1311, over 24362.00 frames. ], tot_loss[loss=0.2915, simple_loss=0.3386, pruned_loss=0.1222, over 4703802.35 frames. ], batch size: 77, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:57:01,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:57:01,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:57:01,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 16:57:02,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:02,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:57:02,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:02,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=82160.0, ans=0.0 2023-09-28 16:57:06,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 16:57:06,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 16:57:06,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 16:57:08,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 16:57:10,314 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:57:16,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:57:20,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 16:57:27,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:57:27,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:57:27,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 16:57:28,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:57:32,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 16:57:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 16:57:35,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 16:57:37,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 16:57:39,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 16:57:40,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 16:57:43,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:57:45,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 16:57:45,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 16:57:50,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.38 vs. limit=15.0 2023-09-28 16:57:53,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:57:54,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=82360.0, ans=0.125 2023-09-28 16:57:57,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:57:57,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 16:58:00,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 16:58:00,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 16:58:00,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:58:01,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:01,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 16:58:03,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:03,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:03,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:03,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:05,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:05,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 16:58:06,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=82426.66666666667, ans=0.0 2023-09-28 16:58:07,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:09,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 16:58:11,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:12,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=82426.66666666667, ans=0.2 2023-09-28 16:58:15,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:15,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 16:58:18,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:58:20,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:58:21,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 16:58:23,945 INFO [train.py:1039] (3/4) Epoch 3, batch 1750, loss[loss=0.2928, simple_loss=0.3221, pruned_loss=0.1318, over 23453.00 frames. ], tot_loss[loss=0.2898, simple_loss=0.3371, pruned_loss=0.1212, over 4704958.07 frames. ], batch size: 285, lr: 3.02e-02, grad_scale: 16.0 2023-09-28 16:58:25,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=82493.33333333333, ans=0.0 2023-09-28 16:58:28,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:32,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:32,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 16:58:32,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 16:58:32,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:58:35,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 16:58:37,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:58:42,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 16:58:43,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:58:45,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 16:58:45,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:58:48,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:58:51,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 16:58:53,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 16:58:55,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 16:58:56,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 16:59:00,438 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 16:59:05,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 16:59:07,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:07,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:10,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:11,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 16:59:13,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 16:59:14,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:17,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:18,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 16:59:20,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 16:59:23,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 16:59:25,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 16:59:26,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:28,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:28,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 16:59:32,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 16:59:33,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 16:59:35,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:36,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 16:59:41,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 16:59:43,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 16:59:43,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=82760.0, ans=0.125 2023-09-28 16:59:44,789 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.581e+02 2.939e+02 3.799e+02 7.676e+02, threshold=5.877e+02, percent-clipped=7.0 2023-09-28 16:59:46,362 INFO [train.py:1039] (3/4) Epoch 3, batch 1800, loss[loss=0.2842, simple_loss=0.3475, pruned_loss=0.1104, over 24462.00 frames. ], tot_loss[loss=0.2893, simple_loss=0.3364, pruned_loss=0.1211, over 4697533.84 frames. ], batch size: 69, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 16:59:46,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 16:59:46,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 16:59:46,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 16:59:48,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 16:59:48,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 16:59:48,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 16:59:48,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 16:59:48,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=82826.66666666667, ans=0.125 2023-09-28 16:59:49,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 16:59:51,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 16:59:53,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 16:59:55,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 16:59:56,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.49 vs. limit=15.0 2023-09-28 16:59:56,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:00:00,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:00:01,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:05,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:06,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=15.0 2023-09-28 17:00:08,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:08,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:00:11,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:00:12,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 17:00:12,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:16,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:18,172 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.47 vs. limit=6.0 2023-09-28 17:00:21,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 17:00:23,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 17:00:23,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 17:00:23,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:24,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:00:24,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:00:25,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=82960.0, ans=0.2 2023-09-28 17:00:26,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:00:32,941 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 17:00:34,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:00:37,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:00:37,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 17:00:39,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 17:00:39,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:00:39,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:00:41,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:00:46,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 17:00:46,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=83026.66666666667, ans=0.0 2023-09-28 17:00:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:00:52,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 17:00:52,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:00:52,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:00:52,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:00:54,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 17:00:54,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=83093.33333333333, ans=0.0 2023-09-28 17:00:58,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:00:58,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:00:59,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 17:00:59,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:02,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:01:02,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:01:02,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:01:06,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:01:06,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:01:06,370 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:01:09,494 INFO [train.py:1039] (3/4) Epoch 3, batch 1850, loss[loss=0.3771, simple_loss=0.3798, pruned_loss=0.1872, over 19049.00 frames. ], tot_loss[loss=0.29, simple_loss=0.3372, pruned_loss=0.1214, over 4692080.62 frames. ], batch size: 388, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:01:09,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:01:11,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:01:17,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 17:01:21,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=83160.0, ans=0.125 2023-09-28 17:01:22,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 17:01:25,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 17:01:28,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:01:28,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 17:01:28,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:01:40,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:01:42,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 17:01:45,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:01:45,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:01:48,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=83293.33333333333, ans=0.1 2023-09-28 17:01:49,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 17:01:49,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:01:49,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:01:51,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:01:53,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:01:56,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:00,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:02:00,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:00,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:02:00,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:02,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:04,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:02:08,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 17:02:08,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:02:12,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:02:14,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:02:14,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 17:02:14,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 17:02:17,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 17:02:19,211 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 17:02:21,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:02:21,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:02:21,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:22,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:24,524 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 17:02:24,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:02:24,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:26,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:02:27,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:02:30,304 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.984e+02 2.645e+02 2.967e+02 3.523e+02 5.465e+02, threshold=5.934e+02, percent-clipped=0.0 2023-09-28 17:02:30,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:02:30,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 17:02:31,881 INFO [train.py:1039] (3/4) Epoch 3, batch 1900, loss[loss=0.3029, simple_loss=0.3538, pruned_loss=0.126, over 24060.00 frames. ], tot_loss[loss=0.2903, simple_loss=0.3379, pruned_loss=0.1214, over 4705053.22 frames. ], batch size: 80, lr: 3.01e-02, grad_scale: 16.0 2023-09-28 17:02:32,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:02:32,170 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 17:02:32,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:02:33,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:34,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.93 vs. limit=15.0 2023-09-28 17:02:39,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:02:41,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:02:43,739 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 17:02:43,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 17:02:47,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:02:47,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:02:47,461 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 17:02:48,893 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 17:02:51,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 17:02:52,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:02:55,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 17:02:59,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 17:03:04,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=83626.66666666667, ans=0.2 2023-09-28 17:03:10,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 17:03:13,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 17:03:13,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:03:13,298 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 17:03:13,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 17:03:13,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 17:03:14,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 17:03:14,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:03:19,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 17:03:22,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:03:26,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:26,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 17:03:26,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:03:29,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.93 vs. limit=22.5 2023-09-28 17:03:32,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 17:03:33,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:03:41,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:03:41,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:03:43,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:03:44,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:03:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:03:46,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:03:47,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:47,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:03:48,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=83760.0, ans=0.1 2023-09-28 17:03:51,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:03:51,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:03:52,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:03:54,189 INFO [train.py:1039] (3/4) Epoch 3, batch 1950, loss[loss=0.2837, simple_loss=0.3345, pruned_loss=0.1165, over 23646.00 frames. ], tot_loss[loss=0.2904, simple_loss=0.3381, pruned_loss=0.1214, over 4709878.98 frames. ], batch size: 85, lr: 3.00e-02, grad_scale: 16.0 2023-09-28 17:03:54,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:03:56,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=83826.66666666667, ans=0.0 2023-09-28 17:03:57,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=83826.66666666667, ans=0.0 2023-09-28 17:04:00,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:02,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:04:02,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:02,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:04:05,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 17:04:05,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:04:07,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:07,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:10,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:04:10,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:10,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:12,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=83893.33333333333, ans=0.0 2023-09-28 17:04:14,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:17,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:04:17,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:04:17,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:04:17,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:21,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:24,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:04:24,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:04:24,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:04:24,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 17:04:26,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:04:27,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:04:27,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:04:31,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:04:35,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:04:40,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:04:44,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:04:44,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:04:44,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 17:04:45,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:04:49,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:04:50,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:04:52,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:04:55,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=84026.66666666667, ans=0.0 2023-09-28 17:04:59,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:00,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:03,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:03,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=84093.33333333333, ans=0.125 2023-09-28 17:05:06,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:08,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:05:09,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:05:09,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 17:05:09,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:05:11,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:05:12,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 17:05:14,269 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.050e+02 2.608e+02 2.981e+02 3.638e+02 7.272e+02, threshold=5.963e+02, percent-clipped=1.0 2023-09-28 17:05:14,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:16,367 INFO [train.py:1039] (3/4) Epoch 3, batch 2000, loss[loss=0.2986, simple_loss=0.3582, pruned_loss=0.1195, over 24665.00 frames. ], tot_loss[loss=0.2896, simple_loss=0.3378, pruned_loss=0.1207, over 4724009.33 frames. ], batch size: 73, lr: 3.00e-02, grad_scale: 32.0 2023-09-28 17:05:18,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:05:19,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:05:19,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:05:20,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.51 vs. limit=6.0 2023-09-28 17:05:21,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:05:23,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:05:26,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 17:05:26,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:05:27,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.65 vs. limit=6.0 2023-09-28 17:05:29,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:05:30,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 17:05:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:05:31,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:05:35,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:05:36,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 17:05:37,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=84226.66666666667, ans=0.2 2023-09-28 17:05:38,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:40,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:41,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 17:05:41,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:05:44,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 17:05:44,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:46,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=84226.66666666667, ans=0.125 2023-09-28 17:05:48,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:05:49,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:05:49,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:05:51,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:05:51,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:05:53,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 17:05:58,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 17:05:58,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:05:58,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:04,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:06:05,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:06:08,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:09,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:06:09,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:09,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=84360.0, ans=0.125 2023-09-28 17:06:09,979 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:06:11,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:15,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:06:15,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 17:06:19,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:06:21,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:25,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:27,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:06:30,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:33,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:33,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:34,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.57 vs. limit=22.5 2023-09-28 17:06:35,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:06:35,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:06:38,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:06:39,527 INFO [train.py:1039] (3/4) Epoch 3, batch 2050, loss[loss=0.2729, simple_loss=0.3229, pruned_loss=0.1115, over 23703.00 frames. ], tot_loss[loss=0.2885, simple_loss=0.3364, pruned_loss=0.1203, over 4722788.21 frames. ], batch size: 149, lr: 2.99e-02, grad_scale: 32.0 2023-09-28 17:06:39,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:40,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.46 vs. limit=15.0 2023-09-28 17:06:43,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:06:43,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:50,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:06:53,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:06:53,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:06:54,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:06:56,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 17:06:56,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:06:58,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:06:58,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:07:10,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:10,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:13,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 17:07:15,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:07:15,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 17:07:17,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:07:17,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.12 vs. limit=12.0 2023-09-28 17:07:18,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:21,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:22,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:07:22,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:07:23,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:07:25,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:07:25,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:07:28,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:07:30,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:07:32,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:07:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:07:40,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:07:45,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:07:46,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 17:07:47,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=84760.0, ans=0.0 2023-09-28 17:07:48,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=84760.0, ans=0.125 2023-09-28 17:07:51,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:07:53,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:07:56,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:07:57,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=84760.0, ans=0.125 2023-09-28 17:07:59,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 17:07:59,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=84760.0, ans=0.09899494936611666 2023-09-28 17:08:01,497 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 17:08:01,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:01,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=84826.66666666667, ans=0.125 2023-09-28 17:08:02,774 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.062e+02 2.817e+02 3.171e+02 3.803e+02 7.947e+02, threshold=6.342e+02, percent-clipped=1.0 2023-09-28 17:08:02,816 INFO [train.py:1039] (3/4) Epoch 3, batch 2100, loss[loss=0.2603, simple_loss=0.314, pruned_loss=0.1033, over 24325.00 frames. ], tot_loss[loss=0.2872, simple_loss=0.3356, pruned_loss=0.1194, over 4732115.19 frames. ], batch size: 61, lr: 2.99e-02, grad_scale: 16.0 2023-09-28 17:08:02,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:03,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:04,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:08:04,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 17:08:04,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 17:08:07,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:08:10,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:08:11,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:08:15,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:16,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:08:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 17:08:16,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:08:16,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 17:08:16,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 17:08:19,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:19,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:08:19,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 17:08:20,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:08:28,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 17:08:28,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:08:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:08:31,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:08:34,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:08:36,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 17:08:36,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:36,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 17:08:39,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 17:08:39,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:39,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 17:08:41,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 17:08:41,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 17:08:41,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=84960.0, ans=0.125 2023-09-28 17:08:42,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:08:44,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:08:48,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:49,479 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.04 vs. limit=15.0 2023-09-28 17:08:50,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:08:53,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:54,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:54,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 17:08:54,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:08:54,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:08:55,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:08:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 17:08:57,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 17:08:58,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 17:09:03,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:09:06,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=85026.66666666667, ans=0.0 2023-09-28 17:09:08,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:09:09,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 17:09:15,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:16,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:09:18,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:09:18,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:09:18,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 17:09:18,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:09:19,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:09:19,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:09:21,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:09:23,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:23,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 17:09:25,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 17:09:25,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:27,911 INFO [train.py:1039] (3/4) Epoch 3, batch 2150, loss[loss=0.2639, simple_loss=0.3188, pruned_loss=0.1044, over 24331.00 frames. ], tot_loss[loss=0.2867, simple_loss=0.3351, pruned_loss=0.1192, over 4725128.45 frames. ], batch size: 61, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:09:31,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:09:31,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:09:31,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:09:32,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:09:37,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 17:09:40,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:09:40,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:42,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=85226.66666666667, ans=0.0 2023-09-28 17:09:43,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:09:43,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:43,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:09:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:09:47,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:09:47,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:09:52,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:09:52,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 17:09:55,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.42 vs. limit=22.5 2023-09-28 17:09:57,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:10:01,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:01,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:01,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:10:04,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:04,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:10:05,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:10:07,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 17:10:08,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:10:10,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:10,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:11,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:10:13,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:10:16,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:10:16,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:10:18,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:10:18,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 17:10:18,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:10:21,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:21,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:22,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:10:23,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:10:24,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:25,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=85360.0, ans=0.0 2023-09-28 17:10:26,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 17:10:28,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 17:10:28,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:10:29,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 17:10:29,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:29,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:10:31,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 17:10:31,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:10:31,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 17:10:31,311 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 17:10:31,312 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 17:10:33,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 17:10:35,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:35,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:10:35,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:10:37,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:38,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:10:40,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=16.83 vs. limit=15.0 2023-09-28 17:10:40,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:10:40,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:41,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=85426.66666666667, ans=0.125 2023-09-28 17:10:44,969 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.82 vs. limit=6.0 2023-09-28 17:10:46,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=85426.66666666667, ans=0.0 2023-09-28 17:10:48,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:10:49,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=85493.33333333333, ans=0.2 2023-09-28 17:10:50,164 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.981e+02 2.450e+02 2.912e+02 3.382e+02 5.716e+02, threshold=5.824e+02, percent-clipped=0.0 2023-09-28 17:10:50,237 INFO [train.py:1039] (3/4) Epoch 3, batch 2200, loss[loss=0.2817, simple_loss=0.3204, pruned_loss=0.1215, over 23718.00 frames. ], tot_loss[loss=0.2859, simple_loss=0.3351, pruned_loss=0.1183, over 4741176.98 frames. ], batch size: 212, lr: 2.98e-02, grad_scale: 16.0 2023-09-28 17:10:50,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 17:10:53,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:10:56,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:10:58,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:10:59,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:01,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:11:05,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:11:06,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:11:06,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 17:11:12,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 17:11:14,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:11:16,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=85560.0, ans=0.125 2023-09-28 17:11:19,078 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:11:20,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 17:11:20,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=85560.0, ans=0.1 2023-09-28 17:11:21,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:23,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:23,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:11:28,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:11:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 17:11:29,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=85626.66666666667, ans=0.125 2023-09-28 17:11:30,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.01 vs. limit=6.0 2023-09-28 17:11:31,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:11:32,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:11:34,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:11:34,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=85626.66666666667, ans=0.2 2023-09-28 17:11:38,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:11:39,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:41,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:11:42,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:45,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 17:11:46,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:48,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 17:11:50,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:50,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:11:52,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:11:52,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=85693.33333333333, ans=0.2 2023-09-28 17:11:53,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:11:55,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:11:55,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:55,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:11:58,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:11:58,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:11:59,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:12:02,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 17:12:02,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:06,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:12:07,723 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 17:12:09,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:12:11,178 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 17:12:13,189 INFO [train.py:1039] (3/4) Epoch 3, batch 2250, loss[loss=0.3083, simple_loss=0.3681, pruned_loss=0.1243, over 24558.00 frames. ], tot_loss[loss=0.2859, simple_loss=0.3353, pruned_loss=0.1182, over 4750015.92 frames. ], batch size: 71, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:12:13,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:12:13,363 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 17:12:14,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:15,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:12:16,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:12:18,598 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 17:12:21,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:12:23,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:28,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:12:29,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:12:34,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:34,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:34,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:12:34,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=85893.33333333333, ans=0.0 2023-09-28 17:12:37,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 17:12:37,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:12:37,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:12:40,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 17:12:40,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:12:40,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:43,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:12:45,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=85960.0, ans=0.125 2023-09-28 17:12:47,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:12:49,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:12:49,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:12:50,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 17:12:52,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:12:55,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:13:01,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:13:04,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:04,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:13:07,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:13:09,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:13:09,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=86026.66666666667, ans=0.1 2023-09-28 17:13:12,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:13:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:13:16,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.48 vs. limit=22.5 2023-09-28 17:13:22,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:13:22,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:13:22,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:13:26,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=86093.33333333333, ans=0.0 2023-09-28 17:13:29,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:13:32,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:13:32,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 17:13:32,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:32,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:13:35,616 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.481e+02 2.992e+02 3.507e+02 5.214e+02, threshold=5.985e+02, percent-clipped=0.0 2023-09-28 17:13:35,666 INFO [train.py:1039] (3/4) Epoch 3, batch 2300, loss[loss=0.2876, simple_loss=0.328, pruned_loss=0.1236, over 23303.00 frames. ], tot_loss[loss=0.287, simple_loss=0.3358, pruned_loss=0.1191, over 4751247.53 frames. ], batch size: 119, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:13:35,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 17:13:36,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=86160.0, ans=0.125 2023-09-28 17:13:38,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:13:39,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:43,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:13:43,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:13:45,550 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 17:13:47,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:55,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:13:55,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:13:55,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:13:57,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:13:57,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 17:13:59,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:14:02,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:02,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:14:04,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=86226.66666666667, ans=0.95 2023-09-28 17:14:07,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:14:10,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:14:13,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:21,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:14:21,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:14:24,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:14:26,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:14:31,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:14:31,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=86360.0, ans=0.1 2023-09-28 17:14:33,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:14:33,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:14:33,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 17:14:38,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:14:38,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:38,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:14:38,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:14:38,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:39,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-09-28 17:14:40,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:14:40,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:14:40,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 17:14:40,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:14:40,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:14:43,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 17:14:49,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:14:52,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:14:57,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:14:57,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:14:57,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:14:58,592 INFO [train.py:1039] (3/4) Epoch 3, batch 2350, loss[loss=0.2839, simple_loss=0.3398, pruned_loss=0.114, over 24395.00 frames. ], tot_loss[loss=0.2892, simple_loss=0.3374, pruned_loss=0.1205, over 4738029.39 frames. ], batch size: 77, lr: 2.97e-02, grad_scale: 16.0 2023-09-28 17:14:58,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:14:58,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:00,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:15:00,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 17:15:04,856 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.38 vs. limit=22.5 2023-09-28 17:15:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:07,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 17:15:12,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 17:15:16,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:15:19,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:19,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:19,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:21,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 17:15:24,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:15:26,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=86560.0, ans=22.5 2023-09-28 17:15:30,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 17:15:33,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:15:34,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:15:34,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:15:38,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:15:38,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=86626.66666666667, ans=0.1 2023-09-28 17:15:39,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 17:15:41,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:15:44,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=86626.66666666667, ans=0.0 2023-09-28 17:15:45,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:15:45,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:15:45,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:15:48,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:15:50,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 17:15:52,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:15:53,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:15:55,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:15:55,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=86693.33333333333, ans=0.125 2023-09-28 17:15:56,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 17:15:57,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:16:01,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 17:16:01,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:16:06,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 17:16:07,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 17:16:09,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:16:09,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:16:10,631 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 17:16:10,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 17:16:13,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 17:16:16,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:16:20,374 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.689e+02 3.044e+02 3.623e+02 6.836e+02, threshold=6.088e+02, percent-clipped=1.0 2023-09-28 17:16:20,415 INFO [train.py:1039] (3/4) Epoch 3, batch 2400, loss[loss=0.2391, simple_loss=0.3016, pruned_loss=0.08833, over 24623.00 frames. ], tot_loss[loss=0.2882, simple_loss=0.3368, pruned_loss=0.1198, over 4741315.70 frames. ], batch size: 60, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:16:21,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:16:24,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=86826.66666666667, ans=0.125 2023-09-28 17:16:27,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:16:28,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:16:28,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 17:16:30,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 17:16:36,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:16:36,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:16:36,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=86893.33333333333, ans=0.2 2023-09-28 17:16:39,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 17:16:39,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:16:39,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:39,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 17:16:41,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=86893.33333333333, ans=0.125 2023-09-28 17:16:44,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:16:49,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 17:16:55,020 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:16:56,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:17:00,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 17:17:01,502 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.92 vs. limit=22.5 2023-09-28 17:17:05,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:05,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:09,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:09,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 17:17:10,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:17:16,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:16,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=87026.66666666667, ans=0.0 2023-09-28 17:17:19,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:21,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:23,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:17:23,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:17:23,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:17:23,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:24,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:17:29,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:17:31,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:17:31,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 17:17:34,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 17:17:35,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:17:37,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:17:37,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 17:17:38,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 17:17:38,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 17:17:38,870 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 17:17:39,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 17:17:39,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=87093.33333333333, ans=0.0 2023-09-28 17:17:40,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:17:40,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:40,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:42,261 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 17:17:43,658 INFO [train.py:1039] (3/4) Epoch 3, batch 2450, loss[loss=0.2828, simple_loss=0.3073, pruned_loss=0.1292, over 23594.00 frames. ], tot_loss[loss=0.2872, simple_loss=0.3344, pruned_loss=0.12, over 4704240.93 frames. ], batch size: 256, lr: 2.96e-02, grad_scale: 32.0 2023-09-28 17:17:43,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:17:43,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:17:48,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:17:48,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:17:51,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=87160.0, ans=0.125 2023-09-28 17:17:53,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:17:53,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:17:53,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 17:17:57,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:17:57,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:00,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.97 vs. limit=6.0 2023-09-28 17:18:03,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:18:03,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:18:03,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:18:03,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 17:18:09,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:10,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.95 vs. limit=6.0 2023-09-28 17:18:11,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:18:11,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=87226.66666666667, ans=0.1 2023-09-28 17:18:12,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:18:15,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:18:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:15,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:17,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:18:18,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 17:18:20,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:18:29,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:29,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:18:31,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:31,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:18:33,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:18:35,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:18:36,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 17:18:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:18:40,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:18:42,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:18:43,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:18:49,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:18:49,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 17:18:51,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:18:52,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:18:52,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 17:18:54,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:18:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:18:54,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=87426.66666666667, ans=0.1 2023-09-28 17:18:57,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:19:00,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:00,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:19:04,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 17:19:05,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:19:07,574 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.996e+02 2.571e+02 3.066e+02 3.811e+02 5.963e+02, threshold=6.132e+02, percent-clipped=0.0 2023-09-28 17:19:07,626 INFO [train.py:1039] (3/4) Epoch 3, batch 2500, loss[loss=0.3208, simple_loss=0.3474, pruned_loss=0.1471, over 22850.00 frames. ], tot_loss[loss=0.2856, simple_loss=0.3331, pruned_loss=0.119, over 4703594.24 frames. ], batch size: 322, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:19:13,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:22,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:19:22,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:19:23,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:19:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 17:19:31,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:19:33,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:19:33,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:19:34,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:19:37,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 17:19:37,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:38,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 17:19:38,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:38,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 17:19:40,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:42,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.92 vs. limit=6.0 2023-09-28 17:19:44,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:19:44,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:19:48,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:19:49,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 17:19:51,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:19:51,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:19:56,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:19:59,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:02,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:08,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:20:13,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 17:20:13,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:13,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:14,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:20:14,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:20:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 17:20:15,717 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 17:20:15,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 17:20:19,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:21,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=7.35 vs. limit=15.0 2023-09-28 17:20:21,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 17:20:21,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 17:20:23,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:20:24,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 17:20:27,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 17:20:30,812 INFO [train.py:1039] (3/4) Epoch 3, batch 2550, loss[loss=0.2395, simple_loss=0.2989, pruned_loss=0.09002, over 24561.00 frames. ], tot_loss[loss=0.2863, simple_loss=0.3336, pruned_loss=0.1195, over 4698585.67 frames. ], batch size: 60, lr: 2.95e-02, grad_scale: 32.0 2023-09-28 17:20:30,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:33,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:20:35,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:20:35,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:20:37,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 17:20:37,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:20:41,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 17:20:43,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:20:45,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:47,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:20:47,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 17:20:49,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:20:49,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:20:49,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:20:53,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:20:53,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 17:20:53,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:20:53,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:20:53,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 17:20:53,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.95 vs. limit=6.0 2023-09-28 17:21:08,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:21:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:13,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:21:15,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:21:22,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:21:25,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:21:25,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:21:25,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:21:25,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:21:26,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:21:29,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:21:29,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:34,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:21:36,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 17:21:36,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:21:37,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:21:37,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:21:39,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:21:40,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:49,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:21:51,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:21:53,263 INFO [train.py:1039] (3/4) Epoch 3, batch 2600, loss[loss=0.2694, simple_loss=0.3331, pruned_loss=0.1028, over 24488.00 frames. ], tot_loss[loss=0.2853, simple_loss=0.3336, pruned_loss=0.1185, over 4714490.23 frames. ], batch size: 66, lr: 2.95e-02, grad_scale: 16.0 2023-09-28 17:21:54,708 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.952e+02 2.618e+02 3.140e+02 3.668e+02 6.690e+02, threshold=6.281e+02, percent-clipped=1.0 2023-09-28 17:21:54,949 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 17:21:58,527 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 17:21:58,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:22:00,071 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 17:22:00,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 17:22:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 17:22:03,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:22:03,337 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 17:22:05,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 17:22:07,044 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 17:22:09,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:22:10,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 17:22:12,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 17:22:13,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:22:13,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 17:22:16,839 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 17:22:16,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 17:22:24,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:24,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:24,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 17:22:27,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:22:35,239 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 17:22:40,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:22:42,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:22:42,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 17:22:42,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:22:42,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:22:44,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 17:22:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:22:46,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:22:47,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:52,211 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 17:22:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:22:53,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:23:00,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:23:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:23:00,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 17:23:01,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:23:03,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:11,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 17:23:11,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:13,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=88426.66666666667, ans=0.0 2023-09-28 17:23:15,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:23:15,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=88493.33333333333, ans=0.0 2023-09-28 17:23:16,474 INFO [train.py:1039] (3/4) Epoch 3, batch 2650, loss[loss=0.3154, simple_loss=0.3484, pruned_loss=0.1412, over 24013.00 frames. ], tot_loss[loss=0.2857, simple_loss=0.3344, pruned_loss=0.1185, over 4717471.62 frames. ], batch size: 196, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:23:20,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 17:23:21,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:21,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:23:22,778 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.53 vs. limit=10.0 2023-09-28 17:23:23,368 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 17:23:23,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:23:24,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.87 vs. limit=15.0 2023-09-28 17:23:24,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:23:25,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=88493.33333333333, ans=0.0 2023-09-28 17:23:28,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:23:29,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:23:32,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:23:34,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 17:23:34,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:23:34,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:23:37,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 17:23:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 17:23:43,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:23:46,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 17:23:46,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:23:46,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 17:23:51,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:51,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:23:51,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:23:51,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:23:52,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=88626.66666666667, ans=0.125 2023-09-28 17:23:53,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=15.0 2023-09-28 17:23:56,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 17:23:58,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 17:23:59,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:02,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 17:24:02,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:02,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:02,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:04,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:04,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:05,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.18 vs. limit=15.0 2023-09-28 17:24:06,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:24:08,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:09,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:24:09,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:24:11,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:24:13,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:14,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:24:14,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:16,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:24:16,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:24:20,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:23,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:24:24,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:24,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 17:24:27,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:24:29,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:32,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:34,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:35,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:24:35,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:37,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:24:37,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 17:24:38,891 INFO [train.py:1039] (3/4) Epoch 3, batch 2700, loss[loss=0.2364, simple_loss=0.3, pruned_loss=0.08639, over 24332.00 frames. ], tot_loss[loss=0.2864, simple_loss=0.3349, pruned_loss=0.1189, over 4721915.32 frames. ], batch size: 61, lr: 2.94e-02, grad_scale: 16.0 2023-09-28 17:24:40,991 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.674e+02 3.068e+02 3.788e+02 5.664e+02, threshold=6.136e+02, percent-clipped=0.0 2023-09-28 17:24:41,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:24:41,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=88826.66666666667, ans=0.125 2023-09-28 17:24:42,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:24:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:24:45,172 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.75 vs. limit=6.0 2023-09-28 17:24:46,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:46,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:24:49,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:24:49,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:24:49,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:24:49,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 17:24:50,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 17:24:52,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:24:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:24:54,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:24:54,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:24:58,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:25:01,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 17:25:01,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:05,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:25:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:08,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.09 vs. limit=10.0 2023-09-28 17:25:12,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:25:12,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:25:14,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:25:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:25:17,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:21,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:22,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:25:22,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:25:27,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:27,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:25:34,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:25:36,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:25:39,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:25:39,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:25:44,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:44,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:25:46,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:25:48,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:25:49,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:25:49,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:25:53,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:25:54,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:54,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:25:56,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=89093.33333333333, ans=0.1 2023-09-28 17:25:57,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 17:25:59,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:02,663 INFO [train.py:1039] (3/4) Epoch 3, batch 2750, loss[loss=0.3096, simple_loss=0.3421, pruned_loss=0.1385, over 23832.00 frames. ], tot_loss[loss=0.2856, simple_loss=0.3341, pruned_loss=0.1186, over 4733420.68 frames. ], batch size: 164, lr: 2.93e-02, grad_scale: 16.0 2023-09-28 17:26:02,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:26:02,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 17:26:04,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 17:26:04,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:07,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:07,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:11,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:11,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:26:11,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:17,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:26:17,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:26:17,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:17,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 17:26:17,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:26:17,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:26:19,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=89226.66666666667, ans=0.0 2023-09-28 17:26:24,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 17:26:27,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:26:27,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:29,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:26:29,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:26:30,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:26:32,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:26:32,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:34,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:37,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:26:37,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:26:37,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=89293.33333333333, ans=0.1 2023-09-28 17:26:37,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=89293.33333333333, ans=0.1 2023-09-28 17:26:39,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:26:39,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:42,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:26:47,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:26:49,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:26:49,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:26:54,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:26:54,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:26:54,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:26:54,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=89360.0, ans=0.125 2023-09-28 17:26:59,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=89360.0, ans=0.0 2023-09-28 17:27:01,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:27:03,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:27:03,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 17:27:07,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 17:27:13,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=89426.66666666667, ans=0.0 2023-09-28 17:27:14,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:27:17,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:27:17,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 17:27:19,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:27:23,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:27:23,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 17:27:23,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:27:26,210 INFO [train.py:1039] (3/4) Epoch 3, batch 2800, loss[loss=0.2876, simple_loss=0.3441, pruned_loss=0.1156, over 24452.00 frames. ], tot_loss[loss=0.2851, simple_loss=0.333, pruned_loss=0.1186, over 4724306.09 frames. ], batch size: 69, lr: 2.93e-02, grad_scale: 32.0 2023-09-28 17:27:27,576 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.002e+02 2.563e+02 3.005e+02 3.573e+02 5.260e+02, threshold=6.010e+02, percent-clipped=0.0 2023-09-28 17:27:27,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:27:27,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:27,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:27:29,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 17:27:29,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:29,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:31,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:27:32,603 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 17:27:32,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 17:27:35,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:27:37,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:27:37,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:27:42,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:27:42,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=89560.0, ans=0.125 2023-09-28 17:27:44,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 17:27:47,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:27:49,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 17:27:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:50,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:27:50,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:27:54,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:27:54,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:27:54,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:27:56,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:04,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:28:07,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:10,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:10,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:28:11,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:17,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:17,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 17:28:17,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:20,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=89693.33333333333, ans=0.05 2023-09-28 17:28:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:21,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:28:23,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=89693.33333333333, ans=0.0 2023-09-28 17:28:24,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:28:24,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=89693.33333333333, ans=0.0 2023-09-28 17:28:25,183 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.49 vs. limit=15.0 2023-09-28 17:28:25,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:26,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.51 vs. limit=12.0 2023-09-28 17:28:30,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:28:32,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:28:32,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:28:32,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:28:32,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:28:32,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:28:34,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:28:34,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 17:28:34,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:36,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:28:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:28:38,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 17:28:39,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:28:39,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:28:40,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:28:43,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 17:28:49,357 INFO [train.py:1039] (3/4) Epoch 3, batch 2850, loss[loss=0.2765, simple_loss=0.3186, pruned_loss=0.1172, over 23774.00 frames. ], tot_loss[loss=0.2839, simple_loss=0.3326, pruned_loss=0.1177, over 4733831.03 frames. ], batch size: 232, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:28:49,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:28:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:28:51,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:28:52,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:28:56,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:28:56,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:28:56,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:29:01,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:01,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:29:02,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:29:02,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 17:29:08,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=89893.33333333333, ans=0.125 2023-09-28 17:29:10,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 17:29:10,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:12,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 17:29:13,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:16,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.35 vs. limit=15.0 2023-09-28 17:29:17,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 17:29:17,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 17:29:19,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:22,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.22 vs. limit=12.0 2023-09-28 17:29:28,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=89960.0, ans=0.0 2023-09-28 17:29:30,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=89960.0, ans=0.125 2023-09-28 17:29:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:33,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:33,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:29:34,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 17:29:34,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:29:34,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:29:37,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:29:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 17:29:41,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:29:41,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:29:41,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:29:41,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:44,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:29:46,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:48,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:29:51,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:29:52,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:29:52,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:29:53,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:29:58,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:30:00,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 17:30:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 17:30:03,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:30:05,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:05,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 17:30:05,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:30:06,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:06,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:06,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:30:06,928 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 17:30:08,394 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 17:30:08,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:08,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:13,040 INFO [train.py:1039] (3/4) Epoch 3, batch 2900, loss[loss=0.2963, simple_loss=0.3563, pruned_loss=0.1182, over 23996.00 frames. ], tot_loss[loss=0.2837, simple_loss=0.3327, pruned_loss=0.1174, over 4741720.99 frames. ], batch size: 80, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:30:13,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:15,029 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.599e+02 2.941e+02 3.399e+02 5.344e+02, threshold=5.883e+02, percent-clipped=0.0 2023-09-28 17:30:15,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:15,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:30:15,932 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.56 vs. limit=15.0 2023-09-28 17:30:17,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 17:30:22,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:22,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 17:30:22,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 17:30:24,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:30:24,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:30:26,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:27,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:30:32,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:30:32,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:30:32,407 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:30:37,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:30:37,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 17:30:38,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:30:40,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:43,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 17:30:43,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 17:30:48,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:30:48,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 17:30:48,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:30:49,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:30:51,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 17:30:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:30:53,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:30:53,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=90293.33333333333, ans=0.07 2023-09-28 17:30:55,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:30:55,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=90293.33333333333, ans=0.0 2023-09-28 17:30:59,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:00,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 17:31:00,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 17:31:00,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:31:04,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:31:06,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 17:31:06,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:31:11,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:31:21,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:31:21,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:31:23,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 17:31:23,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=90426.66666666667, ans=0.1 2023-09-28 17:31:23,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=90426.66666666667, ans=0.125 2023-09-28 17:31:27,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:27,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 17:31:28,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:29,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:31:33,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=90426.66666666667, ans=0.2 2023-09-28 17:31:35,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:31:36,416 INFO [train.py:1039] (3/4) Epoch 3, batch 2950, loss[loss=0.2869, simple_loss=0.3333, pruned_loss=0.1202, over 23563.00 frames. ], tot_loss[loss=0.2863, simple_loss=0.3346, pruned_loss=0.119, over 4728768.33 frames. ], batch size: 256, lr: 2.92e-02, grad_scale: 32.0 2023-09-28 17:31:36,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 17:31:36,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=90493.33333333333, ans=0.5 2023-09-28 17:31:38,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:38,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:31:39,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:31:41,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:31:43,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 17:31:44,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 17:31:46,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:31:46,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:31:46,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=90493.33333333333, ans=0.125 2023-09-28 17:31:47,205 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.48 vs. limit=15.0 2023-09-28 17:31:52,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:31:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:31:57,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:31:57,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:02,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:02,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:32:04,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:32:06,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:32:06,473 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:32:07,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 17:32:12,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 17:32:12,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 17:32:12,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:32:13,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=90626.66666666667, ans=0.125 2023-09-28 17:32:13,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=90626.66666666667, ans=0.1 2023-09-28 17:32:14,472 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 17:32:15,265 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.32 vs. limit=15.0 2023-09-28 17:32:15,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 17:32:16,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:32:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:32:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 17:32:18,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:32:22,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 17:32:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:32:22,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:32:25,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:28,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:32:28,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:28,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 17:32:28,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:32:30,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 17:32:34,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=90693.33333333333, ans=0.02 2023-09-28 17:32:36,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:32:38,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 17:32:38,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:32:40,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 17:32:42,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=90760.0, ans=0.0 2023-09-28 17:32:43,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:43,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:32:45,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:32:46,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:32:46,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:32:47,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:32:47,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=90760.0, ans=0.125 2023-09-28 17:32:47,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.17 vs. limit=15.0 2023-09-28 17:32:48,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:48,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:32:49,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:32:50,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:32:51,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:32:54,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:54,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 17:32:56,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:32:59,409 INFO [train.py:1039] (3/4) Epoch 3, batch 3000, loss[loss=0.296, simple_loss=0.3599, pruned_loss=0.1161, over 24447.00 frames. ], tot_loss[loss=0.2867, simple_loss=0.3354, pruned_loss=0.1191, over 4734408.67 frames. ], batch size: 69, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:32:59,409 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 17:33:13,931 INFO [train.py:1071] (3/4) Epoch 3, validation: loss=0.3974, simple_loss=0.3326, pruned_loss=0.2311, over 1125622.00 frames. 2023-09-28 17:33:13,932 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 17:33:15,398 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.502e+02 2.937e+02 3.419e+02 4.607e+02, threshold=5.874e+02, percent-clipped=0.0 2023-09-28 17:33:15,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:33:16,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:33:18,704 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 17:33:20,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 17:33:20,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.44 vs. limit=22.5 2023-09-28 17:33:23,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:33:23,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:33:24,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 17:33:24,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:32,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:33:42,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:33:48,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 17:33:49,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=90960.0, ans=0.125 2023-09-28 17:33:50,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:33:54,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:33:54,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:33:54,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:33:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:33:57,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 17:33:57,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=90960.0, ans=0.125 2023-09-28 17:34:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 17:34:03,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:34:03,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:34:05,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:34:05,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:07,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:34:09,439 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-09-28 17:34:10,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:34:10,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:34:10,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:34:11,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:34:13,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 17:34:15,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:34:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:16,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:34:21,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:21,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:22,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 17:34:22,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 17:34:25,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:34:25,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 17:34:25,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:34:30,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 17:34:31,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:34:33,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:34:33,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 17:34:34,802 INFO [train.py:1039] (3/4) Epoch 3, batch 3050, loss[loss=0.2775, simple_loss=0.3398, pruned_loss=0.1076, over 24550.00 frames. ], tot_loss[loss=0.2885, simple_loss=0.3368, pruned_loss=0.1201, over 4721846.93 frames. ], batch size: 71, lr: 2.91e-02, grad_scale: 32.0 2023-09-28 17:34:34,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 17:34:34,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:34:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:34:38,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:34:38,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:34:38,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:40,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:34:41,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 17:34:43,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:34:46,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:34:47,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:34:48,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=28.04 vs. limit=15.0 2023-09-28 17:34:51,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:34:53,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=91226.66666666667, ans=0.2 2023-09-28 17:34:54,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 17:35:02,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 17:35:02,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 17:35:02,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:07,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:35:09,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:09,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:11,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:12,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=91293.33333333333, ans=0.0 2023-09-28 17:35:14,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:16,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:35:16,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:16,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:35:16,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:16,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=91293.33333333333, ans=0.0 2023-09-28 17:35:17,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:20,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:22,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:22,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 17:35:22,725 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=91360.0, ans=0.1 2023-09-28 17:35:23,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:35:23,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:35:27,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:35:27,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:35:27,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:35:29,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:33,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:35:34,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:39,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:40,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:35:40,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:35:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:42,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:35:42,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:35:43,611 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.24 vs. limit=22.5 2023-09-28 17:35:44,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 17:35:46,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:35:46,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:35:46,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=91426.66666666667, ans=0.2 2023-09-28 17:35:47,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 17:35:49,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=91426.66666666667, ans=0.125 2023-09-28 17:35:50,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:52,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=91426.66666666667, ans=0.125 2023-09-28 17:35:55,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:35:57,506 INFO [train.py:1039] (3/4) Epoch 3, batch 3100, loss[loss=0.3007, simple_loss=0.3448, pruned_loss=0.1284, over 23174.00 frames. ], tot_loss[loss=0.2878, simple_loss=0.3357, pruned_loss=0.1199, over 4709961.60 frames. ], batch size: 105, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:35:57,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:35:59,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 17:36:00,684 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.012e+02 2.573e+02 3.095e+02 3.783e+02 7.787e+02, threshold=6.189e+02, percent-clipped=2.0 2023-09-28 17:36:00,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 17:36:03,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 17:36:05,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 17:36:07,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:36:09,612 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=14.09 vs. limit=22.5 2023-09-28 17:36:10,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:36:12,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:36:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:25,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 17:36:25,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=91560.0, ans=0.125 2023-09-28 17:36:29,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:36:31,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:32,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:36:33,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:36:33,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:36:35,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:36:35,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 17:36:35,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:36:36,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:39,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 17:36:39,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:36:43,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:36:43,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 17:36:45,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 17:36:47,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:47,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:36:50,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:36:50,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:50,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:36:53,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:36:53,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:36:54,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:36:55,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:36:55,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:36:55,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 17:37:00,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:37:02,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 17:37:05,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:37:05,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 17:37:06,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:07,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:08,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 17:37:19,673 INFO [train.py:1039] (3/4) Epoch 3, batch 3150, loss[loss=0.271, simple_loss=0.3394, pruned_loss=0.1013, over 24576.00 frames. ], tot_loss[loss=0.2851, simple_loss=0.3324, pruned_loss=0.119, over 4710821.35 frames. ], batch size: 71, lr: 2.90e-02, grad_scale: 16.0 2023-09-28 17:37:19,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 17:37:20,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=91826.66666666667, ans=0.125 2023-09-28 17:37:22,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:23,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:37:25,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:37:25,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:37:25,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 17:37:27,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:27,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 17:37:28,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 17:37:30,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:31,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.07 vs. limit=22.5 2023-09-28 17:37:32,356 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 17:37:36,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 17:37:36,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:37:39,121 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 17:37:39,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 17:37:40,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 17:37:40,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 17:37:40,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 17:37:40,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:40,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:37:42,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:37:45,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 17:37:47,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:47,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:37:48,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:50,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 17:37:54,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 17:37:54,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:37:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:37:57,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:37:59,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 17:38:02,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 17:38:04,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:38:04,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:38:04,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:38:06,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:06,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:38:06,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:38:07,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 17:38:08,413 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.41 vs. limit=15.0 2023-09-28 17:38:09,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 17:38:09,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:38:10,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:11,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:38:11,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:38:13,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 17:38:13,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:14,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 17:38:16,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:17,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 17:38:19,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 17:38:20,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:38:20,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:21,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 17:38:22,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 17:38:22,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:38:25,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:38:27,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:27,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:38:34,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:38:34,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:37,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 17:38:41,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.25 vs. limit=22.5 2023-09-28 17:38:43,064 INFO [train.py:1039] (3/4) Epoch 3, batch 3200, loss[loss=0.2494, simple_loss=0.3105, pruned_loss=0.09421, over 24342.00 frames. ], tot_loss[loss=0.2825, simple_loss=0.3302, pruned_loss=0.1174, over 4709490.94 frames. ], batch size: 56, lr: 2.90e-02, grad_scale: 32.0 2023-09-28 17:38:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:38:43,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 17:38:46,886 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.841e+02 2.531e+02 2.998e+02 3.452e+02 5.958e+02, threshold=5.995e+02, percent-clipped=0.0 2023-09-28 17:38:47,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:38:48,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:38:48,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 17:38:51,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:38:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:38:59,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:39:08,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:39:08,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=92226.66666666667, ans=0.125 2023-09-28 17:39:19,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 17:39:21,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:39:24,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 17:39:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:39:26,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=92293.33333333333, ans=0.125 2023-09-28 17:39:27,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:39:27,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:39:29,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:39:32,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 17:39:34,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 17:39:34,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=92360.0, ans=0.125 2023-09-28 17:39:38,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 17:39:43,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 17:39:45,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:39:47,219 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.97 vs. limit=15.0 2023-09-28 17:39:51,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:51,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 17:39:51,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:39:52,109 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 17:39:52,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:39:55,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:39:56,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 17:39:56,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 17:39:58,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 17:39:59,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 17:40:01,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:40:05,806 INFO [train.py:1039] (3/4) Epoch 3, batch 3250, loss[loss=0.2789, simple_loss=0.3315, pruned_loss=0.1132, over 23451.00 frames. ], tot_loss[loss=0.2827, simple_loss=0.3306, pruned_loss=0.1174, over 4699716.62 frames. ], batch size: 93, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:40:05,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:40:05,904 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 17:40:05,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:05,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:07,465 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 17:40:10,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:40:15,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:17,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:20,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=92493.33333333333, ans=0.125 2023-09-28 17:40:22,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:40:22,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 17:40:23,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:23,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:40:25,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:27,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:27,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:40:30,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:40:30,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:30,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:30,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:40:32,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:40:33,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:40:35,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:35,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:40:37,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:40:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:40:37,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:40:42,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 17:40:42,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:40:42,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:40:46,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:40:46,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:40:47,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=92626.66666666667, ans=0.2 2023-09-28 17:40:52,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:40:53,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=92626.66666666667, ans=0.125 2023-09-28 17:40:54,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=92626.66666666667, ans=0.0 2023-09-28 17:40:57,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.85 vs. limit=15.0 2023-09-28 17:41:00,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:00,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:00,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 17:41:00,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:41:01,290 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.15 vs. limit=15.0 2023-09-28 17:41:02,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:41:02,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:05,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 17:41:05,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 17:41:06,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:41:06,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:08,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:08,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 17:41:08,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=92693.33333333333, ans=0.125 2023-09-28 17:41:09,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:41:12,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:12,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:15,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 17:41:15,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:18,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:41:18,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 17:41:23,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:41:23,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 17:41:25,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 17:41:26,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 17:41:26,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:29,926 INFO [train.py:1039] (3/4) Epoch 3, batch 3300, loss[loss=0.2868, simple_loss=0.3563, pruned_loss=0.1086, over 24417.00 frames. ], tot_loss[loss=0.2849, simple_loss=0.3328, pruned_loss=0.1185, over 4699706.79 frames. ], batch size: 69, lr: 2.89e-02, grad_scale: 32.0 2023-09-28 17:41:30,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:41:32,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:41:32,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:32,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=92826.66666666667, ans=0.0 2023-09-28 17:41:33,735 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.895e+02 2.576e+02 3.097e+02 3.556e+02 6.978e+02, threshold=6.193e+02, percent-clipped=2.0 2023-09-28 17:41:34,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 17:41:35,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:41:37,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:40,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:41:43,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 17:41:44,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:41:44,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:41:47,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:41:47,697 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 17:41:49,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:41:49,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:41:51,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:41:51,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:41:51,493 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 17:41:51,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=92893.33333333333, ans=0.1 2023-09-28 17:41:58,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:41:58,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:42:01,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:01,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 17:42:03,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 17:42:03,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:04,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:42:06,444 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 17:42:09,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 17:42:09,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:13,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 17:42:16,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.37 vs. limit=10.0 2023-09-28 17:42:17,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:19,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 17:42:20,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:21,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.66 vs. limit=22.5 2023-09-28 17:42:23,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:42:23,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:42:26,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:42:26,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:26,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:42:26,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=93026.66666666667, ans=0.1 2023-09-28 17:42:28,321 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 17:42:29,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.70 vs. limit=15.0 2023-09-28 17:42:30,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 17:42:30,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=93026.66666666667, ans=0.2 2023-09-28 17:42:32,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:42:32,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:42:32,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:34,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:42:34,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:36,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:42:37,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:37,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:42:37,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:42:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:42:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 17:42:44,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:42:47,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 17:42:47,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:42:50,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:42:52,058 INFO [train.py:1039] (3/4) Epoch 3, batch 3350, loss[loss=0.2748, simple_loss=0.3193, pruned_loss=0.1151, over 23390.00 frames. ], tot_loss[loss=0.2844, simple_loss=0.3331, pruned_loss=0.1179, over 4713794.85 frames. ], batch size: 134, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:42:52,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:42:52,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:53,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:42:55,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:42:56,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:42:59,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:02,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:43:05,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:05,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:43:06,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 17:43:08,366 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 17:43:08,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:43:13,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=93226.66666666667, ans=0.1 2023-09-28 17:43:14,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 17:43:14,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 17:43:14,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:43:15,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.36 vs. limit=15.0 2023-09-28 17:43:16,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:43:16,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:16,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=93226.66666666667, ans=0.2 2023-09-28 17:43:17,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.66 vs. limit=12.0 2023-09-28 17:43:17,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 17:43:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:18,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:43:20,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:22,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:22,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:24,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:43:27,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:29,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:30,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:43:34,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=93293.33333333333, ans=0.125 2023-09-28 17:43:34,317 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:43:35,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:43:39,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:39,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:42,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:45,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 17:43:45,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:43:45,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 17:43:45,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:43:47,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 17:43:49,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:43:50,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:43:57,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=9.58 vs. limit=15.0 2023-09-28 17:43:57,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:43:57,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 17:43:57,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:43:58,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=93426.66666666667, ans=0.0 2023-09-28 17:43:59,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:44:00,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:44:05,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:08,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 17:44:10,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:44:10,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:44:11,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:11,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=93493.33333333333, ans=0.0 2023-09-28 17:44:12,979 INFO [train.py:1039] (3/4) Epoch 3, batch 3400, loss[loss=0.4084, simple_loss=0.4157, pruned_loss=0.2006, over 19654.00 frames. ], tot_loss[loss=0.2867, simple_loss=0.335, pruned_loss=0.1192, over 4715377.70 frames. ], batch size: 388, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:44:13,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 17:44:13,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:44:13,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 17:44:15,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:44:15,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 17:44:17,385 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.863e+02 2.557e+02 2.981e+02 3.725e+02 6.496e+02, threshold=5.961e+02, percent-clipped=1.0 2023-09-28 17:44:17,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:44:18,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 17:44:19,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.60 vs. limit=15.0 2023-09-28 17:44:22,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 17:44:22,706 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 17:44:22,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:44:27,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.21 vs. limit=15.0 2023-09-28 17:44:27,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.25 vs. limit=15.0 2023-09-28 17:44:28,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:44:28,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 17:44:28,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:29,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:44:33,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=93560.0, ans=0.125 2023-09-28 17:44:35,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:44:37,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 17:44:43,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:44:45,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:44:46,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:44:46,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 17:44:49,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=93626.66666666667, ans=0.0 2023-09-28 17:44:55,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:45:00,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 17:45:04,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:45:05,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 17:45:07,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:07,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:07,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:45:07,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=93693.33333333333, ans=0.0 2023-09-28 17:45:09,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:45:11,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:45:16,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:45:16,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:45:22,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:24,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 17:45:24,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=93760.0, ans=0.09899494936611666 2023-09-28 17:45:33,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:45:36,560 INFO [train.py:1039] (3/4) Epoch 3, batch 3450, loss[loss=0.2901, simple_loss=0.3328, pruned_loss=0.1237, over 23562.00 frames. ], tot_loss[loss=0.2852, simple_loss=0.3335, pruned_loss=0.1185, over 4724068.39 frames. ], batch size: 134, lr: 2.88e-02, grad_scale: 32.0 2023-09-28 17:45:37,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=93826.66666666667, ans=0.1 2023-09-28 17:45:39,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 17:45:42,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 17:45:43,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:45:45,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:45:45,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 17:45:46,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:45:49,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:45:55,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:45:55,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:45:55,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:45:55,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:45:59,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:04,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 17:46:12,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 17:46:12,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 17:46:12,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:46:13,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:20,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 17:46:21,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:46:25,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:25,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:46:27,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 17:46:28,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:46:30,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 17:46:30,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:46:30,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:46:31,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.36 vs. limit=22.5 2023-09-28 17:46:35,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:46:39,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 17:46:42,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:46:49,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:46:49,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:52,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:46:54,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=94093.33333333333, ans=0.1 2023-09-28 17:46:57,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:46:57,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:46:57,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:46:58,926 INFO [train.py:1039] (3/4) Epoch 3, batch 3500, loss[loss=0.2947, simple_loss=0.3557, pruned_loss=0.1169, over 24344.00 frames. ], tot_loss[loss=0.2843, simple_loss=0.3321, pruned_loss=0.1182, over 4713619.89 frames. ], batch size: 74, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:46:59,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:47:02,946 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.02 vs. limit=15.0 2023-09-28 17:47:03,554 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.532e+02 3.066e+02 3.931e+02 6.870e+02, threshold=6.132e+02, percent-clipped=2.0 2023-09-28 17:47:03,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:07,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:47:07,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 17:47:09,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 17:47:14,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 17:47:15,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:47:15,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 17:47:19,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.09 vs. limit=15.0 2023-09-28 17:47:23,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:47:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:47:25,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:47:25,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:25,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:47:25,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:26,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:26,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 17:47:27,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=94226.66666666667, ans=0.2 2023-09-28 17:47:29,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:29,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:47:29,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:31,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=94293.33333333333, ans=0.125 2023-09-28 17:47:33,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:34,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 17:47:36,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:47:39,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:47:41,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:47:43,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:45,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:47:45,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:45,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 17:47:46,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 17:47:48,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 17:47:49,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:47:50,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:47:52,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:47:53,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 17:47:56,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:47:56,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:48:00,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=94360.0, ans=0.0 2023-09-28 17:48:02,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:04,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 17:48:04,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 17:48:04,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:06,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:07,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:07,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=94426.66666666667, ans=0.0 2023-09-28 17:48:09,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:11,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=94426.66666666667, ans=0.0 2023-09-28 17:48:12,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 17:48:12,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:48:14,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:48:16,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 17:48:18,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 17:48:21,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:48:22,669 INFO [train.py:1039] (3/4) Epoch 3, batch 3550, loss[loss=0.2713, simple_loss=0.3383, pruned_loss=0.1022, over 24666.00 frames. ], tot_loss[loss=0.2816, simple_loss=0.3305, pruned_loss=0.1163, over 4716448.11 frames. ], batch size: 73, lr: 2.87e-02, grad_scale: 16.0 2023-09-28 17:48:22,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:48:22,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:24,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:28,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 17:48:36,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=94493.33333333333, ans=0.125 2023-09-28 17:48:38,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=15.0 2023-09-28 17:48:39,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:42,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 17:48:43,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:48:45,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 17:48:46,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:48:49,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:48:49,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:48:52,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:48:52,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:48:52,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:48:52,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:48:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:48:54,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=94626.66666666667, ans=0.0 2023-09-28 17:48:59,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:48:59,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 17:49:02,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:02,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:49:04,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:49:04,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 17:49:04,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:05,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:06,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 17:49:08,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=94626.66666666667, ans=0.0 2023-09-28 17:49:12,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:14,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:49:14,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:49:16,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 17:49:17,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:49:19,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 17:49:21,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 17:49:22,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:49:24,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:49:26,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 17:49:27,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:34,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:49:36,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 17:49:36,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:43,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:49:44,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 17:49:46,257 INFO [train.py:1039] (3/4) Epoch 3, batch 3600, loss[loss=0.3048, simple_loss=0.3399, pruned_loss=0.1348, over 23667.00 frames. ], tot_loss[loss=0.2814, simple_loss=0.3307, pruned_loss=0.116, over 4731264.13 frames. ], batch size: 164, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:49:50,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.50 vs. limit=15.0 2023-09-28 17:49:50,962 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.527e+02 2.760e+02 3.413e+02 5.643e+02, threshold=5.521e+02, percent-clipped=0.0 2023-09-28 17:49:51,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 17:49:52,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:49:54,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:49:55,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:55,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:49:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:49:58,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=94826.66666666667, ans=0.125 2023-09-28 17:50:00,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:02,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:02,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:50:04,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:50:04,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:04,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 17:50:04,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=94893.33333333333, ans=0.125 2023-09-28 17:50:09,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:50:10,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:14,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:17,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:19,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 17:50:19,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:50:19,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 17:50:21,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 17:50:21,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:50:22,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 17:50:25,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:27,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:50:29,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:50:30,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 17:50:32,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=94960.0, ans=0.0 2023-09-28 17:50:36,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:50:37,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:50:37,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 17:50:37,875 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:50:42,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:50:48,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:48,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=95026.66666666667, ans=0.5 2023-09-28 17:50:53,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:50:59,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 17:50:59,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:50:59,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 17:51:01,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 17:51:01,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 17:51:03,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:51:03,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=95093.33333333333, ans=0.1 2023-09-28 17:51:04,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:51:06,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 17:51:06,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:07,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:51:07,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:08,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 17:51:09,413 INFO [train.py:1039] (3/4) Epoch 3, batch 3650, loss[loss=0.2684, simple_loss=0.3199, pruned_loss=0.1084, over 23342.00 frames. ], tot_loss[loss=0.2819, simple_loss=0.3315, pruned_loss=0.1161, over 4729556.26 frames. ], batch size: 105, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:51:09,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 17:51:12,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:51:14,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 17:51:19,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 17:51:21,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:51:24,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 17:51:24,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 17:51:29,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:51:29,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 17:51:29,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:51:32,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 17:51:34,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:51:34,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 17:51:36,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 17:51:37,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:51:38,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 17:51:39,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:51:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:51:41,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:41,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:51:44,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 17:51:44,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 17:51:45,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:51:48,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 17:51:49,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:51:49,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:51:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:51:57,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:51:57,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 17:51:59,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 17:52:00,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=95360.0, ans=0.125 2023-09-28 17:52:00,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=95360.0, ans=0.0 2023-09-28 17:52:01,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:52:03,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:52:06,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:08,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:08,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:52:10,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 17:52:12,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:52:12,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:18,764 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 17:52:23,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:23,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:52:24,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=95426.66666666667, ans=0.125 2023-09-28 17:52:25,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 17:52:25,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:26,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 17:52:28,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:30,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 17:52:30,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:31,718 INFO [train.py:1039] (3/4) Epoch 3, batch 3700, loss[loss=0.2608, simple_loss=0.3261, pruned_loss=0.09777, over 24481.00 frames. ], tot_loss[loss=0.2829, simple_loss=0.3324, pruned_loss=0.1167, over 4735736.49 frames. ], batch size: 63, lr: 2.86e-02, grad_scale: 32.0 2023-09-28 17:52:33,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:52:35,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:52:37,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.521e+02 2.916e+02 3.663e+02 5.180e+02, threshold=5.833e+02, percent-clipped=0.0 2023-09-28 17:52:37,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:52:37,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=95493.33333333333, ans=0.2 2023-09-28 17:52:38,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:38,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 17:52:38,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:52:39,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=95493.33333333333, ans=0.0 2023-09-28 17:52:40,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 17:52:40,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 17:52:43,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 17:52:46,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:52:48,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:49,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 17:52:49,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:52:51,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:52:52,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:52:55,077 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 17:53:01,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:53:01,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 17:53:03,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 17:53:04,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 17:53:04,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:08,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:09,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 17:53:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:13,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:53:16,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:18,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:53:21,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 17:53:26,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:53:26,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 17:53:27,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:53:27,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 17:53:31,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=95693.33333333333, ans=0.95 2023-09-28 17:53:33,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:53:33,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:53:36,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:36,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 17:53:39,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:53:39,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 17:53:39,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:39,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:53:40,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.82 vs. limit=15.0 2023-09-28 17:53:41,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=95760.0, ans=0.05 2023-09-28 17:53:45,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:53:45,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 17:53:47,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 17:53:47,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:53:47,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:53:49,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 17:53:51,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:53:52,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=95826.66666666667, ans=0.025 2023-09-28 17:53:54,542 INFO [train.py:1039] (3/4) Epoch 3, batch 3750, loss[loss=0.2502, simple_loss=0.3092, pruned_loss=0.0956, over 24506.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3327, pruned_loss=0.1168, over 4734482.76 frames. ], batch size: 63, lr: 2.85e-02, grad_scale: 32.0 2023-09-28 17:53:54,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:53:54,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:53:55,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=95826.66666666667, ans=0.0 2023-09-28 17:53:57,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:00,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 17:54:00,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 17:54:01,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=95826.66666666667, ans=0.0 2023-09-28 17:54:03,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 17:54:03,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 17:54:04,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.61 vs. limit=22.5 2023-09-28 17:54:05,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:54:07,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:08,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:54:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:14,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:18,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 17:54:18,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:54:20,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:54:23,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:23,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 17:54:25,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:27,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:27,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:54:29,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=95960.0, ans=0.125 2023-09-28 17:54:30,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 17:54:31,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=95960.0, ans=0.1 2023-09-28 17:54:35,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 17:54:37,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:54:37,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:54:39,310 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 17:54:40,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:54:44,738 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.19 vs. limit=15.0 2023-09-28 17:54:45,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:45,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 17:54:50,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 17:54:52,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:54:57,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 17:54:57,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:54:57,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=96026.66666666667, ans=0.09899494936611666 2023-09-28 17:55:00,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 17:55:03,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=96093.33333333333, ans=0.0 2023-09-28 17:55:04,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 17:55:06,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 17:55:09,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 17:55:10,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 17:55:14,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 17:55:17,125 INFO [train.py:1039] (3/4) Epoch 3, batch 3800, loss[loss=0.2809, simple_loss=0.3105, pruned_loss=0.1256, over 23693.00 frames. ], tot_loss[loss=0.2842, simple_loss=0.3328, pruned_loss=0.1177, over 4733870.13 frames. ], batch size: 232, lr: 2.85e-02, grad_scale: 16.0 2023-09-28 17:55:23,802 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.013e+02 2.428e+02 2.901e+02 3.496e+02 5.183e+02, threshold=5.803e+02, percent-clipped=0.0 2023-09-28 17:55:23,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:55:26,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:27,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 17:55:27,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 17:55:28,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:30,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:32,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:55:32,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=96226.66666666667, ans=0.125 2023-09-28 17:55:33,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 17:55:33,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:36,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 17:55:37,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:55:37,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 17:55:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:39,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 17:55:43,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 17:55:43,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:55:47,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:55:47,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=96226.66666666667, ans=0.125 2023-09-28 17:55:49,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 17:55:49,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:55:53,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 17:55:53,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:55:56,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:55:57,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:56:03,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 17:56:03,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 17:56:05,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:12,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=96360.0, ans=0.125 2023-09-28 17:56:17,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:56:20,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 17:56:20,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=96360.0, ans=0.0 2023-09-28 17:56:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 17:56:23,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:25,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:56:25,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:26,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 17:56:30,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 17:56:31,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 17:56:31,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:33,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:56:39,351 INFO [train.py:1039] (3/4) Epoch 3, batch 3850, loss[loss=0.2777, simple_loss=0.3288, pruned_loss=0.1133, over 24318.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3322, pruned_loss=0.1171, over 4723152.76 frames. ], batch size: 61, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:56:39,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:56:41,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 17:56:47,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 17:56:47,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 17:56:48,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:56:48,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:56:49,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=96493.33333333333, ans=0.2 2023-09-28 17:56:50,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.53 vs. limit=15.0 2023-09-28 17:56:53,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 17:56:53,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=96493.33333333333, ans=0.2 2023-09-28 17:56:58,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:56:58,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=96560.0, ans=0.05 2023-09-28 17:56:58,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=96560.0, ans=0.04949747468305833 2023-09-28 17:56:59,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 17:57:01,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 17:57:03,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.22 vs. limit=12.0 2023-09-28 17:57:08,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:09,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:57:11,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:13,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 17:57:16,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:18,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 17:57:18,714 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=15.01 vs. limit=15.0 2023-09-28 17:57:20,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:20,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 17:57:20,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:57:21,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:21,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 17:57:22,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 17:57:23,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 17:57:23,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:57:23,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:26,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:26,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 17:57:30,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 17:57:31,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:33,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 17:57:36,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 17:57:42,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:43,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:57:48,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:57:49,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 17:57:51,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 17:57:53,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:54,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:57:58,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 17:57:58,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 17:57:59,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:01,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:58:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 17:58:02,463 INFO [train.py:1039] (3/4) Epoch 3, batch 3900, loss[loss=0.2571, simple_loss=0.3186, pruned_loss=0.09782, over 24623.00 frames. ], tot_loss[loss=0.2816, simple_loss=0.3315, pruned_loss=0.1158, over 4744470.12 frames. ], batch size: 68, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:58:02,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:58:03,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=96826.66666666667, ans=0.125 2023-09-28 17:58:04,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 17:58:04,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:04,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:07,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 17:58:07,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:09,131 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.036e+02 2.471e+02 2.886e+02 3.509e+02 5.748e+02, threshold=5.772e+02, percent-clipped=0.0 2023-09-28 17:58:09,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:58:10,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 17:58:10,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:58:10,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:10,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 17:58:12,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:15,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:15,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 17:58:17,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:58:20,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 17:58:21,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:25,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 17:58:25,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=96893.33333333333, ans=0.0 2023-09-28 17:58:26,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 17:58:26,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:28,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 17:58:28,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 17:58:29,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 17:58:31,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 17:58:34,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:36,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:58:36,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:58:37,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:58:38,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=96960.0, ans=10.0 2023-09-28 17:58:40,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 17:58:43,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 17:58:45,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 17:58:45,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:58:47,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 17:58:54,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:58:55,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 17:59:03,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 17:59:05,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 17:59:15,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:18,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:18,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 17:59:20,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 17:59:20,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 17:59:21,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 17:59:22,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=97093.33333333333, ans=0.125 2023-09-28 17:59:23,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 17:59:25,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 17:59:27,206 INFO [train.py:1039] (3/4) Epoch 3, batch 3950, loss[loss=0.291, simple_loss=0.3054, pruned_loss=0.1383, over 19415.00 frames. ], tot_loss[loss=0.2803, simple_loss=0.3307, pruned_loss=0.1149, over 4745583.25 frames. ], batch size: 388, lr: 2.84e-02, grad_scale: 16.0 2023-09-28 17:59:33,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 17:59:34,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 17:59:35,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 17:59:38,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 17:59:39,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 17:59:40,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=97160.0, ans=0.2 2023-09-28 17:59:44,561 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 17:59:45,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.62 vs. limit=15.0 2023-09-28 17:59:45,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:46,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 17:59:47,504 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 17:59:47,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 17:59:51,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:51,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 17:59:51,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 17:59:55,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 17:59:57,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 17:59:57,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 17:59:57,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 17:59:58,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 17:59:58,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:00:03,122 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:00:12,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:00:12,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:00:17,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 18:00:23,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 18:00:23,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 18:00:23,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:00:25,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:00:33,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:00:33,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:00:33,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:00:33,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:00:35,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 18:00:41,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:00:42,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:00:46,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 18:00:50,719 INFO [train.py:1039] (3/4) Epoch 3, batch 4000, loss[loss=0.269, simple_loss=0.3349, pruned_loss=0.1015, over 24324.00 frames. ], tot_loss[loss=0.2809, simple_loss=0.3317, pruned_loss=0.1151, over 4748312.84 frames. ], batch size: 74, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:00:55,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:00:56,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.085e+02 2.653e+02 3.032e+02 3.720e+02 5.555e+02, threshold=6.065e+02, percent-clipped=0.0 2023-09-28 18:01:03,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:08,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.25 vs. limit=15.0 2023-09-28 18:01:09,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:01:10,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:01:10,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 18:01:11,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=97560.0, ans=0.125 2023-09-28 18:01:13,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:01:13,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 18:01:13,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:01:14,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 18:01:16,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:01:19,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:01:19,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=97560.0, ans=0.0 2023-09-28 18:01:20,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:01:20,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:01:20,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:20,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:01:22,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:01:24,474 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 18:01:25,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:01:27,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:29,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=97626.66666666667, ans=0.04949747468305833 2023-09-28 18:01:30,342 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 18:01:31,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:01:31,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:38,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 18:01:38,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:01:41,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:01:41,648 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 18:01:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:01:43,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 18:01:43,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:01:43,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=97693.33333333333, ans=0.125 2023-09-28 18:01:46,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:47,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:01:48,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=97693.33333333333, ans=0.125 2023-09-28 18:01:49,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:01:49,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:01:49,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:01:49,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 18:01:50,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:01:52,577 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 18:01:55,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=12.58 vs. limit=15.0 2023-09-28 18:01:58,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:02:03,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 18:02:05,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:02:06,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:06,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:02:08,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:08,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=97760.0, ans=0.125 2023-09-28 18:02:12,360 INFO [train.py:1039] (3/4) Epoch 3, batch 4050, loss[loss=0.2565, simple_loss=0.3064, pruned_loss=0.1033, over 23183.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3329, pruned_loss=0.1167, over 4730354.91 frames. ], batch size: 119, lr: 2.83e-02, grad_scale: 32.0 2023-09-28 18:02:16,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:02:19,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:02:19,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 18:02:20,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:02:23,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:02:24,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:02:24,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:27,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:30,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:02:31,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:02:32,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 18:02:34,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:02:34,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=97893.33333333333, ans=0.0 2023-09-28 18:02:35,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:02:39,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:02:39,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=97893.33333333333, ans=0.1 2023-09-28 18:02:40,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:02:41,743 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.35 vs. limit=15.0 2023-09-28 18:02:43,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:02:45,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 18:02:45,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 18:02:47,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=97960.0, ans=0.025 2023-09-28 18:02:50,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:02:55,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.12 vs. limit=15.0 2023-09-28 18:02:56,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=97960.0, ans=0.125 2023-09-28 18:02:57,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 18:02:59,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:02:59,755 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:03:01,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:03,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.71 vs. limit=22.5 2023-09-28 18:03:04,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:03:05,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:03:05,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:03:08,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:03:12,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 18:03:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:03:13,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:15,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 18:03:18,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:03:19,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=10.11 vs. limit=15.0 2023-09-28 18:03:25,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 18:03:27,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:03:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:03:28,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 18:03:30,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 18:03:30,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:03:33,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:33,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:03:35,370 INFO [train.py:1039] (3/4) Epoch 3, batch 4100, loss[loss=0.2757, simple_loss=0.3261, pruned_loss=0.1127, over 23518.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3333, pruned_loss=0.1165, over 4737138.83 frames. ], batch size: 149, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:03:42,052 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.855e+02 2.385e+02 2.703e+02 3.359e+02 5.329e+02, threshold=5.406e+02, percent-clipped=0.0 2023-09-28 18:03:43,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 18:03:45,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 18:03:46,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.76 vs. limit=22.5 2023-09-28 18:03:48,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 18:03:49,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 18:03:49,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:49,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:49,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:03:51,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:03:52,748 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 18:03:55,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:03:56,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:03:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:03:58,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:04:01,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=98226.66666666667, ans=0.0 2023-09-28 18:04:02,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:04:02,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:04:02,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:04:04,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 18:04:05,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:05,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:04:05,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:05,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:04:06,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 18:04:09,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:11,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 18:04:12,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:04:16,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:04:16,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 18:04:18,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:04:18,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:04:18,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:04:19,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 18:04:22,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:04:24,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:04:25,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 18:04:27,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:04:27,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:29,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:34,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=98360.0, ans=0.0 2023-09-28 18:04:35,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:04:39,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:39,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:04:44,373 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.24 vs. limit=15.0 2023-09-28 18:04:48,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:04:48,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:04:51,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:04:53,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:04:57,940 INFO [train.py:1039] (3/4) Epoch 3, batch 4150, loss[loss=0.2513, simple_loss=0.3047, pruned_loss=0.09889, over 24625.00 frames. ], tot_loss[loss=0.2831, simple_loss=0.3331, pruned_loss=0.1166, over 4728651.46 frames. ], batch size: 60, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:04:58,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:04:59,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:04:59,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:04:59,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 18:05:04,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 18:05:07,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 18:05:08,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=98493.33333333333, ans=22.5 2023-09-28 18:05:08,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.67 vs. limit=22.5 2023-09-28 18:05:08,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 18:05:10,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:05:14,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:05:15,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:17,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.60 vs. limit=22.5 2023-09-28 18:05:18,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:19,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:05:19,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:05:21,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:05:21,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:05:23,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:05:27,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:05:32,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:33,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 18:05:35,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 18:05:36,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:05:36,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 18:05:36,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:05:36,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:05:42,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:05:42,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:05:46,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 18:05:46,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=98693.33333333333, ans=0.1 2023-09-28 18:05:49,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.55 vs. limit=22.5 2023-09-28 18:05:50,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:05:50,584 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:05:51,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:05:54,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 18:05:54,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:05:57,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 18:05:57,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:05:58,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:06:00,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:01,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 18:06:01,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:01,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:06:03,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:06:06,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 18:06:06,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:06,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:06:06,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:06:08,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 18:06:08,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:06:08,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 18:06:09,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:06:09,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=98760.0, ans=0.125 2023-09-28 18:06:11,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:06:11,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 18:06:13,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:06:16,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=98760.0, ans=0.125 2023-09-28 18:06:17,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:06:18,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=98826.66666666667, ans=0.05 2023-09-28 18:06:19,353 INFO [train.py:1039] (3/4) Epoch 3, batch 4200, loss[loss=0.2764, simple_loss=0.3334, pruned_loss=0.1096, over 24536.00 frames. ], tot_loss[loss=0.2813, simple_loss=0.3313, pruned_loss=0.1156, over 4717487.26 frames. ], batch size: 71, lr: 2.82e-02, grad_scale: 32.0 2023-09-28 18:06:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 18:06:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:06:21,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:24,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:06:24,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:24,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:06:26,709 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.537e+02 2.926e+02 3.391e+02 4.648e+02, threshold=5.852e+02, percent-clipped=0.0 2023-09-28 18:06:26,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 18:06:28,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 18:06:28,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=98826.66666666667, ans=0.125 2023-09-28 18:06:30,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:33,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:36,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:06:37,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:06:39,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:06:40,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:42,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 18:06:42,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:06:42,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=98893.33333333333, ans=0.1 2023-09-28 18:06:44,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:44,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:06:44,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:06:45,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:06:48,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 18:06:48,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:06:52,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=98960.0, ans=0.125 2023-09-28 18:06:54,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=98960.0, ans=0.1 2023-09-28 18:06:56,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:06:57,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:06:59,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:07:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:04,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=98960.0, ans=0.125 2023-09-28 18:07:05,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:07:05,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 18:07:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:07,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:07:07,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=99026.66666666667, ans=0.125 2023-09-28 18:07:12,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:07:13,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:20,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:07:21,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 18:07:25,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:07:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:07:31,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:33,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 18:07:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:07:41,878 INFO [train.py:1039] (3/4) Epoch 3, batch 4250, loss[loss=0.2851, simple_loss=0.3315, pruned_loss=0.1193, over 23620.00 frames. ], tot_loss[loss=0.2807, simple_loss=0.33, pruned_loss=0.1157, over 4710990.11 frames. ], batch size: 135, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:07:45,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:07:45,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:07:46,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:07:51,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:07:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 18:07:52,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:07:54,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=99160.0, ans=0.035 2023-09-28 18:07:56,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:00,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:01,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.17 vs. limit=15.0 2023-09-28 18:08:02,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=99226.66666666667, ans=0.125 2023-09-28 18:08:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:05,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:08,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:08:08,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:08,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:10,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:12,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:15,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:08:16,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:18,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 18:08:21,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 18:08:21,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:22,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:08:22,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:08:24,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:08:24,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:24,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:08:27,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:08:27,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:08:33,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:08:33,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=99360.0, ans=0.1 2023-09-28 18:08:35,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:35,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 18:08:35,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:08:37,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 18:08:40,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:08:41,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:08:44,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:44,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:08:46,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 18:08:46,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:08:48,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:08:52,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:08:55,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:08:57,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:08:58,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:02,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:03,674 INFO [train.py:1039] (3/4) Epoch 3, batch 4300, loss[loss=0.2617, simple_loss=0.3182, pruned_loss=0.1026, over 24464.00 frames. ], tot_loss[loss=0.2797, simple_loss=0.3288, pruned_loss=0.1153, over 4705955.49 frames. ], batch size: 63, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:09:03,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:09:05,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:05,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 18:09:07,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:12,363 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.893e+02 2.623e+02 3.036e+02 3.611e+02 5.200e+02, threshold=6.071e+02, percent-clipped=0.0 2023-09-28 18:09:12,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:09:12,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:12,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=99493.33333333333, ans=0.0 2023-09-28 18:09:17,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:09:23,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:09:23,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 18:09:25,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:09:28,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:09:28,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:09:28,511 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 18:09:31,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:09:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:09:36,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 18:09:36,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:09:36,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 18:09:37,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=99626.66666666667, ans=0.125 2023-09-28 18:09:40,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:09:41,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:09:47,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:09:47,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:09:47,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:09:47,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=99626.66666666667, ans=0.125 2023-09-28 18:09:48,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:50,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:09:50,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 18:09:51,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 18:09:53,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:09:55,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:55,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:09:55,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:09:56,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:09:56,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 18:09:56,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 18:09:56,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 18:09:58,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:09:58,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 18:10:00,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 18:10:03,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:03,558 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 18:10:04,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:10:06,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:06,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:10:10,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 18:10:10,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:10:10,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:10,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:10,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:10,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:10:12,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:10:14,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=99760.0, ans=0.125 2023-09-28 18:10:15,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:15,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=99760.0, ans=0.0 2023-09-28 18:10:16,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:10:16,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:10:23,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 18:10:23,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:10:26,387 INFO [train.py:1039] (3/4) Epoch 3, batch 4350, loss[loss=0.3634, simple_loss=0.3765, pruned_loss=0.1752, over 19523.00 frames. ], tot_loss[loss=0.282, simple_loss=0.3305, pruned_loss=0.1168, over 4684597.37 frames. ], batch size: 388, lr: 2.81e-02, grad_scale: 16.0 2023-09-28 18:10:29,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:10:31,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:31,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=99826.66666666667, ans=0.125 2023-09-28 18:10:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:10:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:10:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:10:44,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:10:47,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:10:47,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:10:48,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:10:53,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:10:55,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:11:01,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 18:11:02,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:03,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:04,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=15.0 2023-09-28 18:11:04,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=99960.0, ans=0.1 2023-09-28 18:11:08,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:09,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.61 vs. limit=15.0 2023-09-28 18:11:10,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 18:11:15,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:18,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:11:20,841 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 18:11:21,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:22,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:11:24,103 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 18:11:24,215 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 18:11:24,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:24,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:11:25,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:11:27,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:11:29,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:11:29,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:11:32,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 18:11:32,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:32,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:33,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:33,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 18:11:35,381 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 18:11:35,388 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 18:11:35,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 18:11:38,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:11:38,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:11:39,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:11:39,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:11:41,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 18:11:45,073 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 18:11:45,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:49,536 INFO [train.py:1039] (3/4) Epoch 3, batch 4400, loss[loss=0.3092, simple_loss=0.3418, pruned_loss=0.1382, over 23815.00 frames. ], tot_loss[loss=0.2831, simple_loss=0.3313, pruned_loss=0.1175, over 4675902.08 frames. ], batch size: 195, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:11:50,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:11:50,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:11:51,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:11:56,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 18:11:56,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 18:11:56,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 18:11:56,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 18:11:57,544 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.556e+02 3.170e+02 3.495e+02 5.491e+02, threshold=6.340e+02, percent-clipped=0.0 2023-09-28 18:11:57,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:11:57,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:11:59,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=100160.0, ans=0.2 2023-09-28 18:12:01,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 18:12:01,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:04,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 18:12:07,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:07,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 18:12:07,639 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 18:12:07,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=100226.66666666667, ans=0.125 2023-09-28 18:12:10,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 18:12:10,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=100226.66666666667, ans=0.125 2023-09-28 18:12:12,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 18:12:12,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 18:12:12,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:12:15,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:16,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 18:12:16,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 18:12:16,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:20,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:12:20,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:12:20,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:21,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:12:21,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 18:12:23,890 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 18:12:29,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:12:37,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:12:40,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 18:12:44,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:12:47,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:12:49,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:12:50,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 18:12:51,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:12:51,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:12:51,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:12:52,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:12:57,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 18:13:02,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 18:13:02,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 18:13:02,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:04,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 18:13:04,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:13:07,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:13:09,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 18:13:10,916 INFO [train.py:1039] (3/4) Epoch 3, batch 4450, loss[loss=0.3051, simple_loss=0.3415, pruned_loss=0.1344, over 23777.00 frames. ], tot_loss[loss=0.2832, simple_loss=0.3316, pruned_loss=0.1175, over 4688758.88 frames. ], batch size: 135, lr: 2.80e-02, grad_scale: 32.0 2023-09-28 18:13:12,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:13:16,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:16,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:13:23,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:23,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:13:26,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:28,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:13:28,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:13:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:30,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 18:13:30,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:31,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=100560.0, ans=0.0 2023-09-28 18:13:32,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:13:32,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:13:32,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:13:35,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:13:41,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten.whitening_limit, batch_count=100560.0, ans=15.0 2023-09-28 18:13:42,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:43,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:13:45,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:13:45,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:13:47,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:13:47,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=100626.66666666667, ans=0.1 2023-09-28 18:13:49,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=100626.66666666667, ans=0.125 2023-09-28 18:13:52,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:13:53,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 18:13:53,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 18:13:53,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:13:55,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:13:57,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 18:14:01,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:14:04,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 18:14:04,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:04,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:05,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:14:05,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:14:07,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:14:10,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:14:10,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 18:14:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:14:14,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:14:17,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:14:18,422 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=22.22 vs. limit=22.5 2023-09-28 18:14:19,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:19,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:14:21,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:14:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 18:14:27,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:14:31,869 INFO [train.py:1039] (3/4) Epoch 3, batch 4500, loss[loss=0.2771, simple_loss=0.3209, pruned_loss=0.1167, over 23683.00 frames. ], tot_loss[loss=0.2838, simple_loss=0.332, pruned_loss=0.1179, over 4673857.56 frames. ], batch size: 149, lr: 2.79e-02, grad_scale: 32.0 2023-09-28 18:14:33,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:33,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=100826.66666666667, ans=0.125 2023-09-28 18:14:34,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 18:14:34,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 18:14:36,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:40,300 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.868e+02 2.564e+02 2.888e+02 3.333e+02 4.958e+02, threshold=5.777e+02, percent-clipped=0.0 2023-09-28 18:14:40,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:14:42,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:14:42,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:14:44,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:14:44,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:45,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:14:59,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:14:59,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:15:03,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:04,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:15:05,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:15:12,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:15:17,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:15:21,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:15:22,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=101026.66666666667, ans=0.1 2023-09-28 18:15:26,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:15:26,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 18:15:26,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:27,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:15:29,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:15:32,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:15:32,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 18:15:32,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:15:34,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:37,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:15:38,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:15:40,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:15:43,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:15:43,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:15:45,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 18:15:48,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 18:15:48,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 18:15:53,574 INFO [train.py:1039] (3/4) Epoch 3, batch 4550, loss[loss=0.2947, simple_loss=0.3256, pruned_loss=0.1319, over 23993.00 frames. ], tot_loss[loss=0.2808, simple_loss=0.3299, pruned_loss=0.1158, over 4699860.11 frames. ], batch size: 196, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:15:53,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 18:15:55,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 18:15:56,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:15:58,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:15:59,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:16:03,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:08,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:16:11,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:16:12,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:12,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:16:12,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:15,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:15,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:16:19,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:22,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 18:16:22,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 18:16:24,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:16:26,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 18:16:29,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=101293.33333333333, ans=0.0 2023-09-28 18:16:30,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 18:16:30,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:33,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 18:16:36,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:16:39,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:39,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:16:42,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 18:16:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:16:47,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:47,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:16:49,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:50,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 18:16:52,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 18:16:52,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:16:53,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 18:16:57,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 18:16:57,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:16:58,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:16:59,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:16:59,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:16:59,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:17:01,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:17:01,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 18:17:04,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:17:04,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:17:05,030 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=7.89 vs. limit=12.0 2023-09-28 18:17:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 18:17:05,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:17:05,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 18:17:08,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:17:08,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:17:10,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:17:10,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:17:10,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:17:12,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:17:15,574 INFO [train.py:1039] (3/4) Epoch 3, batch 4600, loss[loss=0.2792, simple_loss=0.3344, pruned_loss=0.112, over 23975.00 frames. ], tot_loss[loss=0.2794, simple_loss=0.3287, pruned_loss=0.115, over 4700085.43 frames. ], batch size: 80, lr: 2.79e-02, grad_scale: 16.0 2023-09-28 18:17:15,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:17:17,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:20,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:17:20,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=101493.33333333333, ans=0.0 2023-09-28 18:17:23,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:17:23,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:17:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:24,709 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.433e+02 2.837e+02 3.221e+02 4.908e+02, threshold=5.674e+02, percent-clipped=0.0 2023-09-28 18:17:24,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 18:17:27,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=22.05 vs. limit=15.0 2023-09-28 18:17:28,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:17:32,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:17:32,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=101560.0, ans=0.2 2023-09-28 18:17:34,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:37,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:42,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 18:17:43,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:45,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:17:48,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:17:48,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:17:50,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=101626.66666666667, ans=0.09899494936611666 2023-09-28 18:17:55,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 18:17:55,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:17:55,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:02,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:03,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:18:05,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:18:08,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=101693.33333333333, ans=0.0 2023-09-28 18:18:09,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 18:18:10,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:18:15,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:16,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:18:18,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:18,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 18:18:18,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:19,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 18:18:20,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:20,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:21,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:18:23,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:18:25,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:25,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 18:18:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 18:18:26,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 18:18:26,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:28,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:29,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:29,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:18:38,786 INFO [train.py:1039] (3/4) Epoch 3, batch 4650, loss[loss=0.292, simple_loss=0.3316, pruned_loss=0.1262, over 23641.00 frames. ], tot_loss[loss=0.2778, simple_loss=0.3269, pruned_loss=0.1143, over 4678360.06 frames. ], batch size: 232, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:18:39,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.90 vs. limit=15.0 2023-09-28 18:18:40,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=101826.66666666667, ans=0.0 2023-09-28 18:18:41,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:18:45,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:45,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:47,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:18:47,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:18:47,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:18:48,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:18:52,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 18:18:56,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:18:58,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 18:18:58,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:18:59,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 18:19:00,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:19:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 18:19:01,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 18:19:02,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:02,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:19:05,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:19:07,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:07,445 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 18:19:09,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=101960.0, ans=0.125 2023-09-28 18:19:11,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:12,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 18:19:16,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:16,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:19:17,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 18:19:19,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:19:22,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:19:25,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:26,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=102026.66666666667, ans=0.125 2023-09-28 18:19:26,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.75 vs. limit=15.0 2023-09-28 18:19:27,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=102026.66666666667, ans=0.0 2023-09-28 18:19:30,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:32,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.86 vs. limit=6.0 2023-09-28 18:19:34,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:19:34,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:19:35,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:19:38,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 18:19:40,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 18:19:41,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.55 vs. limit=15.0 2023-09-28 18:19:41,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 18:19:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 18:19:43,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:19:52,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:19:52,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:19:52,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 18:19:52,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:19:53,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:53,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:19:56,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:19:57,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:19:57,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:19:57,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:20:00,512 INFO [train.py:1039] (3/4) Epoch 3, batch 4700, loss[loss=0.2949, simple_loss=0.3334, pruned_loss=0.1283, over 22930.00 frames. ], tot_loss[loss=0.2788, simple_loss=0.3287, pruned_loss=0.1144, over 4691290.58 frames. ], batch size: 322, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:20:02,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=102160.0, ans=0.125 2023-09-28 18:20:03,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:05,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:20:05,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:20:05,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:20:06,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:20:06,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 18:20:10,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.77 vs. limit=12.0 2023-09-28 18:20:10,642 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.707e+02 3.161e+02 3.958e+02 7.246e+02, threshold=6.322e+02, percent-clipped=4.0 2023-09-28 18:20:14,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:14,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:20:14,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:20:15,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:18,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:20:24,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 18:20:24,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 18:20:25,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:27,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=102226.66666666667, ans=0.2 2023-09-28 18:20:29,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:20:29,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:20:32,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:20:39,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:20:41,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 18:20:42,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:20:48,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 18:20:50,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:20:53,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:20:57,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 18:20:59,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:20:59,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=102360.0, ans=0.125 2023-09-28 18:21:04,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:21:04,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 18:21:04,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=102360.0, ans=0.0 2023-09-28 18:21:06,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:06,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:09,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:21:10,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:21:10,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 18:21:12,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 18:21:13,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:15,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:15,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 18:21:18,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:21:21,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 18:21:23,341 INFO [train.py:1039] (3/4) Epoch 3, batch 4750, loss[loss=0.2663, simple_loss=0.3352, pruned_loss=0.09869, over 24011.00 frames. ], tot_loss[loss=0.2802, simple_loss=0.3304, pruned_loss=0.115, over 4690750.31 frames. ], batch size: 80, lr: 2.78e-02, grad_scale: 16.0 2023-09-28 18:21:23,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:21:24,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:28,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=12.0 2023-09-28 18:21:30,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:30,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:21:33,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 18:21:33,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:21:35,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 18:21:38,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:21:39,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:21:39,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:42,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=102560.0, ans=0.1 2023-09-28 18:21:45,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 18:21:50,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:21:52,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 18:21:53,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:21:59,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:21:59,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:21:59,668 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 18:21:59,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 18:22:05,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 18:22:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:10,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:22:13,942 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 18:22:13,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:15,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:22:18,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:22:20,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 18:22:20,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 18:22:20,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:22:21,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:22:21,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:22:23,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:22:23,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 18:22:25,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 18:22:26,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=102693.33333333333, ans=0.0 2023-09-28 18:22:27,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=102693.33333333333, ans=0.0 2023-09-28 18:22:28,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:31,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:22:31,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 18:22:33,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:22:33,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:22:34,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:22:36,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:37,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:22:41,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:41,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 18:22:43,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 18:22:44,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 18:22:46,133 INFO [train.py:1039] (3/4) Epoch 3, batch 4800, loss[loss=0.2358, simple_loss=0.2971, pruned_loss=0.08728, over 24344.00 frames. ], tot_loss[loss=0.2794, simple_loss=0.3299, pruned_loss=0.1145, over 4708594.89 frames. ], batch size: 56, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:22:48,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:22:48,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:22:48,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=102826.66666666667, ans=0.0 2023-09-28 18:22:49,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 18:22:55,847 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.876e+02 2.499e+02 2.983e+02 3.709e+02 7.262e+02, threshold=5.966e+02, percent-clipped=1.0 2023-09-28 18:22:55,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:22:56,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:22:57,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=102826.66666666667, ans=0.07 2023-09-28 18:23:02,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:23:04,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:04,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:04,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 18:23:05,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:23:07,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:23:07,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:23:09,335 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:23:13,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:17,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:23:17,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:17,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 18:23:17,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:19,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:22,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:23:22,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:23:24,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:23:25,062 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 18:23:26,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:23:27,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:29,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 18:23:29,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 18:23:32,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:32,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:23:32,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:23:32,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:34,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:23:35,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:23:35,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:23:40,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:41,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:44,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:23:48,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 18:23:49,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:23:49,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:49,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:23:52,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:23:55,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:23:57,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:23:57,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:23:57,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:23:58,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:24:00,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:24:03,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:03,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:03,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:24:05,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 18:24:08,458 INFO [train.py:1039] (3/4) Epoch 3, batch 4850, loss[loss=0.2816, simple_loss=0.343, pruned_loss=0.1101, over 24480.00 frames. ], tot_loss[loss=0.2797, simple_loss=0.3302, pruned_loss=0.1146, over 4713639.18 frames. ], batch size: 66, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:24:08,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 18:24:08,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:08,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:24:10,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:10,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:13,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:24:13,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=103160.0, ans=0.125 2023-09-28 18:24:21,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 18:24:21,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:28,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:24:28,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:24:32,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:24:33,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:24:35,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:24:35,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 18:24:39,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:24:40,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=103293.33333333333, ans=0.125 2023-09-28 18:24:42,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:24:42,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:24:42,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:24:42,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 18:24:45,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=103293.33333333333, ans=0.125 2023-09-28 18:24:46,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:24:46,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:24:49,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 18:24:51,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 18:24:51,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:24:53,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.80 vs. limit=15.0 2023-09-28 18:24:59,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:25:00,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 18:25:00,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:25:00,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:25:03,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:25:05,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 18:25:05,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:08,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 18:25:08,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:09,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:11,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 18:25:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:25:28,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:25:28,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:29,919 INFO [train.py:1039] (3/4) Epoch 3, batch 4900, loss[loss=0.2723, simple_loss=0.3248, pruned_loss=0.1099, over 23653.00 frames. ], tot_loss[loss=0.2789, simple_loss=0.3289, pruned_loss=0.1144, over 4709482.20 frames. ], batch size: 85, lr: 2.77e-02, grad_scale: 32.0 2023-09-28 18:25:34,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=103493.33333333333, ans=15.0 2023-09-28 18:25:35,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 18:25:35,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:25:40,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.035e+02 2.465e+02 2.992e+02 4.302e+02 8.236e+02, threshold=5.984e+02, percent-clipped=6.0 2023-09-28 18:25:40,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:25:42,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:42,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:25:45,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 18:25:50,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 18:25:54,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 18:25:55,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 18:25:55,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:25:57,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:25:57,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:25:57,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:25:57,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:25:57,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 18:26:00,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 18:26:01,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:26:03,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:26:04,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:26:05,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=103626.66666666667, ans=0.0 2023-09-28 18:26:06,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:26:08,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:08,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:08,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 18:26:10,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:26:12,339 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.13 vs. limit=22.5 2023-09-28 18:26:13,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:26:13,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 18:26:13,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 18:26:18,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 18:26:20,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:26:21,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:26:23,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:26:23,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:26:23,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:26:23,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:26:24,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 18:26:27,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:30,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:26:30,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:26:32,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=103693.33333333333, ans=0.2 2023-09-28 18:26:34,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 18:26:35,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:26:35,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:26:35,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 18:26:44,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:45,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:26:46,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=103760.0, ans=0.125 2023-09-28 18:26:47,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 18:26:47,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:26:47,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:26:51,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:26:53,069 INFO [train.py:1039] (3/4) Epoch 3, batch 4950, loss[loss=0.2706, simple_loss=0.3376, pruned_loss=0.1018, over 24558.00 frames. ], tot_loss[loss=0.2771, simple_loss=0.3279, pruned_loss=0.1131, over 4722110.95 frames. ], batch size: 71, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:26:54,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:26:54,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:26:56,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:26:56,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 18:26:57,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:27:00,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:00,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 18:27:04,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 18:27:04,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 18:27:04,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=103826.66666666667, ans=0.1 2023-09-28 18:27:05,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:27:05,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 18:27:05,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:06,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:27:06,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:27:07,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:08,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:10,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:27:10,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=103893.33333333333, ans=0.125 2023-09-28 18:27:11,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:27:13,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=103893.33333333333, ans=0.0 2023-09-28 18:27:13,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.19 vs. limit=22.5 2023-09-28 18:27:14,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:27:14,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:14,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:27:19,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:27:25,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:25,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:27:27,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:27,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:28,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:27:30,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 18:27:31,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 18:27:34,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:37,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:27:37,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:27:39,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:27:39,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:27:40,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:27:41,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=104026.66666666667, ans=0.5 2023-09-28 18:27:42,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:27:44,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:27:45,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:27:50,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:27:50,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:27:52,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 18:27:52,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:27:53,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:27:56,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:27:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:27:59,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:28:01,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:01,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:28:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:28:04,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:28:04,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:28:04,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:28:05,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 18:28:09,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:13,298 INFO [train.py:1039] (3/4) Epoch 3, batch 5000, loss[loss=0.2882, simple_loss=0.3006, pruned_loss=0.1379, over 19014.00 frames. ], tot_loss[loss=0.2757, simple_loss=0.326, pruned_loss=0.1127, over 4711277.12 frames. ], batch size: 389, lr: 2.76e-02, grad_scale: 32.0 2023-09-28 18:28:15,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 18:28:15,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:28:22,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:28:23,505 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.857e+02 2.486e+02 2.809e+02 3.764e+02 5.780e+02, threshold=5.617e+02, percent-clipped=0.0 2023-09-28 18:28:23,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:25,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 18:28:25,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 18:28:26,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:28:29,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 18:28:29,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:28:30,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:28:31,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 18:28:31,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:31,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=104226.66666666667, ans=0.125 2023-09-28 18:28:33,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:28:33,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 18:28:33,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:34,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:28:35,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 18:28:35,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 18:28:36,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:28:38,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 18:28:38,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:28:38,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:39,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:28:39,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 18:28:39,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 18:28:41,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 18:28:41,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:28:41,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:44,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 18:28:44,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:28:45,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:28:46,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:28:48,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:28:51,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 18:28:51,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:28:52,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:28:57,422 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 18:29:00,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:29:02,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:29:02,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:03,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 18:29:03,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:29:03,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:06,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:07,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 18:29:09,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:11,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:29:13,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:13,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=104360.0, ans=0.125 2023-09-28 18:29:19,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 18:29:24,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:24,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=104426.66666666667, ans=0.0 2023-09-28 18:29:30,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=104426.66666666667, ans=0.1 2023-09-28 18:29:33,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:29:34,961 INFO [train.py:1039] (3/4) Epoch 3, batch 5050, loss[loss=0.2869, simple_loss=0.3296, pruned_loss=0.1221, over 23749.00 frames. ], tot_loss[loss=0.2772, simple_loss=0.3272, pruned_loss=0.1135, over 4703838.80 frames. ], batch size: 212, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:29:35,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:35,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:29:35,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:36,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:29:36,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:29:37,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:29:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 18:29:42,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:29:45,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:29:47,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:29:48,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 18:29:48,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:29:50,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:29:53,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:29:53,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:29:54,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:30:04,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 18:30:04,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:30:06,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:06,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 18:30:06,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:08,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:09,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:10,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:30:10,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 18:30:11,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 18:30:13,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:16,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:16,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=104626.66666666667, ans=0.125 2023-09-28 18:30:18,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=104626.66666666667, ans=10.0 2023-09-28 18:30:19,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:30:20,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 18:30:21,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:24,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 18:30:27,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:30:27,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:30:27,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:28,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=104693.33333333333, ans=0.0 2023-09-28 18:30:29,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:30:32,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:30:33,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:30:35,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:35,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:30:37,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:30:37,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 18:30:38,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:30:40,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:30:43,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:30:43,636 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 18:30:43,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:30:45,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:30:45,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:45,844 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 18:30:48,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:30:48,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 18:30:48,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:52,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:30:52,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:30:54,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 18:30:55,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 18:30:57,718 INFO [train.py:1039] (3/4) Epoch 3, batch 5100, loss[loss=0.2384, simple_loss=0.297, pruned_loss=0.08991, over 24324.00 frames. ], tot_loss[loss=0.2768, simple_loss=0.3274, pruned_loss=0.113, over 4715092.35 frames. ], batch size: 56, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:30:57,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:30:57,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:30:59,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:31:01,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=104826.66666666667, ans=0.1 2023-09-28 18:31:02,321 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 18:31:03,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:31:06,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.935e+02 2.697e+02 3.242e+02 4.082e+02 8.790e+02, threshold=6.484e+02, percent-clipped=7.0 2023-09-28 18:31:06,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 18:31:07,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=104826.66666666667, ans=0.07 2023-09-28 18:31:08,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 18:31:10,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:12,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:31:15,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:31:16,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 18:31:16,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 18:31:20,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:31:22,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:31:23,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=104893.33333333333, ans=0.125 2023-09-28 18:31:25,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:31:28,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 18:31:28,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:31,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:31:31,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 18:31:31,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.24 vs. limit=15.0 2023-09-28 18:31:32,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:34,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 18:31:37,105 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 18:31:37,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:31:37,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 18:31:37,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 18:31:42,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:31:51,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:31:53,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 18:31:53,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 18:31:55,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 18:31:56,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 18:31:56,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:32:00,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 18:32:04,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.63 vs. limit=15.0 2023-09-28 18:32:06,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 18:32:09,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 18:32:11,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:32:12,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 18:32:14,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:32:14,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 18:32:17,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=105160.0, ans=0.1 2023-09-28 18:32:18,907 INFO [train.py:1039] (3/4) Epoch 3, batch 5150, loss[loss=0.2712, simple_loss=0.3356, pruned_loss=0.1034, over 24549.00 frames. ], tot_loss[loss=0.2778, simple_loss=0.3283, pruned_loss=0.1137, over 4717877.57 frames. ], batch size: 71, lr: 2.75e-02, grad_scale: 32.0 2023-09-28 18:32:22,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:32:22,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:32:22,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:32:24,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:32:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:32:24,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:32:25,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 18:32:25,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 18:32:27,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 18:32:27,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:32:27,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 18:32:29,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:29,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:32:30,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:32,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:32:37,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:32:37,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 18:32:40,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:32:40,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:32:41,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.06 vs. limit=15.0 2023-09-28 18:32:42,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:32:42,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:32:42,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:32:44,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:32:44,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:32:44,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 18:32:46,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:32:46,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:32:49,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:32:51,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 18:32:54,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:33:00,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:33:04,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 18:33:07,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:14,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:14,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:17,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:19,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:21,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 18:33:21,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=105360.0, ans=0.0 2023-09-28 18:33:22,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=105360.0, ans=0.0 2023-09-28 18:33:25,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:33:25,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:33:27,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:33:30,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:33:30,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:33:32,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 18:33:37,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:33:39,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:33:42,275 INFO [train.py:1039] (3/4) Epoch 3, batch 5200, loss[loss=0.3171, simple_loss=0.3622, pruned_loss=0.136, over 23467.00 frames. ], tot_loss[loss=0.2784, simple_loss=0.3292, pruned_loss=0.1138, over 4708192.50 frames. ], batch size: 93, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:33:42,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:33:42,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:33:43,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:33:43,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:33:43,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:33:43,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:33:47,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:33:49,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:33:51,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=105493.33333333333, ans=0.1 2023-09-28 18:33:52,545 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.485e+02 2.931e+02 3.472e+02 7.408e+02, threshold=5.863e+02, percent-clipped=1.0 2023-09-28 18:33:52,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:33:55,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 18:33:57,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:33:57,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:00,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:00,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:34:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 18:34:07,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:34:09,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:12,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 18:34:13,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:34:13,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:34:15,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 18:34:15,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 18:34:18,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=105626.66666666667, ans=0.125 2023-09-28 18:34:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 18:34:21,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:21,717 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 18:34:21,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:34:21,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:23,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:34:23,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 18:34:24,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:34:27,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:30,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 18:34:30,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 18:34:30,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 18:34:35,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 18:34:37,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:34:41,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:34:43,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:34:43,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 18:34:43,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:34:44,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 18:34:44,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:45,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:34:48,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:34:50,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:34:54,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:34:55,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:34:55,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:34:56,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=105760.0, ans=0.2 2023-09-28 18:35:02,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:04,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 18:35:04,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:35:05,441 INFO [train.py:1039] (3/4) Epoch 3, batch 5250, loss[loss=0.2774, simple_loss=0.3229, pruned_loss=0.1159, over 23679.00 frames. ], tot_loss[loss=0.2791, simple_loss=0.3287, pruned_loss=0.1148, over 4687699.56 frames. ], batch size: 135, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:35:05,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:35:05,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:07,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:35:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:35:12,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:35:14,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:14,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:35:14,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.97 vs. limit=22.5 2023-09-28 18:35:15,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:35:21,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:35:22,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=16.37 vs. limit=22.5 2023-09-28 18:35:24,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:35:27,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:35:30,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:35:32,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 18:35:32,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:35:32,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:35:32,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=105893.33333333333, ans=0.2 2023-09-28 18:35:40,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=105960.0, ans=0.2 2023-09-28 18:35:48,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=105960.0, ans=0.0 2023-09-28 18:36:09,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=106093.33333333333, ans=0.1 2023-09-28 18:36:13,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=106093.33333333333, ans=0.125 2023-09-28 18:36:20,357 INFO [train.py:1039] (3/4) Epoch 3, batch 5300, loss[loss=0.2596, simple_loss=0.3013, pruned_loss=0.109, over 23677.00 frames. ], tot_loss[loss=0.2774, simple_loss=0.3254, pruned_loss=0.1147, over 4666635.55 frames. ], batch size: 232, lr: 2.74e-02, grad_scale: 32.0 2023-09-28 18:36:28,671 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.019e+02 2.515e+02 2.948e+02 3.617e+02 7.012e+02, threshold=5.895e+02, percent-clipped=2.0 2023-09-28 18:36:29,334 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.07 vs. limit=15.0 2023-09-28 18:36:30,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.97 vs. limit=22.5 2023-09-28 18:36:35,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:36:35,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 18:36:35,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 18:36:35,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:36,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:36,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:36:36,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:36,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:36:37,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:36:37,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 18:36:37,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 18:36:37,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 18:36:37,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 18:36:37,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 18:36:38,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 18:36:38,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:38,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:38,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:38,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:39,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:36:39,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:39,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:36:39,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:40,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:36:40,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:36:40,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:36:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:40,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:36:41,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 18:36:41,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:36:41,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:36:41,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 18:36:41,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 18:36:41,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:36:41,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:36:41,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 18:36:42,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 18:36:42,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:42,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:36:43,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:36:43,310 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 18:36:43,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 18:36:43,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:36:43,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:36:44,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 18:36:44,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 18:36:44,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 18:36:44,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:36:54,486 INFO [train.py:1039] (3/4) Epoch 4, batch 0, loss[loss=0.2685, simple_loss=0.318, pruned_loss=0.1095, over 23740.00 frames. ], tot_loss[loss=0.2685, simple_loss=0.318, pruned_loss=0.1095, over 23740.00 frames. ], batch size: 232, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:36:54,486 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 18:37:09,546 INFO [train.py:1071] (3/4) Epoch 4, validation: loss=0.3856, simple_loss=0.3373, pruned_loss=0.217, over 1125622.00 frames. 2023-09-28 18:37:09,548 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 18:37:12,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 18:37:14,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:37:15,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:37:21,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:21,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:37:22,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:24,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 18:37:25,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 18:37:27,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:27,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:37:33,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:37:34,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:36,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 18:37:39,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:37:46,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:37:46,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:37:48,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 18:37:50,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=106373.33333333333, ans=0.04949747468305833 2023-09-28 18:37:52,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:37:52,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:37:55,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:37:58,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:38:01,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:06,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=106440.0, ans=0.125 2023-09-28 18:38:08,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 18:38:09,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=106440.0, ans=0.125 2023-09-28 18:38:13,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 18:38:13,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:13,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:15,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:38:15,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:38:16,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 18:38:20,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:22,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:38:22,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=106506.66666666667, ans=0.125 2023-09-28 18:38:25,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:38:28,706 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 18:38:30,528 INFO [train.py:1039] (3/4) Epoch 4, batch 50, loss[loss=0.3937, simple_loss=0.4093, pruned_loss=0.189, over 19653.00 frames. ], tot_loss[loss=0.2729, simple_loss=0.3282, pruned_loss=0.1088, over 1077058.27 frames. ], batch size: 388, lr: 2.56e-02, grad_scale: 32.0 2023-09-28 18:38:32,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:38:34,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:37,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:38:37,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 18:38:39,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:38:39,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:38:40,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:42,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:38:44,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:38:48,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 18:38:48,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:38:58,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:39:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 18:39:02,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 18:39:04,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=106640.0, ans=0.1 2023-09-28 18:39:05,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:39:07,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:07,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:09,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:10,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:39:11,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 18:39:11,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:39:16,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:19,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:19,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:39:19,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 18:39:21,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:39:22,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:39:22,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 18:39:23,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:24,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 18:39:24,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=106773.33333333333, ans=0.0 2023-09-28 18:39:30,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:39:30,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:39:34,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:35,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:39:35,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:37,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 18:39:37,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 18:39:38,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:39:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:39:41,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:39:41,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:39:43,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 18:39:43,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 18:39:44,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 18:39:46,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:46,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:39:47,467 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 2.007e+02 2.562e+02 2.907e+02 3.580e+02 6.238e+02, threshold=5.814e+02, percent-clipped=1.0 2023-09-28 18:39:47,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 18:39:47,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 18:39:49,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:39:49,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:39:49,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=106840.0, ans=0.2 2023-09-28 18:39:50,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:39:50,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:39:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:39:56,664 INFO [train.py:1039] (3/4) Epoch 4, batch 100, loss[loss=0.2276, simple_loss=0.2903, pruned_loss=0.08251, over 24547.00 frames. ], tot_loss[loss=0.2791, simple_loss=0.3305, pruned_loss=0.1138, over 1873198.09 frames. ], batch size: 60, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:39:58,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:40:01,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:05,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 18:40:05,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:40:07,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=106906.66666666667, ans=0.0 2023-09-28 18:40:08,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:40:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:08,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:40:08,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:40:10,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:40:10,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=106906.66666666667, ans=0.0 2023-09-28 18:40:11,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 18:40:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:40:15,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:15,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:17,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:40:21,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 18:40:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:40:22,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:40:23,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=106973.33333333333, ans=0.0 2023-09-28 18:40:24,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:40:29,215 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 18:40:29,242 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 18:40:30,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:40:30,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:40:35,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:40:35,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:40:36,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=107040.0, ans=0.0 2023-09-28 18:40:39,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:45,465 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.01 vs. limit=15.0 2023-09-28 18:40:46,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=107106.66666666667, ans=0.0 2023-09-28 18:40:47,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:40:47,633 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 18:40:49,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:40:53,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.43 vs. limit=15.0 2023-09-28 18:40:54,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:40:55,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=15.0 2023-09-28 18:40:55,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:00,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:02,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:06,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:41:09,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:10,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:11,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:11,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:41:11,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:13,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 18:41:13,102 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 18:41:13,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:14,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:41:14,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:14,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:14,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 18:41:16,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:41:16,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:41:16,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:16,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:18,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:18,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:41:18,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:41:19,795 INFO [train.py:1039] (3/4) Epoch 4, batch 150, loss[loss=0.301, simple_loss=0.3473, pruned_loss=0.1274, over 23865.00 frames. ], tot_loss[loss=0.2768, simple_loss=0.33, pruned_loss=0.1118, over 2523816.15 frames. ], batch size: 195, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:41:21,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:41:24,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:41:24,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:41:24,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:28,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:41:29,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:31,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:41:32,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=107240.0, ans=0.2 2023-09-28 18:41:33,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:39,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 18:41:39,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 18:41:39,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 18:41:42,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:41:42,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:41:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:41:45,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:41:45,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:45,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:47,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=107306.66666666667, ans=0.2 2023-09-28 18:41:48,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:41:49,763 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 18:41:51,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:41:58,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:03,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:42:03,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 18:42:09,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:42:09,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:42:09,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:11,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:42:13,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:42:16,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:42:17,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:18,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 18:42:22,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:24,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:24,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:42:24,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:42:27,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:29,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 18:42:31,539 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.741e+02 2.491e+02 2.943e+02 3.333e+02 6.261e+02, threshold=5.886e+02, percent-clipped=1.0 2023-09-28 18:42:33,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:42:34,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:42:34,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:36,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:42:36,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 18:42:37,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:42:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 18:42:37,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=107506.66666666667, ans=0.07 2023-09-28 18:42:37,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=107506.66666666667, ans=0.0 2023-09-28 18:42:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:42:41,593 INFO [train.py:1039] (3/4) Epoch 4, batch 200, loss[loss=0.3116, simple_loss=0.3487, pruned_loss=0.1373, over 22704.00 frames. ], tot_loss[loss=0.2801, simple_loss=0.3315, pruned_loss=0.1143, over 3001211.45 frames. ], batch size: 322, lr: 2.55e-02, grad_scale: 32.0 2023-09-28 18:42:44,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:42:45,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:42:48,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 18:42:50,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:42:50,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:51,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=107573.33333333333, ans=0.0 2023-09-28 18:42:53,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 18:42:56,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 18:42:56,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:42:57,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:42:59,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=107640.0, ans=0.125 2023-09-28 18:43:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:43:02,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:43:02,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:03,775 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.63 vs. limit=6.0 2023-09-28 18:43:19,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:43:19,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:43:21,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 18:43:23,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:43:23,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 18:43:23,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:43:25,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=107706.66666666667, ans=0.2 2023-09-28 18:43:26,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:26,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:43:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:28,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:43:28,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 18:43:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 18:43:29,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:43:33,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:43:43,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:43:51,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:43:53,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:43:57,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:00,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=107840.0, ans=0.2 2023-09-28 18:44:01,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 18:44:01,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:01,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:44:01,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:02,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 18:44:04,262 INFO [train.py:1039] (3/4) Epoch 4, batch 250, loss[loss=0.2261, simple_loss=0.2909, pruned_loss=0.08071, over 24357.00 frames. ], tot_loss[loss=0.2794, simple_loss=0.331, pruned_loss=0.1139, over 3386688.45 frames. ], batch size: 56, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:44:04,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 18:44:04,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:44:04,571 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 18:44:07,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:11,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:44:13,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:44:14,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:44:14,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:44:16,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:44:20,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:44:34,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:36,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:44:38,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:44:41,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=108040.0, ans=0.125 2023-09-28 18:44:45,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.22 vs. limit=22.5 2023-09-28 18:44:45,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:44:45,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:44:47,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:44:47,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:47,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:44:47,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:44:47,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:44:51,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:44:51,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=108040.0, ans=0.0 2023-09-28 18:44:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 18:44:52,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:44:54,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:44:56,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:44:56,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:44:57,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:44:57,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:44:59,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:45:00,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:02,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:45:03,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:04,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=108106.66666666667, ans=0.2 2023-09-28 18:45:08,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:45:12,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:13,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:45:17,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:19,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.821e+02 2.378e+02 2.704e+02 3.177e+02 4.711e+02, threshold=5.407e+02, percent-clipped=0.0 2023-09-28 18:45:19,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:45:23,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 18:45:25,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:45:25,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 18:45:25,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 18:45:25,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 18:45:26,849 INFO [train.py:1039] (3/4) Epoch 4, batch 300, loss[loss=0.2612, simple_loss=0.3168, pruned_loss=0.1028, over 24467.00 frames. ], tot_loss[loss=0.2767, simple_loss=0.3281, pruned_loss=0.1126, over 3683951.65 frames. ], batch size: 66, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:45:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:45:27,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 18:45:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:45:32,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:45:38,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:45:38,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 18:45:39,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=108240.0, ans=0.015 2023-09-28 18:45:40,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:45:41,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 18:45:41,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 18:45:41,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:45:42,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=108306.66666666667, ans=0.2 2023-09-28 18:45:47,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:45:52,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:45:52,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 18:45:57,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 18:45:58,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:00,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 18:46:01,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:46:04,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:46:07,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:46:07,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:12,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 18:46:12,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 18:46:13,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:46:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:18,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 18:46:18,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:23,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:46:24,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff3.min_abs, batch_count=108440.0, ans=0.2 2023-09-28 18:46:27,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:46:27,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 18:46:28,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=108440.0, ans=0.0 2023-09-28 18:46:30,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:30,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 18:46:30,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=108440.0, ans=0.2 2023-09-28 18:46:33,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:35,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:46:35,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 18:46:35,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:46:38,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:46:39,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 18:46:41,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:46:41,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:43,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:46:44,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:46:44,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:49,315 INFO [train.py:1039] (3/4) Epoch 4, batch 350, loss[loss=0.2704, simple_loss=0.3382, pruned_loss=0.1013, over 24420.00 frames. ], tot_loss[loss=0.2736, simple_loss=0.3263, pruned_loss=0.1105, over 3929953.58 frames. ], batch size: 69, lr: 2.54e-02, grad_scale: 16.0 2023-09-28 18:46:49,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:46:49,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 18:46:53,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:46:59,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=108573.33333333333, ans=0.125 2023-09-28 18:47:01,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:47:04,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:04,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:07,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 18:47:09,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:10,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 18:47:12,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:12,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 18:47:14,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:17,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 18:47:18,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:47:21,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:47:22,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:47:24,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:24,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:24,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:25,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:47:26,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:47:26,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:34,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=108706.66666666667, ans=0.0 2023-09-28 18:47:35,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:47:36,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:47:36,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:47:36,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:38,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=108773.33333333333, ans=0.125 2023-09-28 18:47:41,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 18:47:41,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:47:42,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.61 vs. limit=15.0 2023-09-28 18:47:46,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:47:47,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:47:47,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:47:49,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 18:47:50,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:47:52,409 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 18:47:52,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.52 vs. limit=15.0 2023-09-28 18:47:54,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 18:47:54,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:47:57,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:47:57,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 18:48:00,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:01,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=23.40 vs. limit=22.5 2023-09-28 18:48:02,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.94 vs. limit=15.0 2023-09-28 18:48:03,920 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.921e+02 2.316e+02 2.681e+02 3.192e+02 4.934e+02, threshold=5.363e+02, percent-clipped=0.0 2023-09-28 18:48:04,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 18:48:06,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:07,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:07,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:48:12,983 INFO [train.py:1039] (3/4) Epoch 4, batch 400, loss[loss=0.2442, simple_loss=0.2998, pruned_loss=0.09429, over 24592.00 frames. ], tot_loss[loss=0.2723, simple_loss=0.3251, pruned_loss=0.1098, over 4109054.84 frames. ], batch size: 60, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:48:13,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:48:16,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 18:48:16,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 18:48:17,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:17,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:20,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:48:21,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:22,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:25,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:28,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 18:48:30,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 18:48:30,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:32,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 18:48:32,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:37,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:48:37,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:39,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 18:48:39,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:48:40,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:48:40,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:48:40,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:48:43,251 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 18:48:44,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 18:48:50,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:48:51,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:48:51,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 18:48:53,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 18:48:56,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:48:59,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:02,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=109106.66666666667, ans=0.125 2023-09-28 18:49:05,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 18:49:09,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 18:49:10,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 18:49:12,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=109106.66666666667, ans=0.1 2023-09-28 18:49:14,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:49:14,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:49:15,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 18:49:19,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:49:23,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 18:49:23,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:49:26,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:26,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 18:49:29,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 18:49:30,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 18:49:31,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=109173.33333333333, ans=0.0 2023-09-28 18:49:34,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:49:34,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:49:34,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 18:49:35,758 INFO [train.py:1039] (3/4) Epoch 4, batch 450, loss[loss=0.2779, simple_loss=0.3334, pruned_loss=0.1112, over 24681.00 frames. ], tot_loss[loss=0.2723, simple_loss=0.3254, pruned_loss=0.1096, over 4259056.02 frames. ], batch size: 68, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:49:36,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:49:37,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:49:37,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:49:39,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 18:49:40,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:49:40,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:49:40,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:49:43,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 18:49:43,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:49:44,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 18:49:44,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=109240.0, ans=0.0 2023-09-28 18:49:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 18:49:55,225 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.32 vs. limit=15.0 2023-09-28 18:49:55,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:49:57,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:49:58,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 18:49:58,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 18:50:03,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:50:06,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:07,339 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.62 vs. limit=10.0 2023-09-28 18:50:08,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:11,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:13,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:50:16,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 18:50:18,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 18:50:20,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 18:50:22,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:50:23,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:50:23,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:50:25,663 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 18:50:25,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 18:50:25,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:50:26,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=109440.0, ans=0.1 2023-09-28 18:50:27,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:50:27,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 18:50:31,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:50:31,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 18:50:31,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 18:50:32,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 18:50:35,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:37,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 18:50:37,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:50:38,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 18:50:43,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 18:50:45,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 18:50:45,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 18:50:46,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 18:50:50,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=109506.66666666667, ans=0.0 2023-09-28 18:50:51,106 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.901e+02 2.294e+02 2.615e+02 3.130e+02 6.732e+02, threshold=5.230e+02, percent-clipped=1.0 2023-09-28 18:50:53,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:50:55,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:50:58,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:50:58,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 18:50:59,901 INFO [train.py:1039] (3/4) Epoch 4, batch 500, loss[loss=0.2417, simple_loss=0.2945, pruned_loss=0.09451, over 24351.00 frames. ], tot_loss[loss=0.2728, simple_loss=0.3258, pruned_loss=0.1099, over 4361006.09 frames. ], batch size: 56, lr: 2.53e-02, grad_scale: 32.0 2023-09-28 18:51:00,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=109573.33333333333, ans=0.125 2023-09-28 18:51:02,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:04,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:51:04,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:04,324 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 18:51:04,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=109573.33333333333, ans=0.1 2023-09-28 18:51:05,108 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=15.23 vs. limit=15.0 2023-09-28 18:51:07,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 18:51:07,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:51:10,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 18:51:14,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 18:51:15,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=109640.0, ans=0.04949747468305833 2023-09-28 18:51:15,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=109640.0, ans=0.125 2023-09-28 18:51:17,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 18:51:19,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:51:19,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:51:19,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:31,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:51:31,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 18:51:33,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:33,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 18:51:33,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 18:51:36,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:51:36,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:51:37,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.64 vs. limit=22.5 2023-09-28 18:51:38,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 18:51:38,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:51:40,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 18:51:43,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 18:51:44,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:51:47,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:51:48,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=109773.33333333333, ans=0.0 2023-09-28 18:51:49,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 18:51:51,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 18:51:55,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 18:51:55,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:01,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:03,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:52:10,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:12,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 18:52:14,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:14,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:52:17,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 18:52:19,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:52:20,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:22,111 INFO [train.py:1039] (3/4) Epoch 4, batch 550, loss[loss=0.2861, simple_loss=0.3302, pruned_loss=0.121, over 23337.00 frames. ], tot_loss[loss=0.2737, simple_loss=0.3266, pruned_loss=0.1104, over 4449084.41 frames. ], batch size: 105, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:52:25,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 18:52:26,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 18:52:26,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:26,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 18:52:28,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:52:28,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:52:28,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:28,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:29,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:52:29,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:52:32,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:52:34,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 18:52:34,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:52:37,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-09-28 18:52:38,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:52:38,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:40,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:52:43,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:52:47,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 18:52:47,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=109973.33333333333, ans=0.1 2023-09-28 18:52:48,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 18:52:51,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:52:53,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=110040.0, ans=0.125 2023-09-28 18:52:58,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:52:58,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:52:59,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:53:04,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:04,293 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 18:53:04,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:53:05,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 18:53:09,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=110106.66666666667, ans=0.035 2023-09-28 18:53:10,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 18:53:11,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 18:53:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:53:11,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:14,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 18:53:15,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 18:53:15,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:15,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:53:15,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:53:15,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:53:16,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=110106.66666666667, ans=0.0 2023-09-28 18:53:21,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:53:21,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 18:53:25,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:53:25,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:28,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 18:53:28,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:53:28,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=110173.33333333333, ans=0.1 2023-09-28 18:53:29,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:31,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:53:31,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:53:32,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 18:53:32,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 18:53:35,861 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.907e+02 2.501e+02 3.093e+02 3.785e+02 7.626e+02, threshold=6.186e+02, percent-clipped=7.0 2023-09-28 18:53:36,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=110173.33333333333, ans=0.0 2023-09-28 18:53:37,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 18:53:43,633 INFO [train.py:1039] (3/4) Epoch 4, batch 600, loss[loss=0.2881, simple_loss=0.3336, pruned_loss=0.1214, over 17375.00 frames. ], tot_loss[loss=0.2751, simple_loss=0.3274, pruned_loss=0.1114, over 4501509.79 frames. ], batch size: 37, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:53:43,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 18:53:43,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:53:45,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:53:45,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:53:53,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:53:55,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 18:53:55,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=110240.0, ans=0.125 2023-09-28 18:53:57,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 18:53:58,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 18:53:58,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=110240.0, ans=0.1 2023-09-28 18:54:01,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:04,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:06,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 18:54:06,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:54:12,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 18:54:15,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=110373.33333333333, ans=0.2 2023-09-28 18:54:18,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:54:18,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:18,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:54:18,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:22,421 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:25,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:54:25,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:54:27,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:27,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.93 vs. limit=6.0 2023-09-28 18:54:29,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=110373.33333333333, ans=0.125 2023-09-28 18:54:33,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:54:37,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=110440.0, ans=0.125 2023-09-28 18:54:38,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:54:38,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:54:38,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:54:39,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=110440.0, ans=0.025 2023-09-28 18:54:45,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 18:54:50,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 18:54:51,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:54:54,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 18:54:56,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 18:55:00,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 18:55:00,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 18:55:00,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:55:06,793 INFO [train.py:1039] (3/4) Epoch 4, batch 650, loss[loss=0.2595, simple_loss=0.3075, pruned_loss=0.1058, over 24348.00 frames. ], tot_loss[loss=0.2732, simple_loss=0.326, pruned_loss=0.1102, over 4556934.01 frames. ], batch size: 56, lr: 2.52e-02, grad_scale: 32.0 2023-09-28 18:55:06,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 18:55:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 18:55:10,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:55:15,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:17,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 18:55:17,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:55:22,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=110640.0, ans=0.0 2023-09-28 18:55:23,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 18:55:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:27,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=110640.0, ans=0.0 2023-09-28 18:55:29,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:30,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 18:55:32,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=110640.0, ans=0.125 2023-09-28 18:55:34,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:55:34,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:55:38,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:55:38,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 18:55:41,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:42,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:42,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:55:43,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:44,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 18:55:46,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 18:55:46,320 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 18:55:46,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:55:46,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:55:49,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:55:51,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:55:52,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 18:55:54,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 18:55:54,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 18:55:55,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 18:55:57,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 18:55:57,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:55:59,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 18:56:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 18:56:03,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 18:56:05,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:05,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:56:05,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:56:05,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:56:05,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=110773.33333333333, ans=0.0 2023-09-28 18:56:08,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:56:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:15,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:15,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:56:18,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:18,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 18:56:20,141 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.763e+02 2.387e+02 2.700e+02 3.231e+02 6.128e+02, threshold=5.400e+02, percent-clipped=0.0 2023-09-28 18:56:20,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:56:26,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 18:56:26,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:27,738 INFO [train.py:1039] (3/4) Epoch 4, batch 700, loss[loss=0.2848, simple_loss=0.3465, pruned_loss=0.1116, over 23982.00 frames. ], tot_loss[loss=0.2713, simple_loss=0.3234, pruned_loss=0.1096, over 4577138.73 frames. ], batch size: 80, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:56:27,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:27,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:56:32,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 18:56:34,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 18:56:36,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 18:56:36,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:39,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:56:41,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 18:56:45,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:56:48,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:56:48,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:50,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 18:56:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:56:52,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=110973.33333333333, ans=0.1 2023-09-28 18:56:53,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:56:55,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=110973.33333333333, ans=0.1 2023-09-28 18:56:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 18:56:56,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 18:56:58,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 18:57:00,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 18:57:05,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 18:57:05,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:57:08,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 18:57:11,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:57:13,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 18:57:18,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:19,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:57:19,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 18:57:24,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:57:26,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:57:28,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=111106.66666666667, ans=22.5 2023-09-28 18:57:30,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:57:36,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 18:57:36,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 18:57:40,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 18:57:40,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 18:57:43,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:45,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:57:46,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:57:50,922 INFO [train.py:1039] (3/4) Epoch 4, batch 750, loss[loss=0.2814, simple_loss=0.3254, pruned_loss=0.1188, over 23808.00 frames. ], tot_loss[loss=0.2705, simple_loss=0.3224, pruned_loss=0.1092, over 4605562.96 frames. ], batch size: 179, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:57:51,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:51,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 18:57:54,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 18:57:54,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 18:57:55,836 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.31 vs. limit=15.0 2023-09-28 18:57:56,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 18:57:57,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 18:57:57,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 18:57:57,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:57:59,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 18:57:59,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:57:59,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=111240.0, ans=0.125 2023-09-28 18:58:01,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:01,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=111240.0, ans=0.125 2023-09-28 18:58:02,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:04,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:05,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 18:58:05,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:07,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 18:58:08,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 18:58:10,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 18:58:12,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:14,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:14,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 18:58:15,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 18:58:17,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:18,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 18:58:20,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 18:58:20,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=111306.66666666667, ans=0.125 2023-09-28 18:58:20,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=111306.66666666667, ans=0.0 2023-09-28 18:58:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 18:58:21,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:58:25,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 18:58:25,632 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 18:58:27,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 18:58:27,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 18:58:27,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 18:58:27,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=111373.33333333333, ans=0.0 2023-09-28 18:58:27,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=111373.33333333333, ans=0.0 2023-09-28 18:58:28,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 18:58:32,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=111373.33333333333, ans=0.1 2023-09-28 18:58:35,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 18:58:36,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:36,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 18:58:38,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:58:39,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:58:39,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 18:58:41,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 18:58:43,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.59 vs. limit=15.0 2023-09-28 18:58:44,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 18:58:44,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 18:58:46,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.04 vs. limit=22.5 2023-09-28 18:58:47,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 18:58:47,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 18:58:48,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:58:54,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:58:54,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=111506.66666666667, ans=0.125 2023-09-28 18:58:55,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 18:58:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:58:57,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=111506.66666666667, ans=0.2 2023-09-28 18:58:59,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 18:59:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 18:59:04,538 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.995e+02 2.483e+02 2.790e+02 3.186e+02 5.320e+02, threshold=5.579e+02, percent-clipped=0.0 2023-09-28 18:59:04,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:04,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:09,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:09,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=111506.66666666667, ans=0.125 2023-09-28 18:59:12,121 INFO [train.py:1039] (3/4) Epoch 4, batch 800, loss[loss=0.2575, simple_loss=0.3148, pruned_loss=0.1001, over 24414.00 frames. ], tot_loss[loss=0.2707, simple_loss=0.3228, pruned_loss=0.1093, over 4630663.19 frames. ], batch size: 58, lr: 2.51e-02, grad_scale: 32.0 2023-09-28 18:59:12,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:12,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 18:59:20,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 18:59:20,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:23,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 18:59:23,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:24,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:24,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:26,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:30,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=111640.0, ans=0.2 2023-09-28 18:59:31,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:31,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 18:59:35,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 18:59:37,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:37,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=111640.0, ans=0.2 2023-09-28 18:59:38,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 18:59:38,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 18:59:39,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:39,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 18:59:39,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:39,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 18:59:42,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 18:59:45,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 18:59:46,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 18:59:46,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 18:59:50,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:50,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 18:59:51,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=111706.66666666667, ans=0.0 2023-09-28 18:59:56,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 18:59:56,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 18:59:56,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 18:59:58,494 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 18:59:59,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 18:59:59,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 18:59:59,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:01,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:02,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:06,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.89 vs. limit=22.5 2023-09-28 19:00:08,255 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 19:00:08,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 19:00:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:00:13,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:00:18,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:00:21,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:23,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 19:00:23,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:00:26,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 19:00:28,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=111840.0, ans=0.125 2023-09-28 19:00:34,260 INFO [train.py:1039] (3/4) Epoch 4, batch 850, loss[loss=0.2415, simple_loss=0.3012, pruned_loss=0.09091, over 24392.00 frames. ], tot_loss[loss=0.27, simple_loss=0.3228, pruned_loss=0.1086, over 4666413.28 frames. ], batch size: 58, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:00:34,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:36,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:00:36,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 19:00:36,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=111906.66666666667, ans=0.125 2023-09-28 19:00:37,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:00:37,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:39,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 19:00:40,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:40,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:00:41,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=111906.66666666667, ans=15.0 2023-09-28 19:00:42,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:44,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:00:46,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:00:48,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 19:00:48,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 19:00:48,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 19:00:49,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:00:49,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:00:52,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:00:52,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:00:53,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:00:58,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:00:59,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:00:59,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 19:01:02,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 19:01:04,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:01:06,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 19:01:10,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 19:01:12,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 19:01:14,096 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 19:01:16,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:16,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:01:16,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:01:16,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=112040.0, ans=0.1 2023-09-28 19:01:17,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:20,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=112040.0, ans=0.0 2023-09-28 19:01:21,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:21,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 19:01:24,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:01:24,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:25,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:01:25,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:01:28,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:01:30,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:01:30,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 19:01:35,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:01:35,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:35,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:01:35,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:37,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:01:38,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:01:42,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:01:43,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:01:45,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:01:46,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:01:46,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=112173.33333333333, ans=0.0 2023-09-28 19:01:47,939 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.846e+02 2.332e+02 2.611e+02 3.097e+02 5.192e+02, threshold=5.223e+02, percent-clipped=0.0 2023-09-28 19:01:52,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:01:54,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:01:56,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 19:01:56,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:01:56,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:01:57,612 INFO [train.py:1039] (3/4) Epoch 4, batch 900, loss[loss=0.2707, simple_loss=0.32, pruned_loss=0.1106, over 23617.00 frames. ], tot_loss[loss=0.2714, simple_loss=0.3239, pruned_loss=0.1094, over 4680291.04 frames. ], batch size: 93, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:01:59,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 19:02:05,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=112240.0, ans=0.125 2023-09-28 19:02:06,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:02:09,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:09,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 19:02:13,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:02:13,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=112306.66666666667, ans=0.0 2023-09-28 19:02:15,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 19:02:15,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:02:16,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:02:16,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:16,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:02:18,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:02:29,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:02:29,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:02:29,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:02:33,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:02:38,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 19:02:40,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:02:45,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:02:46,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:02:46,789 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 19:02:48,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 19:02:54,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:02:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:02:55,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:03:01,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:01,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:05,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 19:03:05,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:03:07,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.89 vs. limit=15.0 2023-09-28 19:03:08,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 19:03:10,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:03:10,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:13,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:03:13,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:13,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=112506.66666666667, ans=15.0 2023-09-28 19:03:15,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=112506.66666666667, ans=0.125 2023-09-28 19:03:15,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.37 vs. limit=15.0 2023-09-28 19:03:16,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 19:03:16,632 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 19:03:16,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=112506.66666666667, ans=0.2 2023-09-28 19:03:18,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:03:18,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 19:03:20,104 INFO [train.py:1039] (3/4) Epoch 4, batch 950, loss[loss=0.2878, simple_loss=0.3353, pruned_loss=0.1201, over 23523.00 frames. ], tot_loss[loss=0.2723, simple_loss=0.3244, pruned_loss=0.1101, over 4687951.46 frames. ], batch size: 134, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:03:21,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:26,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 19:03:26,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.88 vs. limit=22.5 2023-09-28 19:03:29,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:30,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=112573.33333333333, ans=0.125 2023-09-28 19:03:33,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:33,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:34,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:03:36,478 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 19:03:41,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:03:41,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:03:41,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:03:43,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:03:43,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 19:03:45,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:03:46,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:47,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 19:03:48,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:03:53,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:03:53,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:03:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 19:03:56,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:04:00,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:04:01,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:04:05,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:06,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:04:07,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.45 vs. limit=10.0 2023-09-28 19:04:08,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 19:04:10,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:04:10,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:04:12,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:12,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:12,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:04:13,070 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.96 vs. limit=10.0 2023-09-28 19:04:17,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 19:04:18,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:04:21,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:04:23,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:23,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 19:04:24,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:24,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:04:26,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 19:04:30,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:04:33,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:04:34,689 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.515e+02 2.858e+02 3.350e+02 4.786e+02, threshold=5.716e+02, percent-clipped=0.0 2023-09-28 19:04:37,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:38,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 19:04:38,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 19:04:42,973 INFO [train.py:1039] (3/4) Epoch 4, batch 1000, loss[loss=0.28, simple_loss=0.316, pruned_loss=0.122, over 23717.00 frames. ], tot_loss[loss=0.2725, simple_loss=0.3247, pruned_loss=0.1102, over 4700913.36 frames. ], batch size: 232, lr: 2.50e-02, grad_scale: 32.0 2023-09-28 19:04:43,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:04:46,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 19:04:46,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:04:51,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:04:53,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 19:04:53,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 19:04:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:04:59,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:04:59,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:04,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 19:05:08,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 19:05:10,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 19:05:10,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:11,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 19:05:13,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 19:05:14,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 19:05:16,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:17,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:18,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=113040.0, ans=0.125 2023-09-28 19:05:27,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:27,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:05:29,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:30,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:30,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 19:05:31,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:05:31,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:05:32,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:05:33,263 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 19:05:36,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 19:05:37,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 19:05:38,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=113106.66666666667, ans=0.125 2023-09-28 19:05:39,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 19:05:40,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:05:47,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:47,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:05:47,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:05:49,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:05:50,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 19:05:52,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:05:53,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 19:05:53,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 19:05:55,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:05:55,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:05:58,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:06:02,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:06:04,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:05,833 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=15.0 2023-09-28 19:06:06,321 INFO [train.py:1039] (3/4) Epoch 4, batch 1050, loss[loss=0.2631, simple_loss=0.3262, pruned_loss=0.1, over 24664.00 frames. ], tot_loss[loss=0.2701, simple_loss=0.3217, pruned_loss=0.1093, over 4690115.60 frames. ], batch size: 68, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:06:09,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:06:11,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:06:11,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=113240.0, ans=0.2 2023-09-28 19:06:12,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:06:14,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:15,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:16,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:06:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:06:21,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:06:22,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:06:22,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:06:24,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:06:24,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 19:06:25,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 19:06:29,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:06:29,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 19:06:29,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:06:38,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:06:38,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:06:40,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:06:41,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 19:06:41,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 19:06:42,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:06:45,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 19:06:48,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 19:06:48,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:06:51,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:06:54,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:06:55,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:06:55,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:06:58,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:07:02,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 19:07:03,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 19:07:03,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 19:07:05,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:05,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:07:06,772 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.19 vs. limit=15.0 2023-09-28 19:07:07,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 19:07:08,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.58 vs. limit=22.5 2023-09-28 19:07:13,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:07:14,075 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:07:15,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:07:15,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:16,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:16,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:07:20,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 19:07:21,387 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.899e+02 2.368e+02 2.685e+02 3.530e+02 6.169e+02, threshold=5.370e+02, percent-clipped=1.0 2023-09-28 19:07:23,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:07:23,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 19:07:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 19:07:24,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:07:28,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:07:28,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=113573.33333333333, ans=0.0 2023-09-28 19:07:29,809 INFO [train.py:1039] (3/4) Epoch 4, batch 1100, loss[loss=0.2666, simple_loss=0.3155, pruned_loss=0.1089, over 23597.00 frames. ], tot_loss[loss=0.2696, simple_loss=0.321, pruned_loss=0.1091, over 4689523.74 frames. ], batch size: 256, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:07:34,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:07:38,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:07:38,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:07:40,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:40,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 19:07:44,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:07:44,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=113573.33333333333, ans=0.0 2023-09-28 19:07:45,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:07:47,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:07:48,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=113640.0, ans=0.07 2023-09-28 19:07:50,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=113640.0, ans=0.0 2023-09-28 19:07:51,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:07:51,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 19:07:53,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:07:54,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:07:56,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:07:57,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:08:00,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:08:02,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_positive, batch_count=113706.66666666667, ans=0.05 2023-09-28 19:08:04,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:08:06,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 19:08:08,024 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 19:08:09,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:11,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:12,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:08:13,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=113706.66666666667, ans=0.2 2023-09-28 19:08:15,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:08:16,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 19:08:16,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:08:16,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:08:16,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:08:16,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:16,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 19:08:22,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=113773.33333333333, ans=0.125 2023-09-28 19:08:23,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:08:23,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 19:08:26,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:08:28,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=113773.33333333333, ans=0.125 2023-09-28 19:08:30,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:08:34,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 19:08:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:08:37,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:08:40,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:08:40,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:42,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 19:08:43,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:08:43,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:08:44,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 19:08:44,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:08:45,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 19:08:47,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=113840.0, ans=0.125 2023-09-28 19:08:49,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:08:49,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:08:51,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:08:52,497 INFO [train.py:1039] (3/4) Epoch 4, batch 1150, loss[loss=0.2527, simple_loss=0.3109, pruned_loss=0.09718, over 24638.00 frames. ], tot_loss[loss=0.2699, simple_loss=0.3217, pruned_loss=0.1091, over 4698922.19 frames. ], batch size: 60, lr: 2.49e-02, grad_scale: 32.0 2023-09-28 19:08:57,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:08:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:09:00,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=113906.66666666667, ans=0.2 2023-09-28 19:09:01,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:01,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:09:02,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 19:09:02,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:05,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 19:09:05,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:05,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:09:12,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 19:09:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:20,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:09:20,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:20,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 19:09:20,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:09:20,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:09:24,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 19:09:26,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:09:27,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:09:33,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.37 vs. limit=22.5 2023-09-28 19:09:38,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:45,633 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.72 vs. limit=10.0 2023-09-28 19:09:47,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:09:47,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 19:09:47,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:47,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=114106.66666666667, ans=0.125 2023-09-28 19:09:48,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:09:53,436 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 19:09:56,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:03,239 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 19:10:06,259 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.334e+02 2.773e+02 3.498e+02 6.141e+02, threshold=5.547e+02, percent-clipped=2.0 2023-09-28 19:10:07,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:08,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:10:08,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:10:08,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:10:11,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:12,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=114173.33333333333, ans=0.0 2023-09-28 19:10:14,772 INFO [train.py:1039] (3/4) Epoch 4, batch 1200, loss[loss=0.3619, simple_loss=0.3833, pruned_loss=0.1703, over 19511.00 frames. ], tot_loss[loss=0.2706, simple_loss=0.3229, pruned_loss=0.1092, over 4710311.08 frames. ], batch size: 388, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:10:17,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:10:17,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:10:19,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:19,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:19,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:10:19,756 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:10:23,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:10:23,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=114240.0, ans=0.0 2023-09-28 19:10:23,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=114240.0, ans=0.0 2023-09-28 19:10:24,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:10:26,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:10:26,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:10:29,329 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 19:10:32,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 19:10:34,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=114306.66666666667, ans=0.2 2023-09-28 19:10:37,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:10:39,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:10:42,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:10:44,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:10:44,578 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 19:10:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:10:53,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:10:53,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:10:53,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 19:10:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:10:59,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 19:11:02,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 19:11:02,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:11:05,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:11:06,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:06,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:11:08,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:11:08,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:11:08,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:11:10,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 19:11:10,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:11:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:12,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:11:15,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:15,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:11:20,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:11:22,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:11:23,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 19:11:27,176 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 19:11:29,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:11:32,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:11:34,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:11:35,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:11:37,100 INFO [train.py:1039] (3/4) Epoch 4, batch 1250, loss[loss=0.2649, simple_loss=0.3168, pruned_loss=0.1065, over 23486.00 frames. ], tot_loss[loss=0.2712, simple_loss=0.3238, pruned_loss=0.1093, over 4702035.36 frames. ], batch size: 120, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:11:38,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 19:11:42,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:11:44,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:44,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 19:11:48,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:11:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:11:51,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=114573.33333333333, ans=0.1 2023-09-28 19:11:51,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=114573.33333333333, ans=0.125 2023-09-28 19:11:54,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:11:54,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:11:55,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:11:55,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:11:58,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:12:04,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:12:04,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:12:04,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:06,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:12:06,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:09,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:10,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:12:14,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 19:12:15,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:12:19,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:20,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 19:12:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:12:21,474 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 19:12:21,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:21,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:24,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:27,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:12:29,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:12:30,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 19:12:30,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 19:12:32,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 19:12:34,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:12:34,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=114773.33333333333, ans=0.0 2023-09-28 19:12:34,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=114773.33333333333, ans=0.125 2023-09-28 19:12:36,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 19:12:37,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:12:40,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:12:40,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:12:42,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 19:12:42,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:12:43,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:12:43,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:12:45,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:12:48,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 19:12:50,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:12:51,350 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.852e+02 2.462e+02 2.704e+02 3.277e+02 4.911e+02, threshold=5.408e+02, percent-clipped=0.0 2023-09-28 19:12:51,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:12:53,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:12:56,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=114840.0, ans=0.2 2023-09-28 19:12:58,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:13:00,821 INFO [train.py:1039] (3/4) Epoch 4, batch 1300, loss[loss=0.2934, simple_loss=0.3355, pruned_loss=0.1257, over 23484.00 frames. ], tot_loss[loss=0.2717, simple_loss=0.3244, pruned_loss=0.1095, over 4705071.96 frames. ], batch size: 120, lr: 2.48e-02, grad_scale: 32.0 2023-09-28 19:13:01,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:13:02,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 19:13:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:07,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:13:08,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:12,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:13:13,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:13:13,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 19:13:20,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:13:20,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:13:20,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=114973.33333333333, ans=0.0 2023-09-28 19:13:21,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 19:13:26,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:13:30,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:30,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:32,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:13:33,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=115040.0, ans=0.125 2023-09-28 19:13:34,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:13:35,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:13:37,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 19:13:37,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 19:13:40,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=115040.0, ans=0.125 2023-09-28 19:13:43,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:13:43,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:13:45,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 19:13:47,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:13:48,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:13:49,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=115106.66666666667, ans=0.125 2023-09-28 19:13:51,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:13:53,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 19:13:53,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:53,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 19:13:54,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:13:59,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:13:59,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:14:04,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 19:14:04,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 19:14:06,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 19:14:11,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:14:13,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 19:14:14,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:21,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=115240.0, ans=0.125 2023-09-28 19:14:22,659 INFO [train.py:1039] (3/4) Epoch 4, batch 1350, loss[loss=0.2378, simple_loss=0.2962, pruned_loss=0.08966, over 24613.00 frames. ], tot_loss[loss=0.2686, simple_loss=0.3215, pruned_loss=0.1079, over 4707746.18 frames. ], batch size: 60, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:14:24,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 19:14:28,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:30,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:33,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:14:33,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:14:35,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:14:35,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:39,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=115306.66666666667, ans=0.125 2023-09-28 19:14:43,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:14:44,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 19:14:44,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:14:46,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:14:48,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 19:14:49,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:14:51,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:14:51,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 19:14:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 19:14:54,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 19:14:56,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:14:56,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 19:15:06,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=115373.33333333333, ans=0.05 2023-09-28 19:15:07,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:13,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=115440.0, ans=0.125 2023-09-28 19:15:16,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:15:18,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:18,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 19:15:21,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:21,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 19:15:21,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:15:23,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:15:26,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:15:29,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 19:15:30,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:15:36,830 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.262e+02 2.530e+02 3.010e+02 4.866e+02, threshold=5.060e+02, percent-clipped=0.0 2023-09-28 19:15:38,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 19:15:40,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 19:15:43,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=115573.33333333333, ans=0.1 2023-09-28 19:15:45,546 INFO [train.py:1039] (3/4) Epoch 4, batch 1400, loss[loss=0.2597, simple_loss=0.3048, pruned_loss=0.1073, over 23589.00 frames. ], tot_loss[loss=0.2679, simple_loss=0.3202, pruned_loss=0.1078, over 4704898.89 frames. ], batch size: 256, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:15:45,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 19:15:46,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=115573.33333333333, ans=0.2 2023-09-28 19:15:48,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:15:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:15:52,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:15:57,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 19:16:00,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 19:16:06,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.32 vs. limit=6.0 2023-09-28 19:16:08,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:16:10,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:13,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:16:13,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:16:15,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=115640.0, ans=0.035 2023-09-28 19:16:16,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.77 vs. limit=15.0 2023-09-28 19:16:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:16:20,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 19:16:28,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:30,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:35,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 19:16:37,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:16:38,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:16:38,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:16:38,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:16:40,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:16:40,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:16:40,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:16:40,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=115773.33333333333, ans=0.2 2023-09-28 19:16:43,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 19:16:43,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:16:48,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.59 vs. limit=12.0 2023-09-28 19:16:49,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:16:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:17:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 19:17:01,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 19:17:03,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:17:06,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 19:17:06,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:07,665 INFO [train.py:1039] (3/4) Epoch 4, batch 1450, loss[loss=0.2954, simple_loss=0.3436, pruned_loss=0.1236, over 24024.00 frames. ], tot_loss[loss=0.267, simple_loss=0.3196, pruned_loss=0.1072, over 4714153.69 frames. ], batch size: 86, lr: 2.47e-02, grad_scale: 32.0 2023-09-28 19:17:07,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:17:12,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:17:15,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:17:15,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:15,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 19:17:16,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=115906.66666666667, ans=15.0 2023-09-28 19:17:20,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:20,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:17:22,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:17:22,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 19:17:22,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:17:24,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 19:17:24,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:25,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:25,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 19:17:27,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:29,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:17:29,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 19:17:31,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:31,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:17:33,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:35,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:35,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=115973.33333333333, ans=0.125 2023-09-28 19:17:38,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:17:38,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:17:39,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:17:41,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:43,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:17:43,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:17:43,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:17:45,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:17:49,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 19:17:52,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:17:54,415 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 19:17:55,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:17:59,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:17:59,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:00,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 19:18:03,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=116106.66666666667, ans=22.5 2023-09-28 19:18:06,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:06,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 19:18:08,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 19:18:09,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:12,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:14,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:18:15,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 19:18:17,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 19:18:17,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 19:18:20,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:21,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=116173.33333333333, ans=0.125 2023-09-28 19:18:22,091 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.906e+02 2.300e+02 2.689e+02 3.268e+02 5.170e+02, threshold=5.379e+02, percent-clipped=2.0 2023-09-28 19:18:22,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:18:29,431 INFO [train.py:1039] (3/4) Epoch 4, batch 1500, loss[loss=0.2843, simple_loss=0.3229, pruned_loss=0.1228, over 23913.00 frames. ], tot_loss[loss=0.2669, simple_loss=0.3197, pruned_loss=0.107, over 4710614.10 frames. ], batch size: 179, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:18:34,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 19:18:35,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:18:35,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:18:36,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:18:38,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:38,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:18:40,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 19:18:40,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=116240.0, ans=0.1 2023-09-28 19:18:41,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:18:42,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:18:42,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:18:43,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:18:44,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:18:47,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:18:53,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 19:18:54,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:18:54,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:18:56,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:18:57,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 19:18:58,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=116306.66666666667, ans=0.0 2023-09-28 19:18:58,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.52 vs. limit=22.5 2023-09-28 19:19:02,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 19:19:04,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:19:04,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 19:19:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:19:09,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:09,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:19:09,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:19:11,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 19:19:11,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:19:13,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:13,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 19:19:15,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:19:21,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:19:21,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 19:19:21,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.55 vs. limit=6.0 2023-09-28 19:19:28,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:19:29,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:19:34,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 19:19:35,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:35,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 19:19:37,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:19:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:19:38,755 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 19:19:40,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:19:41,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 19:19:44,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:47,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:47,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:49,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:19:49,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:19:51,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:19:51,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 19:19:52,779 INFO [train.py:1039] (3/4) Epoch 4, batch 1550, loss[loss=0.2806, simple_loss=0.3331, pruned_loss=0.114, over 24012.00 frames. ], tot_loss[loss=0.2672, simple_loss=0.3206, pruned_loss=0.1069, over 4723502.64 frames. ], batch size: 86, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:19:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 19:19:52,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:19:53,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 19:19:54,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 19:19:58,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:00,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:00,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:00,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:20:02,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:03,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:20:07,105 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 19:20:07,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:07,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:20:07,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:20:07,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=116640.0, ans=0.125 2023-09-28 19:20:10,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:20:10,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 19:20:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:20:12,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 19:20:13,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 19:20:13,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 19:20:13,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:15,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:21,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:20:22,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 19:20:22,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 19:20:33,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:36,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:20:36,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:20:36,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:20:36,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 19:20:41,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:20:42,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:45,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:20:46,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=116773.33333333333, ans=0.125 2023-09-28 19:20:48,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:20:49,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:20:49,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 19:20:50,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:20:51,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:20:52,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:20:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:20:53,966 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 19:20:56,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:02,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 19:21:06,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:07,902 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.992e+02 2.334e+02 2.588e+02 3.124e+02 7.530e+02, threshold=5.176e+02, percent-clipped=1.0 2023-09-28 19:21:08,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:21:09,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 19:21:11,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:21:12,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:21:12,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:21:12,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:21:14,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:21:15,410 INFO [train.py:1039] (3/4) Epoch 4, batch 1600, loss[loss=0.2662, simple_loss=0.3115, pruned_loss=0.1105, over 23752.00 frames. ], tot_loss[loss=0.2696, simple_loss=0.3223, pruned_loss=0.1085, over 4705320.36 frames. ], batch size: 164, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:21:16,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.03 vs. limit=15.0 2023-09-28 19:21:18,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:18,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 19:21:20,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 19:21:23,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 19:21:24,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:21:26,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 19:21:28,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:21:30,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:21:31,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=116973.33333333333, ans=0.2 2023-09-28 19:21:32,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=116973.33333333333, ans=0.125 2023-09-28 19:21:34,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:21:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 19:21:43,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:21:43,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 19:21:43,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:21:45,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 19:21:49,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 19:21:57,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:01,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 19:22:02,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:22:02,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:02,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:22:03,758 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.59 vs. limit=15.0 2023-09-28 19:22:05,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 19:22:09,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:22:11,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:22:11,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=117106.66666666667, ans=0.0 2023-09-28 19:22:12,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:12,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:22:16,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:22:18,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:22:19,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:22:24,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:25,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:22:29,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 19:22:29,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:22:29,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 19:22:33,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:36,650 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=6.13 vs. limit=6.0 2023-09-28 19:22:37,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:22:37,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:22:37,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=117240.0, ans=10.0 2023-09-28 19:22:39,573 INFO [train.py:1039] (3/4) Epoch 4, batch 1650, loss[loss=0.2534, simple_loss=0.3074, pruned_loss=0.0997, over 23300.00 frames. ], tot_loss[loss=0.2703, simple_loss=0.3227, pruned_loss=0.1089, over 4692645.52 frames. ], batch size: 105, lr: 2.46e-02, grad_scale: 32.0 2023-09-28 19:22:39,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 19:22:39,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 19:22:39,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 19:22:39,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 19:22:40,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=117240.0, ans=0.125 2023-09-28 19:22:44,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:22:45,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:45,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:22:47,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:22:50,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:22:51,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=117240.0, ans=0.2 2023-09-28 19:22:53,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 19:22:55,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:22:57,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:22:57,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:22:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:22:57,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 19:22:57,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 19:23:04,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:23:07,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:23:07,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.55 vs. limit=15.0 2023-09-28 19:23:19,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 19:23:19,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 19:23:23,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:26,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:23:26,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:23:26,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:27,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:23:29,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:31,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:23:31,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:32,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:32,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:32,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:33,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:23:36,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:23:37,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 19:23:39,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:23:39,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 19:23:41,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.99 vs. limit=12.0 2023-09-28 19:23:42,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 19:23:42,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 19:23:42,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:23:43,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:23:45,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:45,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:23:45,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 19:23:50,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:23:53,773 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.658e+02 2.422e+02 2.762e+02 3.244e+02 4.441e+02, threshold=5.524e+02, percent-clipped=0.0 2023-09-28 19:23:53,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:23:53,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:23:55,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 19:24:00,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:24:00,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:24:00,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 19:24:01,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:01,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:24:01,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:02,363 INFO [train.py:1039] (3/4) Epoch 4, batch 1700, loss[loss=0.2648, simple_loss=0.3097, pruned_loss=0.11, over 23410.00 frames. ], tot_loss[loss=0.2699, simple_loss=0.3214, pruned_loss=0.1092, over 4697121.41 frames. ], batch size: 134, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:24:02,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=117573.33333333333, ans=0.04949747468305833 2023-09-28 19:24:05,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:24:06,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:24:06,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 19:24:10,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:24:19,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=117640.0, ans=0.2 2023-09-28 19:24:20,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:24:21,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.52 vs. limit=12.0 2023-09-28 19:24:22,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:24:27,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:24:28,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:24:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:24:30,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:24:31,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 19:24:34,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:24:34,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:38,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:24:39,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:24:41,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 19:24:43,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 19:24:43,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:24:45,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 19:24:46,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:24:56,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:24:58,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:24:59,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:25:00,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:25:00,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 19:25:01,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:25:03,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:03,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 19:25:04,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:04,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:04,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:04,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:07,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:07,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:25:09,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:09,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:25:09,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:14,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:16,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 19:25:18,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:25:18,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:25:19,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 19:25:24,950 INFO [train.py:1039] (3/4) Epoch 4, batch 1750, loss[loss=0.3079, simple_loss=0.3429, pruned_loss=0.1364, over 23880.00 frames. ], tot_loss[loss=0.2683, simple_loss=0.3198, pruned_loss=0.1084, over 4702942.17 frames. ], batch size: 195, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:25:26,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:28,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:28,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:25:30,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 19:25:31,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:25:34,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:25:34,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:25:39,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 19:25:40,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:25:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 19:25:44,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:25:46,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:25:46,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-09-28 19:25:48,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:25:51,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 19:25:51,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:25:53,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 19:25:56,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=117973.33333333333, ans=0.0 2023-09-28 19:26:03,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:26:04,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=118040.0, ans=0.0 2023-09-28 19:26:06,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:12,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:12,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:26:14,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:26:16,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:18,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:19,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:26:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 19:26:21,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:26:23,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=118106.66666666667, ans=0.125 2023-09-28 19:26:24,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 19:26:24,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:25,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:27,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:26:30,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=118173.33333333333, ans=0.125 2023-09-28 19:26:32,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:26:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 19:26:32,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:35,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=118173.33333333333, ans=0.2 2023-09-28 19:26:36,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:26:39,665 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.369e+02 2.693e+02 3.225e+02 5.418e+02, threshold=5.386e+02, percent-clipped=0.0 2023-09-28 19:26:41,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:26:44,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:26:44,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:26:47,453 INFO [train.py:1039] (3/4) Epoch 4, batch 1800, loss[loss=0.2842, simple_loss=0.3388, pruned_loss=0.1148, over 23298.00 frames. ], tot_loss[loss=0.2669, simple_loss=0.3191, pruned_loss=0.1073, over 4704844.16 frames. ], batch size: 93, lr: 2.45e-02, grad_scale: 32.0 2023-09-28 19:26:47,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 19:26:47,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:26:48,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:26:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:26:49,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:26:49,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:26:50,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:26:52,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:26:53,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=118240.0, ans=0.1 2023-09-28 19:26:54,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:26:55,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:26:57,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=118240.0, ans=0.125 2023-09-28 19:26:58,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:00,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:27:00,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=118240.0, ans=0.125 2023-09-28 19:27:04,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:27:07,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:10,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:10,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:12,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:27:14,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:27:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 19:27:15,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:19,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:21,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 19:27:24,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 19:27:24,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 19:27:24,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:26,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:27:26,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:27:27,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:27:34,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 19:27:34,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:27:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:27:36,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=118440.0, ans=0.125 2023-09-28 19:27:38,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 19:27:38,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 19:27:40,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:27:41,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:27:41,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:27:45,386 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:27:46,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 19:27:46,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=118440.0, ans=0.0 2023-09-28 19:27:49,435 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.38 vs. limit=10.0 2023-09-28 19:27:51,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:27:51,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 19:27:51,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:27:53,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:27:53,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:27:53,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 19:27:58,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:27:58,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:01,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 19:28:01,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:03,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:04,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:28:04,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:28:06,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:28:10,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:28:10,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:28:11,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=13.36 vs. limit=15.0 2023-09-28 19:28:11,768 INFO [train.py:1039] (3/4) Epoch 4, batch 1850, loss[loss=0.3026, simple_loss=0.349, pruned_loss=0.1281, over 23444.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3198, pruned_loss=0.1076, over 4702928.23 frames. ], batch size: 285, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:28:13,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:28:13,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:28:20,461 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.76 vs. limit=15.0 2023-09-28 19:28:23,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:28:23,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 19:28:26,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 19:28:29,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 19:28:33,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:28:33,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=118640.0, ans=0.0 2023-09-28 19:28:35,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 19:28:35,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 19:28:44,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:28:46,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 19:28:49,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:28:50,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:28:54,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 19:28:54,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:28:54,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:28:57,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:28:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:29:01,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:04,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:29:04,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:05,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:29:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:09,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:09,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:29:14,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 19:29:14,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:29:19,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:29:19,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:29:19,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 19:29:19,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 19:29:20,954 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 19:29:21,066 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 19:29:21,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=118840.0, ans=0.2 2023-09-28 19:29:24,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:29:24,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:29:24,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:24,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:24,776 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 19:29:24,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:29:26,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:27,991 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.551e+02 2.974e+02 3.413e+02 5.793e+02, threshold=5.947e+02, percent-clipped=2.0 2023-09-28 19:29:28,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:29:28,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:29:29,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:29:29,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 19:29:33,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:29:33,133 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 19:29:33,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:29:34,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:36,077 INFO [train.py:1039] (3/4) Epoch 4, batch 1900, loss[loss=0.216, simple_loss=0.2801, pruned_loss=0.07597, over 24607.00 frames. ], tot_loss[loss=0.2681, simple_loss=0.3208, pruned_loss=0.1077, over 4701939.14 frames. ], batch size: 60, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:29:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:29:41,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:29:43,553 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 19:29:43,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 19:29:45,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:29:46,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:29:46,694 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 19:29:46,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 19:29:49,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=118906.66666666667, ans=0.125 2023-09-28 19:29:51,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 19:29:54,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:29:57,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 19:30:00,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 19:30:00,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.74 vs. limit=15.0 2023-09-28 19:30:08,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 19:30:11,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 19:30:12,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:30:12,991 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 19:30:12,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 19:30:14,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 19:30:14,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 19:30:14,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:30:20,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 19:30:23,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:30:27,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:27,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 19:30:29,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:30:32,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 19:30:33,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:42,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:30:42,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:30:42,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:30:42,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:30:44,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:30:45,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:30:45,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:30:48,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:48,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:30:52,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:30:52,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:30:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:30:54,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:30:58,744 INFO [train.py:1039] (3/4) Epoch 4, batch 1950, loss[loss=0.2688, simple_loss=0.3259, pruned_loss=0.1059, over 24643.00 frames. ], tot_loss[loss=0.2679, simple_loss=0.321, pruned_loss=0.1074, over 4708542.42 frames. ], batch size: 65, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:30:58,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:31:03,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:03,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:31:05,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 19:31:07,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:31:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:07,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=119240.0, ans=0.0 2023-09-28 19:31:10,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:13,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:31:13,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:13,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:17,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:18,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:31:18,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:31:20,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:31:20,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:22,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=119306.66666666667, ans=0.125 2023-09-28 19:31:24,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:27,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:31:27,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:31:27,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:31:27,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 19:31:29,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:31:29,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:31:29,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:31:36,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:31:37,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:31:43,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:31:46,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:31:46,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:31:46,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 19:31:46,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:31:52,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:31:52,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:31:52,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:00,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:01,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:02,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.48 vs. limit=22.5 2023-09-28 19:32:06,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:06,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=119506.66666666667, ans=0.1 2023-09-28 19:32:08,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:09,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:32:11,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:32:11,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 19:32:11,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:32:12,782 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.624e+02 2.939e+02 3.496e+02 6.198e+02, threshold=5.878e+02, percent-clipped=1.0 2023-09-28 19:32:12,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:32:13,283 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:32:15,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 19:32:16,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:21,589 INFO [train.py:1039] (3/4) Epoch 4, batch 2000, loss[loss=0.2548, simple_loss=0.314, pruned_loss=0.0978, over 24603.00 frames. ], tot_loss[loss=0.2683, simple_loss=0.3219, pruned_loss=0.1074, over 4719885.32 frames. ], batch size: 60, lr: 2.44e-02, grad_scale: 32.0 2023-09-28 19:32:21,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:32:23,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:32:23,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:32:26,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:32:26,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:32:31,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 19:32:31,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:32:34,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:32:36,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 19:32:36,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:32:36,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:32:39,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:32:41,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 19:32:43,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:44,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:47,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 19:32:47,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:32:49,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 19:32:49,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:32:53,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:32:53,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 19:32:53,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:32:55,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:32:55,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:32:56,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 19:33:00,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 19:33:00,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:33:00,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:06,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:07,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:33:07,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:09,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:33:11,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:11,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:13,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:33:13,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:33:16,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:19,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:33:19,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 19:33:24,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:33:27,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:29,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=119840.0, ans=0.0 2023-09-28 19:33:32,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:32,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:33:35,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:39,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:39,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:39,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:33:39,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:33:42,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:33:44,037 INFO [train.py:1039] (3/4) Epoch 4, batch 2050, loss[loss=0.2477, simple_loss=0.3015, pruned_loss=0.09694, over 24621.00 frames. ], tot_loss[loss=0.2673, simple_loss=0.3209, pruned_loss=0.1069, over 4726405.74 frames. ], batch size: 60, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:33:44,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:47,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:33:48,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:55,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:33:56,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.10 vs. limit=15.0 2023-09-28 19:33:57,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:33:57,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:33:58,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:02,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 19:34:02,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:34:02,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:04,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:34:11,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=119973.33333333333, ans=0.2 2023-09-28 19:34:14,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:14,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:17,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 19:34:19,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:34:20,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 19:34:20,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:34:24,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:27,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:29,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:34:29,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:34:30,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:34:31,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:34:31,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:34:36,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:34:37,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:34:39,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:34:39,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:34:44,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:34:44,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=120106.66666666667, ans=0.0 2023-09-28 19:34:49,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:34:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 19:34:51,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=120173.33333333333, ans=0.125 2023-09-28 19:34:55,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:34:57,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:34:59,046 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.560e+02 3.020e+02 3.672e+02 5.923e+02, threshold=6.041e+02, percent-clipped=1.0 2023-09-28 19:35:00,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:35:02,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 19:35:06,666 INFO [train.py:1039] (3/4) Epoch 4, batch 2100, loss[loss=0.2233, simple_loss=0.2871, pruned_loss=0.07976, over 24295.00 frames. ], tot_loss[loss=0.2664, simple_loss=0.32, pruned_loss=0.1064, over 4734397.78 frames. ], batch size: 56, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:35:06,861 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 19:35:06,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:07,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:08,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:09,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:35:09,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 19:35:10,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 19:35:12,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:35:15,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:35:17,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:35:18,587 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.30 vs. limit=22.5 2023-09-28 19:35:20,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:22,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:35:22,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 19:35:22,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:35:23,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 19:35:23,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 19:35:25,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:26,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:35:26,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 19:35:26,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 19:35:27,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=120306.66666666667, ans=0.0 2023-09-28 19:35:32,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 19:35:32,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:35:36,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:35:37,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:35:40,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:35:40,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 19:35:41,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:41,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 19:35:42,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=120373.33333333333, ans=0.125 2023-09-28 19:35:43,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 19:35:44,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:44,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 19:35:45,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 19:35:45,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 19:35:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:35:50,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:35:52,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:54,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 19:35:57,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:35:59,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:35:59,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 19:35:59,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:35:59,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:00,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:00,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 19:36:02,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 19:36:02,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 19:36:06,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:36:09,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:36:10,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 19:36:14,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:17,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:36:18,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:36:18,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:36:18,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 19:36:20,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:36:23,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:36:23,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:36:23,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:36:23,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:27,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 19:36:30,137 INFO [train.py:1039] (3/4) Epoch 4, batch 2150, loss[loss=0.2363, simple_loss=0.3004, pruned_loss=0.0861, over 24486.00 frames. ], tot_loss[loss=0.265, simple_loss=0.319, pruned_loss=0.1055, over 4719961.28 frames. ], batch size: 63, lr: 2.43e-02, grad_scale: 32.0 2023-09-28 19:36:30,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 19:36:30,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:31,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:36:31,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:36:31,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:36:33,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:36:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 19:36:38,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:36:41,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:44,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:36:44,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:44,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:36:47,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:36:47,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=120640.0, ans=0.125 2023-09-28 19:36:49,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:36:49,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:36:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:36:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 19:36:57,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:00,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:37:00,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:02,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:02,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:37:02,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:02,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:37:04,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:37:04,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 19:37:06,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:37:08,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:09,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:09,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=120706.66666666667, ans=0.1 2023-09-28 19:37:10,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:37:12,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:37:15,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:37:16,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:37:16,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:37:16,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 19:37:18,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 19:37:18,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=120773.33333333333, ans=0.04949747468305833 2023-09-28 19:37:21,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:21,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:22,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:37:24,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:37:24,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:26,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:26,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 19:37:27,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 19:37:27,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:37:28,641 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 19:37:29,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:30,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:37:30,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=120773.33333333333, ans=0.125 2023-09-28 19:37:31,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 19:37:31,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:37:32,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 19:37:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 19:37:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 19:37:32,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 19:37:35,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:36,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:37:36,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:37:36,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:38,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 19:37:39,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:37:40,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:45,491 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.345e+02 2.868e+02 3.477e+02 5.291e+02, threshold=5.737e+02, percent-clipped=0.0 2023-09-28 19:37:47,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:37:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 19:37:53,284 INFO [train.py:1039] (3/4) Epoch 4, batch 2200, loss[loss=0.2348, simple_loss=0.2907, pruned_loss=0.08943, over 24403.00 frames. ], tot_loss[loss=0.2658, simple_loss=0.3197, pruned_loss=0.106, over 4717527.82 frames. ], batch size: 58, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:37:53,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:37:53,801 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:37:55,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=120906.66666666667, ans=0.125 2023-09-28 19:37:58,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:37:58,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:37:59,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:01,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:38:04,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:38:04,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:38:04,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 19:38:08,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=120973.33333333333, ans=0.05 2023-09-28 19:38:11,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 19:38:13,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:38:20,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 19:38:22,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=120973.33333333333, ans=0.1 2023-09-28 19:38:23,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:25,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:25,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:38:26,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.80 vs. limit=15.0 2023-09-28 19:38:28,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:38:30,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 19:38:31,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:38:34,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:38:34,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 19:38:34,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=121040.0, ans=0.125 2023-09-28 19:38:38,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:38:39,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:41,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=121106.66666666667, ans=0.1 2023-09-28 19:38:43,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:38:44,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:48,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 19:38:48,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:49,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 19:38:51,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:51,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:38:51,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:38:55,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:38:56,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:38:56,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:56,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:38:57,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.36 vs. limit=6.0 2023-09-28 19:38:58,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:38:58,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:38:59,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:39:02,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 19:39:04,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:07,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:39:07,560 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 19:39:10,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:39:11,993 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 19:39:12,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:39:14,062 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 19:39:15,506 INFO [train.py:1039] (3/4) Epoch 4, batch 2250, loss[loss=0.2393, simple_loss=0.302, pruned_loss=0.08829, over 24652.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3191, pruned_loss=0.1052, over 4725700.32 frames. ], batch size: 65, lr: 2.42e-02, grad_scale: 64.0 2023-09-28 19:39:15,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:17,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:39:19,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:39:20,746 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 19:39:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:39:24,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:30,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:39:32,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:39:34,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=121306.66666666667, ans=0.125 2023-09-28 19:39:38,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:38,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:39:40,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 19:39:40,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:39:41,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.84 vs. limit=22.5 2023-09-28 19:39:41,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:39:42,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 19:39:42,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:39:42,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:45,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:39:50,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:39:52,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 19:39:52,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:39:54,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 19:39:57,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:39:58,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:40:02,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:04,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:40:05,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:05,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:40:08,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:40:10,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:40:10,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=121440.0, ans=0.125 2023-09-28 19:40:13,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:40:16,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 19:40:21,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:40:21,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:40:22,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:40:27,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:40:30,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:40:30,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 19:40:30,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=121506.66666666667, ans=0.125 2023-09-28 19:40:32,081 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.293e+02 2.607e+02 2.902e+02 3.937e+02, threshold=5.215e+02, percent-clipped=0.0 2023-09-28 19:40:32,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:32,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:40:37,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 19:40:38,977 INFO [train.py:1039] (3/4) Epoch 4, batch 2300, loss[loss=0.2647, simple_loss=0.3247, pruned_loss=0.1024, over 24011.00 frames. ], tot_loss[loss=0.2675, simple_loss=0.3216, pruned_loss=0.1067, over 4723459.60 frames. ], batch size: 86, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:40:39,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:40:40,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:45,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:40:46,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:40:50,022 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 19:40:51,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:40:58,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:40:59,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:40:59,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:40:59,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 19:41:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:41:03,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:05,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:41:08,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:41:12,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:41:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:16,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=121706.66666666667, ans=0.125 2023-09-28 19:41:17,833 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.34 vs. limit=22.5 2023-09-28 19:41:20,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:41:21,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:41:22,075 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:41:22,237 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.69 vs. limit=12.0 2023-09-28 19:41:26,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:41:27,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:41:30,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:41:32,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:41:32,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:41:32,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 19:41:37,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:41:37,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:39,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:41:39,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:41:40,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:42,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 19:41:42,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 19:41:42,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 19:41:42,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:41:42,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:41:44,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 19:41:48,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.81 vs. limit=15.0 2023-09-28 19:41:49,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:41:53,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=121840.0, ans=10.0 2023-09-28 19:41:54,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:41:59,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:41:59,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:41:59,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:42:00,456 INFO [train.py:1039] (3/4) Epoch 4, batch 2350, loss[loss=0.2963, simple_loss=0.3469, pruned_loss=0.1229, over 23343.00 frames. ], tot_loss[loss=0.2679, simple_loss=0.3221, pruned_loss=0.1069, over 4717116.25 frames. ], batch size: 93, lr: 2.42e-02, grad_scale: 32.0 2023-09-28 19:42:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:42:00,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:02,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:42:02,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 19:42:02,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=121906.66666666667, ans=0.1 2023-09-28 19:42:10,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:10,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 19:42:17,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 19:42:20,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:42:24,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:24,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:24,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:26,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 19:42:29,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:42:34,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 19:42:35,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:42:38,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:42:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:42:42,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:42:44,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 19:42:44,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:42:47,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:42:47,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:42:47,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:42:49,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=122106.66666666667, ans=0.125 2023-09-28 19:42:52,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:42:53,237 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=7.58 vs. limit=15.0 2023-09-28 19:42:54,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 19:42:56,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:42:58,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:42:58,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:42:59,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 19:43:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:43:03,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 19:43:04,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:43:04,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=122106.66666666667, ans=0.2 2023-09-28 19:43:09,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 19:43:09,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=122173.33333333333, ans=0.0 2023-09-28 19:43:10,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 19:43:11,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=122173.33333333333, ans=0.2 2023-09-28 19:43:12,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:43:12,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 19:43:12,536 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 19:43:12,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 19:43:16,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 19:43:18,099 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.790e+02 2.367e+02 2.843e+02 3.356e+02 5.882e+02, threshold=5.686e+02, percent-clipped=1.0 2023-09-28 19:43:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:43:18,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=122173.33333333333, ans=0.035 2023-09-28 19:43:22,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:43:24,111 INFO [train.py:1039] (3/4) Epoch 4, batch 2400, loss[loss=0.2766, simple_loss=0.2963, pruned_loss=0.1285, over 18901.00 frames. ], tot_loss[loss=0.2665, simple_loss=0.3203, pruned_loss=0.1063, over 4711644.15 frames. ], batch size: 388, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:43:28,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:43:29,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:43:31,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 19:43:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 19:43:37,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=122240.0, ans=0.125 2023-09-28 19:43:39,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:43:39,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:43:40,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 19:43:40,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:43:41,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.36 vs. limit=15.0 2023-09-28 19:43:42,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 19:43:49,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:43:52,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 19:43:57,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:44:02,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 19:44:04,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:07,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:09,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=122373.33333333333, ans=0.0 2023-09-28 19:44:12,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:12,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 19:44:12,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 19:44:12,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=122440.0, ans=0.125 2023-09-28 19:44:19,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:22,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:44:24,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:25,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:44:25,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 19:44:25,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=122440.0, ans=0.1 2023-09-28 19:44:27,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:44:27,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:27,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:29,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 19:44:32,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:44:34,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 19:44:34,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 19:44:36,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 19:44:39,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:44:39,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:44:39,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 19:44:41,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 19:44:41,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 19:44:41,771 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 19:44:43,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 19:44:44,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:44:45,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:45,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:46,615 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 19:44:46,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:44:48,026 INFO [train.py:1039] (3/4) Epoch 4, batch 2450, loss[loss=0.2629, simple_loss=0.3137, pruned_loss=0.106, over 23622.00 frames. ], tot_loss[loss=0.2658, simple_loss=0.3191, pruned_loss=0.1063, over 4704335.69 frames. ], batch size: 149, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:44:48,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:44:51,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:44:51,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:44:56,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:44:56,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:44:58,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 19:45:02,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:02,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:06,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:45:08,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:45:08,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:45:08,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 19:45:13,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:45:16,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:45:17,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=122640.0, ans=0.125 2023-09-28 19:45:20,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 19:45:20,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:21,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=122706.66666666667, ans=0.125 2023-09-28 19:45:23,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:23,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:45:24,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 19:45:25,179 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:45:26,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:45:26,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=122706.66666666667, ans=0.125 2023-09-28 19:45:34,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:36,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:45:36,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:36,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:45:38,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:45:39,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:45:39,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 19:45:42,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=122773.33333333333, ans=0.1 2023-09-28 19:45:45,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:45:45,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:45:48,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:45:48,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:45:52,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=122773.33333333333, ans=0.0 2023-09-28 19:45:54,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:45:54,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 19:45:56,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:45:56,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:45:56,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 19:45:57,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:45:58,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:46:02,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:46:03,796 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.418e+02 2.743e+02 3.147e+02 4.422e+02, threshold=5.485e+02, percent-clipped=0.0 2023-09-28 19:46:05,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:05,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:46:09,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 19:46:09,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 19:46:11,542 INFO [train.py:1039] (3/4) Epoch 4, batch 2500, loss[loss=0.2451, simple_loss=0.3025, pruned_loss=0.09385, over 24391.00 frames. ], tot_loss[loss=0.265, simple_loss=0.3191, pruned_loss=0.1055, over 4722971.61 frames. ], batch size: 58, lr: 2.41e-02, grad_scale: 32.0 2023-09-28 19:46:12,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=122906.66666666667, ans=0.04949747468305833 2023-09-28 19:46:19,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:28,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:46:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:46:29,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:46:29,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 19:46:37,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:46:37,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:46:38,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:46:38,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 19:46:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 19:46:41,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:42,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:42,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=123040.0, ans=0.2 2023-09-28 19:46:44,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 19:46:44,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:46:44,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 19:46:44,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:46:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:46:53,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:46:56,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:46:56,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 19:46:56,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:46:59,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:04,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:07,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:09,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=11.41 vs. limit=12.0 2023-09-28 19:47:12,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:15,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 19:47:17,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 19:47:17,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:19,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:47:20,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:47:20,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:47:22,710 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 19:47:22,711 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 19:47:22,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 19:47:25,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:47:29,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 19:47:29,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 19:47:29,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:47:29,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=123173.33333333333, ans=0.1 2023-09-28 19:47:29,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=123173.33333333333, ans=0.1 2023-09-28 19:47:31,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 19:47:34,347 INFO [train.py:1039] (3/4) Epoch 4, batch 2550, loss[loss=0.2843, simple_loss=0.3434, pruned_loss=0.1126, over 24344.00 frames. ], tot_loss[loss=0.265, simple_loss=0.3194, pruned_loss=0.1052, over 4726261.77 frames. ], batch size: 77, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:47:34,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff3.min_abs, batch_count=123240.0, ans=0.2 2023-09-28 19:47:36,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 19:47:37,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:38,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=123240.0, ans=0.0 2023-09-28 19:47:39,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:47:40,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:47:42,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:47:44,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 19:47:45,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:47:48,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 19:47:50,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:47:54,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:47:56,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:47:56,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 19:47:56,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:47:58,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:47:58,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:02,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:48:02,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 19:48:02,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 19:48:02,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:02,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 19:48:13,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:48:19,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:19,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:19,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:48:21,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 19:48:27,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:48:30,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 19:48:31,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:48:31,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:48:31,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 19:48:32,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:48:36,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:48:38,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:44,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:48:44,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 19:48:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:48:44,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:48:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 19:48:47,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:48:47,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:49,410 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.960e+02 2.344e+02 2.648e+02 3.019e+02 5.195e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 19:48:54,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:48:55,625 INFO [train.py:1039] (3/4) Epoch 4, batch 2600, loss[loss=0.36, simple_loss=0.3797, pruned_loss=0.1701, over 19726.00 frames. ], tot_loss[loss=0.2648, simple_loss=0.3197, pruned_loss=0.105, over 4727092.44 frames. ], batch size: 389, lr: 2.40e-02, grad_scale: 32.0 2023-09-28 19:48:55,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:48:56,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=123573.33333333333, ans=0.125 2023-09-28 19:48:57,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=123573.33333333333, ans=0.0 2023-09-28 19:48:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 19:49:02,813 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 19:49:02,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:49:04,157 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 19:49:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 19:49:04,304 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 19:49:07,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:49:07,446 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 19:49:09,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 19:49:11,065 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 19:49:12,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:49:14,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 19:49:16,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 19:49:17,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 19:49:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 19:49:20,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 19:49:20,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 19:49:27,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:27,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:27,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:27,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 19:49:28,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 19:49:37,239 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 19:49:41,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:49:41,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:49:43,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 19:49:44,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:49:44,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:49:44,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 19:49:50,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:49:50,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:49:53,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,332 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 19:49:56,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:49:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 19:50:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:50:01,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 19:50:01,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 19:50:04,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:50:05,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:07,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:14,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 19:50:16,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:18,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:50:20,017 INFO [train.py:1039] (3/4) Epoch 4, batch 2650, loss[loss=0.2766, simple_loss=0.3197, pruned_loss=0.1168, over 23771.00 frames. ], tot_loss[loss=0.2658, simple_loss=0.3204, pruned_loss=0.1056, over 4718769.47 frames. ], batch size: 179, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:50:21,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 19:50:21,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:21,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 19:50:24,065 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 19:50:24,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:50:25,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=123906.66666666667, ans=0.125 2023-09-28 19:50:26,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:50:27,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.47 vs. limit=15.0 2023-09-28 19:50:30,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:50:30,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:50:33,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:50:33,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 19:50:33,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:50:33,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:50:38,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 19:50:39,509 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 19:50:43,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:50:48,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 19:50:48,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:50:48,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 19:50:53,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 19:50:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:50:53,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:01,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 19:51:01,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 19:51:03,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:07,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 19:51:07,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:09,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:09,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:09,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=124106.66666666667, ans=0.95 2023-09-28 19:51:11,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:11,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:14,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:51:15,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:16,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=12.0 2023-09-28 19:51:17,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:51:17,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:51:19,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:51:21,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:21,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:51:23,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:24,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:51:24,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 19:51:25,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.10 vs. limit=15.0 2023-09-28 19:51:27,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:29,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:51:29,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:31,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 19:51:32,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.78 vs. limit=12.0 2023-09-28 19:51:35,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:51:36,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:37,258 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.892e+02 2.373e+02 3.005e+02 3.591e+02 5.745e+02, threshold=6.010e+02, percent-clipped=4.0 2023-09-28 19:51:39,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:51:40,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:42,346 INFO [train.py:1039] (3/4) Epoch 4, batch 2700, loss[loss=0.2274, simple_loss=0.2844, pruned_loss=0.08523, over 24304.00 frames. ], tot_loss[loss=0.2665, simple_loss=0.3214, pruned_loss=0.1058, over 4724307.62 frames. ], batch size: 56, lr: 2.40e-02, grad_scale: 16.0 2023-09-28 19:51:42,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 19:51:42,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:44,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:51:44,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 19:51:48,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:51:50,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 19:51:51,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:51:51,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:53,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:51:55,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:51:55,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:51:55,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 19:51:57,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 19:51:57,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 19:51:57,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:51:58,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:51:58,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:52:00,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:04,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=124306.66666666667, ans=0.125 2023-09-28 19:52:05,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 19:52:05,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 19:52:07,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:52:12,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:52:12,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:52:18,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:52:18,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:52:18,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:52:18,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:52:20,998 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.41 vs. limit=22.5 2023-09-28 19:52:21,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:26,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:26,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:52:26,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:52:31,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:31,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 19:52:35,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=124440.0, ans=0.125 2023-09-28 19:52:40,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:52:40,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:52:45,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:52:45,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:52:48,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:48,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:52:50,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:52:52,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:52:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:52:55,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:52:57,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=124506.66666666667, ans=0.2 2023-09-28 19:52:58,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:53:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:00,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:53:02,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 19:53:03,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:05,062 INFO [train.py:1039] (3/4) Epoch 4, batch 2750, loss[loss=0.2484, simple_loss=0.3049, pruned_loss=0.096, over 24328.00 frames. ], tot_loss[loss=0.2663, simple_loss=0.3209, pruned_loss=0.1058, over 4724661.48 frames. ], batch size: 61, lr: 2.39e-02, grad_scale: 16.0 2023-09-28 19:53:06,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:53:06,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 19:53:06,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=124573.33333333333, ans=0.125 2023-09-28 19:53:08,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 19:53:08,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:10,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:10,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:15,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:15,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 19:53:15,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:18,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:20,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 19:53:20,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:53:20,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:20,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 19:53:20,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:53:20,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:53:27,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 19:53:30,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:53:30,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:30,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:53:32,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:53:32,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:53:33,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:53:34,061 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:53:35,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:35,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:38,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 19:53:38,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 19:53:38,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=124706.66666666667, ans=0.0 2023-09-28 19:53:39,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.81 vs. limit=6.0 2023-09-28 19:53:39,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 19:53:40,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:43,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:53:50,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:53:52,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 19:53:52,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:53:56,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:53:56,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:53:58,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 19:54:04,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 19:54:04,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 19:54:04,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 19:54:10,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:12,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 19:54:17,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 19:54:19,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:54:19,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 19:54:20,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:54:22,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 19:54:24,071 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.875e+02 2.756e+02 3.214e+02 3.920e+02 6.552e+02, threshold=6.428e+02, percent-clipped=3.0 2023-09-28 19:54:24,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 19:54:25,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:54:28,567 INFO [train.py:1039] (3/4) Epoch 4, batch 2800, loss[loss=0.2512, simple_loss=0.314, pruned_loss=0.09421, over 23188.00 frames. ], tot_loss[loss=0.2656, simple_loss=0.3191, pruned_loss=0.106, over 4699383.65 frames. ], batch size: 93, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:54:28,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 19:54:30,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:30,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:54:30,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 19:54:30,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:31,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:33,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:54:33,505 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 19:54:33,507 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 19:54:38,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:54:40,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:54:41,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:54:43,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:54:45,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=124973.33333333333, ans=0.0 2023-09-28 19:54:47,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 19:54:48,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 19:54:50,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 19:54:50,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:50,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:54:50,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:54:55,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:54:55,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:54:55,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:54:57,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:07,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:55:08,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:55:11,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:13,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:55:13,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:19,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:19,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 19:55:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:21,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:23,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:55:26,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:55:26,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:28,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=125106.66666666667, ans=0.125 2023-09-28 19:55:30,345 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.02 vs. limit=15.0 2023-09-28 19:55:32,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:55:36,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:55:36,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:55:36,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 19:55:36,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 19:55:37,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 19:55:37,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:55:37,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 19:55:39,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:40,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:55:40,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:55:42,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 19:55:42,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:55:42,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:55:43,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:55:44,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 19:55:52,388 INFO [train.py:1039] (3/4) Epoch 4, batch 2850, loss[loss=0.2753, simple_loss=0.3183, pruned_loss=0.1161, over 23773.00 frames. ], tot_loss[loss=0.264, simple_loss=0.318, pruned_loss=0.105, over 4710621.19 frames. ], batch size: 212, lr: 2.39e-02, grad_scale: 32.0 2023-09-28 19:55:52,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:55:52,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 19:55:54,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:55:57,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:00,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:00,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:00,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:56:04,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:04,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:56:05,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:56:07,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 19:56:15,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 19:56:15,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:17,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 19:56:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:21,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 19:56:21,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 19:56:22,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:23,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=125306.66666666667, ans=0.2 2023-09-28 19:56:35,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:37,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:37,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 19:56:37,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 19:56:37,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 19:56:38,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 19:56:40,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 19:56:40,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 19:56:43,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 19:56:43,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:56:45,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:56:45,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:48,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.48 vs. limit=15.0 2023-09-28 19:56:48,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:48,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:56:50,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:52,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 19:56:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:56:55,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:56:56,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:56:58,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 19:57:03,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:57:04,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 19:57:05,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 19:57:06,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 19:57:07,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:08,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 19:57:09,332 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.413e+02 2.730e+02 3.344e+02 4.987e+02, threshold=5.460e+02, percent-clipped=0.0 2023-09-28 19:57:09,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:57:09,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=125506.66666666667, ans=0.1 2023-09-28 19:57:10,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:10,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:10,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 19:57:10,953 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 19:57:12,318 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 19:57:12,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:13,704 INFO [train.py:1039] (3/4) Epoch 4, batch 2900, loss[loss=0.2632, simple_loss=0.3107, pruned_loss=0.1078, over 23829.00 frames. ], tot_loss[loss=0.2649, simple_loss=0.3186, pruned_loss=0.1056, over 4712198.48 frames. ], batch size: 212, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:57:13,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:18,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:18,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:57:20,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:57:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 19:57:25,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:25,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 19:57:26,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 19:57:30,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 19:57:30,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:57:30,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:32,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:57:36,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 19:57:37,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:57:39,528 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:57:40,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 19:57:40,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 19:57:42,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 19:57:43,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:57:45,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 19:57:47,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 19:57:50,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:57:50,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 19:57:50,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:57:53,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:57:53,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 19:57:54,156 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.18 vs. limit=10.0 2023-09-28 19:57:56,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:57:56,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:01,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:58:03,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:07,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 19:58:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 19:58:07,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:58:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 19:58:15,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 19:58:15,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 19:58:20,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 19:58:20,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=125840.0, ans=0.125 2023-09-28 19:58:27,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 19:58:27,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 19:58:29,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 19:58:32,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:33,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=125840.0, ans=0.125 2023-09-28 19:58:34,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 19:58:34,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:34,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:58:35,885 INFO [train.py:1039] (3/4) Epoch 4, batch 2950, loss[loss=0.2552, simple_loss=0.3069, pruned_loss=0.1018, over 23842.00 frames. ], tot_loss[loss=0.2642, simple_loss=0.3183, pruned_loss=0.105, over 4714936.72 frames. ], batch size: 179, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:58:41,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:58:43,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 19:58:43,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:43,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:58:46,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:58:47,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=125906.66666666667, ans=0.1 2023-09-28 19:58:48,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 19:58:48,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 19:58:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 19:58:49,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=125906.66666666667, ans=0.09899494936611666 2023-09-28 19:58:50,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 19:58:50,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 19:58:57,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:58:59,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:01,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:01,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:04,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:04,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 19:59:06,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 19:59:07,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 19:59:11,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=126040.0, ans=0.035 2023-09-28 19:59:12,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 19:59:16,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 19:59:16,081 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 19:59:16,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 19:59:18,227 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 19:59:21,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 19:59:21,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 19:59:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 19:59:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 19:59:21,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 19:59:22,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 19:59:24,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 19:59:24,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=126106.66666666667, ans=10.0 2023-09-28 19:59:25,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 19:59:27,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:28,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 19:59:28,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:30,268 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 19:59:31,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 19:59:31,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 19:59:37,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:38,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 19:59:40,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 19:59:40,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 19:59:41,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 19:59:45,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:46,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 19:59:46,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 19:59:48,801 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 19:59:50,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 19:59:50,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 19:59:52,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 19:59:53,509 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.776e+02 2.372e+02 2.758e+02 3.353e+02 4.666e+02, threshold=5.516e+02, percent-clipped=0.0 2023-09-28 19:59:53,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:53,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 19:59:53,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 19:59:53,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 19:59:55,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 19:59:56,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 19:59:56,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 19:59:58,284 INFO [train.py:1039] (3/4) Epoch 4, batch 3000, loss[loss=0.2367, simple_loss=0.2981, pruned_loss=0.08764, over 24455.00 frames. ], tot_loss[loss=0.2653, simple_loss=0.3195, pruned_loss=0.1055, over 4715274.86 frames. ], batch size: 58, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 19:59:58,285 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 20:00:10,993 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([2.7599, 2.4474, 3.0982, 2.9523], device='cuda:3') 2023-09-28 20:00:13,227 INFO [train.py:1071] (3/4) Epoch 4, validation: loss=0.3352, simple_loss=0.3262, pruned_loss=0.1721, over 1125622.00 frames. 2023-09-28 20:00:13,228 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 20:00:13,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:00:15,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:00:16,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:00:19,661 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 20:00:19,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 20:00:23,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:00:23,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:00:25,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 20:00:25,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:31,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:00:40,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:00:47,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 20:00:49,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:00:51,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.16 vs. limit=10.0 2023-09-28 20:00:52,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:00:52,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:00:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:00:56,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:00:56,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 20:00:59,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 20:01:01,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:01:02,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:01:05,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:01:05,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:05,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:05,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:01:07,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=126440.0, ans=0.125 2023-09-28 20:01:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:01:11,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:01:11,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:01:12,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=126440.0, ans=0.0 2023-09-28 20:01:13,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:01:15,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 20:01:16,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:01:16,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:16,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:01:18,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=126506.66666666667, ans=0.1 2023-09-28 20:01:22,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:22,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:23,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:01:23,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 20:01:24,443 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.50 vs. limit=15.0 2023-09-28 20:01:25,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:01:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 20:01:26,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:01:28,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 20:01:32,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:01:32,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:01:32,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 20:01:32,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.39 vs. limit=12.0 2023-09-28 20:01:34,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 20:01:34,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:01:35,617 INFO [train.py:1039] (3/4) Epoch 4, batch 3050, loss[loss=0.2261, simple_loss=0.2987, pruned_loss=0.07676, over 24657.00 frames. ], tot_loss[loss=0.2665, simple_loss=0.3208, pruned_loss=0.1061, over 4719393.44 frames. ], batch size: 73, lr: 2.38e-02, grad_scale: 32.0 2023-09-28 20:01:35,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:01:37,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:01:37,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:01:37,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:37,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:01:40,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 20:01:42,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=126573.33333333333, ans=0.0 2023-09-28 20:01:43,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:01:44,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:01:44,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:01:48,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:01:51,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 20:01:51,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=126640.0, ans=0.0 2023-09-28 20:01:56,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 20:01:56,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 20:01:56,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:00,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:02:02,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=126640.0, ans=0.0 2023-09-28 20:02:05,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:05,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:07,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:10,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:10,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:02:10,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:12,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:02:12,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:12,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:15,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:19,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:19,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 20:02:19,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:02:19,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:02:21,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=126706.66666666667, ans=0.1 2023-09-28 20:02:22,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:02:24,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:02:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:02:25,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:27,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=126773.33333333333, ans=0.0 2023-09-28 20:02:27,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=126773.33333333333, ans=0.125 2023-09-28 20:02:27,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=126773.33333333333, ans=0.125 2023-09-28 20:02:32,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:02:32,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:35,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=126773.33333333333, ans=0.1 2023-09-28 20:02:41,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:41,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:02:41,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:02:41,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:02:43,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:02:44,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 20:02:45,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:02:45,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:02:47,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 20:02:48,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:52,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:02:53,680 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.370e+02 2.708e+02 3.419e+02 5.330e+02, threshold=5.417e+02, percent-clipped=0.0 2023-09-28 20:02:53,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:02:56,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:02:58,150 INFO [train.py:1039] (3/4) Epoch 4, batch 3100, loss[loss=0.2766, simple_loss=0.2994, pruned_loss=0.1269, over 19276.00 frames. ], tot_loss[loss=0.2661, simple_loss=0.3198, pruned_loss=0.1062, over 4707278.54 frames. ], batch size: 389, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:02:59,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 20:03:01,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 20:03:01,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 20:03:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:03:07,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:03:07,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:09,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:03:14,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:14,914 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:03:14,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=126973.33333333333, ans=0.125 2023-09-28 20:03:20,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 20:03:25,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=126973.33333333333, ans=0.2 2023-09-28 20:03:28,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:03:28,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:29,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:03:29,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:03:31,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:03:33,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:03:33,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 20:03:33,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:03:34,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 20:03:36,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:03:42,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:03:42,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 20:03:43,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 20:03:45,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:45,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:03:48,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:03:48,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:48,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:03:52,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:03:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:03:55,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:03:55,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:03:55,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:03:55,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:03:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:04:01,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 20:04:02,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:04:02,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 20:04:04,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:04,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:04,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 20:04:17,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 20:04:19,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=127240.0, ans=0.1 2023-09-28 20:04:21,077 INFO [train.py:1039] (3/4) Epoch 4, batch 3150, loss[loss=0.2291, simple_loss=0.2978, pruned_loss=0.0802, over 24658.00 frames. ], tot_loss[loss=0.2632, simple_loss=0.3177, pruned_loss=0.1044, over 4708567.11 frames. ], batch size: 65, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:04:21,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:21,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:04:22,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:04:22,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:04:24,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 20:04:24,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:24,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:04:24,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=127240.0, ans=0.0 2023-09-28 20:04:26,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 20:04:29,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:32,619 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 20:04:35,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 20:04:35,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:04:37,590 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 20:04:37,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:04:39,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 20:04:39,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 20:04:39,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 20:04:39,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:41,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:04:43,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:04:44,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 20:04:47,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:47,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:04:48,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:50,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:04:53,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 20:04:54,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:04:56,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:04:58,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:04:58,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 20:05:01,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 20:05:03,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:05:03,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:05:03,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:05:04,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:04,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:05:06,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:05:06,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:05:07,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 20:05:07,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:05:07,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:09,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:05:09,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:05:11,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 20:05:11,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:12,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 20:05:12,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:12,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 20:05:15,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 20:05:17,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:05:17,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:19,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 20:05:21,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:05:21,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:05:25,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:05:26,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:05:32,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=127506.66666666667, ans=0.125 2023-09-28 20:05:33,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:05:34,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:38,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 20:05:39,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.688e+02 2.347e+02 2.789e+02 3.421e+02 6.245e+02, threshold=5.579e+02, percent-clipped=5.0 2023-09-28 20:05:43,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:05:43,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 20:05:44,437 INFO [train.py:1039] (3/4) Epoch 4, batch 3200, loss[loss=0.2673, simple_loss=0.3315, pruned_loss=0.1016, over 24006.00 frames. ], tot_loss[loss=0.2619, simple_loss=0.3166, pruned_loss=0.1036, over 4716893.92 frames. ], batch size: 80, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:05:46,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=127573.33333333333, ans=0.1 2023-09-28 20:05:48,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:05:49,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:05:49,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 20:05:52,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:05:57,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:06:02,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:06:02,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=127640.0, ans=0.2 2023-09-28 20:06:12,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:06:23,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 20:06:23,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:06:27,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 20:06:27,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=127706.66666666667, ans=0.125 2023-09-28 20:06:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:06:32,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:06:32,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:06:34,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=127773.33333333333, ans=0.1 2023-09-28 20:06:35,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:06:38,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 20:06:40,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 20:06:43,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 20:06:45,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 20:06:48,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:06:54,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:06:54,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:06:54,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=127840.0, ans=0.125 2023-09-28 20:06:55,536 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 20:06:55,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:06:59,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:06:59,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 20:07:01,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 20:07:01,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 20:07:02,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 20:07:04,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=127840.0, ans=0.125 2023-09-28 20:07:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:07:07,652 INFO [train.py:1039] (3/4) Epoch 4, batch 3250, loss[loss=0.2588, simple_loss=0.3107, pruned_loss=0.1034, over 23801.00 frames. ], tot_loss[loss=0.2624, simple_loss=0.3166, pruned_loss=0.1041, over 4707816.65 frames. ], batch size: 212, lr: 2.37e-02, grad_scale: 32.0 2023-09-28 20:07:09,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:07:09,343 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 20:07:09,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:09,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:12,401 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 20:07:16,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:07:17,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:25,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=127973.33333333333, ans=0.0 2023-09-28 20:07:28,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:07:28,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 20:07:30,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:30,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:07:32,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:32,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:32,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:07:34,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:34,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:07:34,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:36,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:36,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:07:37,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:07:40,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:07:41,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:41,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:07:43,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:07:44,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:07:44,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:07:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 20:07:51,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:07:52,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:07:52,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:07:52,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=128040.0, ans=0.125 2023-09-28 20:07:55,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:07:59,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:08:06,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:06,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:06,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 20:08:06,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:08:07,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:08:07,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:10,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 20:08:11,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 20:08:11,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:08:13,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:13,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:14,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 20:08:16,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:08:19,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:19,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:21,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 20:08:21,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:23,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:08:23,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 20:08:25,972 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.244e+02 2.605e+02 3.006e+02 4.571e+02, threshold=5.210e+02, percent-clipped=0.0 2023-09-28 20:08:26,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:08:26,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 20:08:27,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 20:08:29,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 20:08:29,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:08:30,576 INFO [train.py:1039] (3/4) Epoch 4, batch 3300, loss[loss=0.2679, simple_loss=0.3337, pruned_loss=0.101, over 24323.00 frames. ], tot_loss[loss=0.2647, simple_loss=0.3188, pruned_loss=0.1053, over 4702849.70 frames. ], batch size: 74, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:08:30,893 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:08:35,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:08:35,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:08:37,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:39,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:08:39,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:08:42,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:43,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:08:49,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 20:08:49,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:08:49,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:08:52,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:08:52,232 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 20:08:55,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:08:55,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:08:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:08:57,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:08:57,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 20:08:57,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=128306.66666666667, ans=0.0 2023-09-28 20:08:57,766 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:08:59,268 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:09:00,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:00,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:09:02,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:02,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 20:09:04,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:09:04,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:05,366 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:09:06,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:09:08,047 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 20:09:10,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 20:09:11,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:09:15,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 20:09:18,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:19,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:09:21,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:09:24,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:25,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:09:25,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:09:28,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:09:28,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:30,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:09:31,862 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 20:09:34,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 20:09:37,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:09:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:09:39,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:40,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:09:40,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:43,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:09:43,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:43,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:09:45,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:09:46,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:09:50,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 20:09:50,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:09:50,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:09:51,892 INFO [train.py:1039] (3/4) Epoch 4, batch 3350, loss[loss=0.2646, simple_loss=0.3133, pruned_loss=0.108, over 23359.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3182, pruned_loss=0.1038, over 4716536.99 frames. ], batch size: 119, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:09:53,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:09:53,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:09:53,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=128573.33333333333, ans=0.125 2023-09-28 20:09:55,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:09:56,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:09:56,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:00,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:10:00,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:04,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:10:06,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:07,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=128640.0, ans=0.0 2023-09-28 20:10:09,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:10:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:10,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:10:10,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 20:10:13,825 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 20:10:13,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:10:14,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=128640.0, ans=0.125 2023-09-28 20:10:15,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 20:10:15,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 20:10:16,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:10:18,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:10:19,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:20,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 20:10:21,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:21,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:10:23,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:26,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:26,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:28,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:10:30,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=128706.66666666667, ans=0.2 2023-09-28 20:10:32,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:33,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:33,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:35,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=128706.66666666667, ans=0.0 2023-09-28 20:10:38,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:10:38,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:10:39,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:39,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:43,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:44,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-09-28 20:10:44,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 20:10:44,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:10:45,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 20:10:46,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:10:46,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 20:10:48,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:10:49,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:10:57,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:10:58,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 20:10:58,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:01,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:11:01,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=128840.0, ans=0.0 2023-09-28 20:11:03,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:11:08,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:10,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.829e+02 2.437e+02 2.848e+02 3.538e+02 5.302e+02, threshold=5.697e+02, percent-clipped=3.0 2023-09-28 20:11:11,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 20:11:11,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:11:11,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:11:14,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:14,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 20:11:14,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:11:14,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 20:11:16,246 INFO [train.py:1039] (3/4) Epoch 4, batch 3400, loss[loss=0.2679, simple_loss=0.3265, pruned_loss=0.1046, over 23463.00 frames. ], tot_loss[loss=0.2642, simple_loss=0.3191, pruned_loss=0.1047, over 4723856.72 frames. ], batch size: 93, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:11:16,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:16,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:11:18,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:11:18,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:11:18,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 20:11:23,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=128906.66666666667, ans=0.0 2023-09-28 20:11:24,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 20:11:24,329 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 20:11:24,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:11:30,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:11:30,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:11:30,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=128973.33333333333, ans=0.125 2023-09-28 20:11:31,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:33,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:11:37,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:11:40,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 20:11:45,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:11:47,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:11:47,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:11:48,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:11:51,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=129040.0, ans=0.0 2023-09-28 20:11:56,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:12:00,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=129040.0, ans=0.04949747468305833 2023-09-28 20:12:01,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 20:12:02,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-09-28 20:12:06,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:12:07,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 20:12:08,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:09,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:12:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:12:11,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:12:13,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:12:16,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:12:16,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:12:23,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:25,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 20:12:34,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:12:37,273 INFO [train.py:1039] (3/4) Epoch 4, batch 3450, loss[loss=0.2501, simple_loss=0.3139, pruned_loss=0.09316, over 24643.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3184, pruned_loss=0.1043, over 4728795.27 frames. ], batch size: 65, lr: 2.36e-02, grad_scale: 32.0 2023-09-28 20:12:38,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 20:12:42,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 20:12:42,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:12:43,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:12:43,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 20:12:45,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:12:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:12:55,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:12:57,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:12:59,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:12:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:01,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:02,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.87 vs. limit=15.0 2023-09-28 20:13:08,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 20:13:12,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 20:13:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:13:14,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:13:15,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:21,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=129373.33333333333, ans=0.07 2023-09-28 20:13:22,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 20:13:22,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:13:25,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:27,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:13:30,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:13:30,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:13:30,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=129440.0, ans=0.1 2023-09-28 20:13:32,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 20:13:32,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:13:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:13:36,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=129440.0, ans=0.125 2023-09-28 20:13:37,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:13:40,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 20:13:43,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:13:48,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:13:50,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:50,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=129506.66666666667, ans=0.0 2023-09-28 20:13:51,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:13:55,850 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.743e+02 2.310e+02 2.657e+02 3.151e+02 5.022e+02, threshold=5.313e+02, percent-clipped=0.0 2023-09-28 20:13:57,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:13:57,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:13:57,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:13:59,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:14:01,135 INFO [train.py:1039] (3/4) Epoch 4, batch 3500, loss[loss=0.28, simple_loss=0.3222, pruned_loss=0.1188, over 23836.00 frames. ], tot_loss[loss=0.2616, simple_loss=0.3169, pruned_loss=0.1032, over 4737713.46 frames. ], batch size: 179, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:14:04,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:06,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:14:08,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 20:14:08,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=129573.33333333333, ans=0.1 2023-09-28 20:14:10,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:14:12,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:14:15,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:14:15,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 20:14:22,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:14:23,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:14:23,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:14:23,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:25,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:14:25,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:25,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:25,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 20:14:25,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=129640.0, ans=10.0 2023-09-28 20:14:29,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:31,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:14:32,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:37,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:37,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 20:14:37,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:14:40,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:14:43,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:14:43,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:45,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:14:45,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:47,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=13.33 vs. limit=15.0 2023-09-28 20:14:48,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 20:14:48,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 20:14:49,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 20:14:51,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:14:51,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:14:52,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:14:52,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:14:56,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:14:57,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:15:04,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:06,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 20:15:06,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 20:15:06,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:07,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:09,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:10,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=129840.0, ans=0.0 2023-09-28 20:15:11,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:11,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=129840.0, ans=10.0 2023-09-28 20:15:14,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 20:15:14,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:15:15,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:15:18,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 20:15:19,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 20:15:20,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=129840.0, ans=0.0 2023-09-28 20:15:21,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:15:22,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:15:22,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:22,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:24,326 INFO [train.py:1039] (3/4) Epoch 4, batch 3550, loss[loss=0.2352, simple_loss=0.3051, pruned_loss=0.08265, over 24270.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.316, pruned_loss=0.1025, over 4731006.38 frames. ], batch size: 74, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:15:27,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:15:34,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:37,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 20:15:39,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=129973.33333333333, ans=0.125 2023-09-28 20:15:41,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:15:42,916 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=17.86 vs. limit=22.5 2023-09-28 20:15:43,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:15:46,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:15:46,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:15:46,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:15:51,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:15:51,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:15:52,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:15:52,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:15:53,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:15:59,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:15:59,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:16:00,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.69 vs. limit=6.0 2023-09-28 20:16:01,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:01,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:02,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:16:02,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 20:16:02,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:04,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:05,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 20:16:06,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=130040.0, ans=0.0 2023-09-28 20:16:06,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=130040.0, ans=0.125 2023-09-28 20:16:09,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:11,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:16:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:15,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=130106.66666666667, ans=0.0 2023-09-28 20:16:16,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 20:16:18,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:16:18,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 20:16:18,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=130106.66666666667, ans=0.5 2023-09-28 20:16:19,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:16:21,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:16:21,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:16:24,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 20:16:24,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:16:32,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 20:16:33,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:36,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:16:39,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 20:16:41,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=130173.33333333333, ans=0.0 2023-09-28 20:16:42,188 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.807e+02 2.286e+02 2.757e+02 3.216e+02 5.394e+02, threshold=5.514e+02, percent-clipped=1.0 2023-09-28 20:16:44,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 20:16:44,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:16:45,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=20.48 vs. limit=22.5 2023-09-28 20:16:46,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:16:46,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:48,286 INFO [train.py:1039] (3/4) Epoch 4, batch 3600, loss[loss=0.3328, simple_loss=0.3515, pruned_loss=0.1571, over 19192.00 frames. ], tot_loss[loss=0.261, simple_loss=0.3161, pruned_loss=0.103, over 4721745.31 frames. ], batch size: 389, lr: 2.35e-02, grad_scale: 32.0 2023-09-28 20:16:48,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:16:50,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:16:53,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=130240.0, ans=0.125 2023-09-28 20:16:54,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:16:56,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:16:57,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:16:58,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=130240.0, ans=0.2 2023-09-28 20:16:59,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:17:01,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:01,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 20:17:03,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=130306.66666666667, ans=0.0 2023-09-28 20:17:04,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:17:05,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=130306.66666666667, ans=0.09899494936611666 2023-09-28 20:17:06,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:10,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:14,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:15,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:17:15,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:17:15,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 20:17:17,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:17:20,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:17:20,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:17:22,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:24,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:17:26,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:17:28,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 20:17:35,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:17:35,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:17:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 20:17:41,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:17:45,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:17:54,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:17:54,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:17:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 20:17:56,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=130506.66666666667, ans=0.035 2023-09-28 20:17:57,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 20:18:00,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 20:18:02,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:18:02,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:18:03,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 20:18:05,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:05,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:18:05,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:07,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 20:18:07,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 20:18:08,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=130506.66666666667, ans=0.125 2023-09-28 20:18:10,880 INFO [train.py:1039] (3/4) Epoch 4, batch 3650, loss[loss=0.2818, simple_loss=0.3228, pruned_loss=0.1204, over 23764.00 frames. ], tot_loss[loss=0.2617, simple_loss=0.3167, pruned_loss=0.1033, over 4726855.46 frames. ], batch size: 195, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:18:11,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:18:11,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 20:18:17,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 20:18:18,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:18:23,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 20:18:24,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 20:18:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:18:29,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:18:29,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:18:34,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.20 vs. limit=22.5 2023-09-28 20:18:35,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 20:18:35,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:18:37,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 20:18:37,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:18:39,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:18:39,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 20:18:39,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:18:41,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:18:41,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:18:43,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:18:45,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=130706.66666666667, ans=0.125 2023-09-28 20:18:46,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 20:18:47,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 20:18:49,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:18:52,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 20:18:53,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:18:53,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:18:58,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:19:00,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:00,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:19:02,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:19:03,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:19:04,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:19:06,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:09,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:09,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:19:11,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:19:13,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:19:13,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:21,137 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 20:19:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:24,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:19:25,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:19:25,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:27,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:19:28,557 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.789e+02 2.316e+02 2.706e+02 3.127e+02 4.745e+02, threshold=5.412e+02, percent-clipped=0.0 2023-09-28 20:19:28,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:30,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 20:19:30,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:33,355 INFO [train.py:1039] (3/4) Epoch 4, batch 3700, loss[loss=0.25, simple_loss=0.3212, pruned_loss=0.08937, over 24452.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3178, pruned_loss=0.1046, over 4717246.59 frames. ], batch size: 77, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:19:34,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:19:38,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:19:39,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:19:42,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:42,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 20:19:42,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:19:42,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:19:43,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=130906.66666666667, ans=0.0 2023-09-28 20:19:44,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:19:47,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:19:51,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:19:51,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:53,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:19:53,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:19:53,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:19:55,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:19:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 20:20:05,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:20:07,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:20:07,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:20:07,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 20:20:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:12,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:13,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 20:20:15,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:16,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:20:19,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131040.0, ans=0.1 2023-09-28 20:20:20,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:20,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:20:23,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:20:25,580 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.629e-02 2023-09-28 20:20:26,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:20:26,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 20:20:28,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:20:28,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 20:20:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:20:32,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:20:37,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:37,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 20:20:39,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:20:39,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:20:40,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:40,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:20:44,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:20:45,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 20:20:45,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 20:20:47,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:20:47,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:20:48,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:20:50,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:20:53,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:20:55,678 INFO [train.py:1039] (3/4) Epoch 4, batch 3750, loss[loss=0.2491, simple_loss=0.3221, pruned_loss=0.08809, over 24367.00 frames. ], tot_loss[loss=0.2645, simple_loss=0.3193, pruned_loss=0.1048, over 4724013.72 frames. ], batch size: 77, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:20:55,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:20:56,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=131240.0, ans=0.125 2023-09-28 20:20:58,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:20:59,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 20:21:01,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:21:04,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:21:04,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 20:21:06,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:21:07,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:21:09,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:12,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:17,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:21:18,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:21:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:21:22,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:23,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 20:21:23,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:25,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:25,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:21:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 20:21:34,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 20:21:36,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:21:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:21:39,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:21:41,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=131373.33333333334, ans=0.2 2023-09-28 20:21:44,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:45,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:21:50,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 20:21:50,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=131440.0, ans=0.125 2023-09-28 20:21:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:21:54,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=131440.0, ans=0.1 2023-09-28 20:21:57,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:21:59,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:22:03,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:22:06,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 20:22:06,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:22:10,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:22:11,190 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.67 vs. limit=15.0 2023-09-28 20:22:11,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:22:13,147 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.509e+02 2.927e+02 3.521e+02 5.743e+02, threshold=5.855e+02, percent-clipped=1.0 2023-09-28 20:22:13,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:22:13,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=131506.66666666666, ans=0.07 2023-09-28 20:22:17,710 INFO [train.py:1039] (3/4) Epoch 4, batch 3800, loss[loss=0.2426, simple_loss=0.2934, pruned_loss=0.09591, over 21489.00 frames. ], tot_loss[loss=0.2656, simple_loss=0.3194, pruned_loss=0.1059, over 4717718.14 frames. ], batch size: 47, lr: 2.34e-02, grad_scale: 32.0 2023-09-28 20:22:22,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=131573.33333333334, ans=0.2 2023-09-28 20:22:23,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:22:26,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:27,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 20:22:28,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 20:22:30,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:31,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:33,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:22:36,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:22:36,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:22:39,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:22:39,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:22:39,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:42,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 20:22:45,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 20:22:45,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:22:48,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:22:51,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:22:52,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:22:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:22:54,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:22:57,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:22:58,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:23:02,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:23:02,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 20:23:03,097 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=12.0 2023-09-28 20:23:05,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:07,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=131773.33333333334, ans=0.125 2023-09-28 20:23:12,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:17,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:23:19,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 20:23:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 20:23:24,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:24,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:23:25,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:27,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 20:23:30,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 20:23:30,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 20:23:31,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:33,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:23:39,604 INFO [train.py:1039] (3/4) Epoch 4, batch 3850, loss[loss=0.2478, simple_loss=0.3205, pruned_loss=0.08759, over 24301.00 frames. ], tot_loss[loss=0.2643, simple_loss=0.318, pruned_loss=0.1053, over 4700486.46 frames. ], batch size: 74, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:23:39,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:23:39,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:23:45,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:23:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 20:23:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:23:47,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:23:52,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:23:55,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:23:58,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:24:00,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 20:24:05,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:06,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:24:08,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:09,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:24:12,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:15,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:24:15,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:15,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:24:16,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:18,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:24:19,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:19,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:24:22,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 20:24:22,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 20:24:22,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:24:22,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:22,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=132040.0, ans=0.125 2023-09-28 20:24:25,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:25,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:25,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 20:24:28,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 20:24:31,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:33,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 20:24:36,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 20:24:42,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:43,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:24:48,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:24:48,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 20:24:51,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=132173.33333333334, ans=0.015 2023-09-28 20:24:52,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 20:24:54,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:56,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:24:57,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.696e+02 2.471e+02 2.833e+02 3.573e+02 5.682e+02, threshold=5.667e+02, percent-clipped=0.0 2023-09-28 20:24:59,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:24:59,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:25:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:01,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:25:01,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 20:25:02,767 INFO [train.py:1039] (3/4) Epoch 4, batch 3900, loss[loss=0.2317, simple_loss=0.3024, pruned_loss=0.08049, over 24609.00 frames. ], tot_loss[loss=0.2623, simple_loss=0.3162, pruned_loss=0.1042, over 4698158.95 frames. ], batch size: 68, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:25:02,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:25:04,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 20:25:04,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:04,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:06,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:25:07,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:09,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:25:09,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:25:09,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:25:09,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:09,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 20:25:09,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:14,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:15,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:15,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:25:17,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:25:20,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:25:20,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:23,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:25:23,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=132306.66666666666, ans=0.125 2023-09-28 20:25:25,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 20:25:25,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:27,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 20:25:28,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:25:30,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 20:25:30,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 20:25:30,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=132306.66666666666, ans=0.0 2023-09-28 20:25:37,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:37,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:25:37,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:25:39,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:25:42,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:25:44,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:25:46,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:25:46,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:25:48,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:25:54,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:25:54,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:26:02,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:26:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:26:09,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.05 vs. limit=15.0 2023-09-28 20:26:15,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:18,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:18,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 20:26:18,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 20:26:18,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:26:19,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 20:26:21,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:26:21,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 20:26:24,461 INFO [train.py:1039] (3/4) Epoch 4, batch 3950, loss[loss=0.2502, simple_loss=0.3197, pruned_loss=0.0904, over 24661.00 frames. ], tot_loss[loss=0.2604, simple_loss=0.3151, pruned_loss=0.1029, over 4707428.97 frames. ], batch size: 73, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:26:29,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:26:30,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 20:26:32,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:26:34,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:26:37,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:26:42,391 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 20:26:43,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:43,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 20:26:44,029 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 20:26:45,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:26:47,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:48,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:26:48,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:26:51,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 20:26:54,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:26:56,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:26:56,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:26:56,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:26:56,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:27:10,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:27:10,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:27:15,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 20:27:15,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=132773.33333333334, ans=0.2 2023-09-28 20:27:21,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 20:27:21,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 20:27:22,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:27:24,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:27:31,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:27:31,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:27:32,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:27:32,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:27:34,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 20:27:37,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:27:37,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=132840.0, ans=0.125 2023-09-28 20:27:39,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:27:42,851 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.941e+02 2.456e+02 2.836e+02 3.414e+02 5.372e+02, threshold=5.673e+02, percent-clipped=0.0 2023-09-28 20:27:42,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 20:27:48,142 INFO [train.py:1039] (3/4) Epoch 4, batch 4000, loss[loss=0.2234, simple_loss=0.2933, pruned_loss=0.07672, over 24459.00 frames. ], tot_loss[loss=0.2616, simple_loss=0.3159, pruned_loss=0.1036, over 4698291.90 frames. ], batch size: 63, lr: 2.33e-02, grad_scale: 32.0 2023-09-28 20:27:48,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=132906.66666666666, ans=0.125 2023-09-28 20:27:56,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:04,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:08,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:10,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:28:10,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:28:10,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 20:28:10,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=132973.33333333334, ans=0.125 2023-09-28 20:28:10,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=132973.33333333334, ans=0.125 2023-09-28 20:28:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:28:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 20:28:13,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:28:13,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 20:28:14,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:28:18,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:28:18,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:28:18,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:28:18,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:18,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:28:21,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:28:24,122 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 20:28:24,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:28:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:25,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.38 vs. limit=15.0 2023-09-28 20:28:28,899 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 20:28:29,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:28:29,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:33,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.30 vs. limit=12.0 2023-09-28 20:28:37,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 20:28:38,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:28:40,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:28:41,557 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 20:28:43,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:28:43,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 20:28:43,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:28:44,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:44,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:28:46,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:28:46,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:28:47,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:28:50,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 20:28:50,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:28:52,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=133173.33333333334, ans=0.1 2023-09-28 20:28:53,091 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 20:28:57,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:29:02,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 20:29:05,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:29:05,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:05,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:29:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:10,082 INFO [train.py:1039] (3/4) Epoch 4, batch 4050, loss[loss=0.2211, simple_loss=0.2862, pruned_loss=0.07802, over 24630.00 frames. ], tot_loss[loss=0.2617, simple_loss=0.3163, pruned_loss=0.1035, over 4713573.28 frames. ], batch size: 65, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:29:11,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=133240.0, ans=10.0 2023-09-28 20:29:13,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:29:16,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:29:16,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 20:29:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:29:19,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:29:19,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:29:21,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:21,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:23,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=133240.0, ans=0.0 2023-09-28 20:29:26,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:29:30,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:29:30,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 20:29:31,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:29:31,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:29:38,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:29:42,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:29:44,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 20:29:47,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 20:29:47,146 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 20:29:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:29:53,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=133373.33333333334, ans=0.125 2023-09-28 20:29:53,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=133373.33333333334, ans=0.1 2023-09-28 20:29:54,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 20:29:55,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:29:58,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=133373.33333333334, ans=0.025 2023-09-28 20:29:59,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:04,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:30:04,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:30:04,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:30:08,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:30:13,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 20:30:13,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:30:14,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:16,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 20:30:21,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:30:23,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=133506.66666666666, ans=0.125 2023-09-28 20:30:28,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 20:30:29,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:30:29,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:30:30,988 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.888e+02 2.307e+02 2.673e+02 3.242e+02 5.499e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:30:31,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 20:30:32,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 20:30:32,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:35,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:30:36,466 INFO [train.py:1039] (3/4) Epoch 4, batch 4100, loss[loss=0.2907, simple_loss=0.3303, pruned_loss=0.1256, over 23507.00 frames. ], tot_loss[loss=0.2641, simple_loss=0.3179, pruned_loss=0.1051, over 4702893.91 frames. ], batch size: 134, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:30:36,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:36,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:30:44,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 20:30:45,611 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=9.31 vs. limit=12.0 2023-09-28 20:30:46,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 20:30:47,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 20:30:48,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 20:30:49,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:49,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:30:51,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:30:51,775 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 20:30:54,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:30:56,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:30:56,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:30:56,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:31:00,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.35 vs. limit=15.0 2023-09-28 20:31:02,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:31:03,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:31:03,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:31:05,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 20:31:05,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:05,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:31:05,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:05,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:31:06,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 20:31:11,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:12,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=133706.66666666666, ans=0.125 2023-09-28 20:31:13,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 20:31:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:31:17,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:31:17,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 20:31:18,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:31:18,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:31:20,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:31:21,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 20:31:23,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:31:24,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:31:25,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 20:31:27,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:31:27,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:27,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=133773.33333333334, ans=0.125 2023-09-28 20:31:30,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:31,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=133773.33333333334, ans=0.125 2023-09-28 20:31:34,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:31:36,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=133773.33333333334, ans=0.0 2023-09-28 20:31:39,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:39,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:31:47,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.37 vs. limit=15.0 2023-09-28 20:31:50,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:31:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:31:53,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:31:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:31:57,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.91 vs. limit=15.0 2023-09-28 20:31:58,096 INFO [train.py:1039] (3/4) Epoch 4, batch 4150, loss[loss=0.2405, simple_loss=0.3071, pruned_loss=0.08693, over 24491.00 frames. ], tot_loss[loss=0.264, simple_loss=0.3182, pruned_loss=0.1049, over 4708418.53 frames. ], batch size: 63, lr: 2.32e-02, grad_scale: 32.0 2023-09-28 20:31:59,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:31:59,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:32:01,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:32:01,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:06,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 20:32:07,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:07,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 20:32:09,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 20:32:09,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 20:32:11,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:32:15,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:32:15,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:19,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=133973.33333333334, ans=0.125 2023-09-28 20:32:20,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:22,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:32:22,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:32:26,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:32:26,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:32:27,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.78 vs. limit=15.0 2023-09-28 20:32:28,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:32:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:32:34,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:35,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 20:32:39,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 20:32:39,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:32:39,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 20:32:39,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:32:39,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:32:42,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:32:44,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:32:48,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 20:32:51,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:32:54,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:32:56,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 20:32:56,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:32:56,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=134106.66666666666, ans=0.0 2023-09-28 20:32:58,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 20:32:59,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:33:01,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:33:02,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:02,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 20:33:02,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:02,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 20:33:04,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:33:06,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 20:33:06,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:06,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:33:06,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:33:07,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 20:33:09,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:33:09,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 20:33:09,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=134173.33333333334, ans=0.125 2023-09-28 20:33:10,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:33:12,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:33:13,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 20:33:13,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 20:33:16,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.811e+02 2.342e+02 2.661e+02 3.100e+02 4.687e+02, threshold=5.322e+02, percent-clipped=0.0 2023-09-28 20:33:19,944 INFO [train.py:1039] (3/4) Epoch 4, batch 4200, loss[loss=0.2708, simple_loss=0.3196, pruned_loss=0.111, over 23377.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3174, pruned_loss=0.1042, over 4720316.87 frames. ], batch size: 119, lr: 2.32e-02, grad_scale: 16.0 2023-09-28 20:33:20,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:33:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 20:33:23,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:33:24,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:26,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:33:27,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:27,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:33:30,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 20:33:34,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.42 vs. limit=15.0 2023-09-28 20:33:35,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 20:33:35,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:38,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:40,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:33:43,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:33:45,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:33:45,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:45,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=134306.66666666666, ans=0.0 2023-09-28 20:33:47,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 20:33:47,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:33:48,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:48,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:33:48,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:33:50,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:33:52,041 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:33:53,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 20:33:53,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:33:57,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 20:33:59,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:34:03,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:34:03,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:05,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:34:05,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 20:34:05,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:07,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:34:07,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.69 vs. limit=15.0 2023-09-28 20:34:12,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=134440.0, ans=0.95 2023-09-28 20:34:13,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:34:14,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:21,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:34:23,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=134440.0, ans=0.125 2023-09-28 20:34:23,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.57 vs. limit=15.0 2023-09-28 20:34:24,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 20:34:26,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=134506.66666666666, ans=0.1 2023-09-28 20:34:27,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:34:31,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:34:32,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:35,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 20:34:42,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 20:34:43,559 INFO [train.py:1039] (3/4) Epoch 4, batch 4250, loss[loss=0.2551, simple_loss=0.3192, pruned_loss=0.09546, over 24082.00 frames. ], tot_loss[loss=0.2616, simple_loss=0.3159, pruned_loss=0.1036, over 4707327.98 frames. ], batch size: 80, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:34:45,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:34:45,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 20:34:48,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:53,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:34:53,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 20:34:53,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:34:56,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:34:59,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:04,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:06,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:08,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:35:08,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:10,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:11,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:13,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:15,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:35:18,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:18,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=134706.66666666666, ans=0.04949747468305833 2023-09-28 20:35:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 20:35:22,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 20:35:22,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:24,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:35:24,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:35:26,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:35:26,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:26,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:35:26,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=134706.66666666666, ans=0.1 2023-09-28 20:35:31,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:35:32,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:35:35,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:35:37,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:38,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 20:35:38,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:35:41,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 20:35:41,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=134773.33333333334, ans=0.07 2023-09-28 20:35:42,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:35:44,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:35:46,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:46,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:35:48,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 20:35:49,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:35:50,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=134840.0, ans=0.125 2023-09-28 20:35:51,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:35:54,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:35:56,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:35:58,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.27 vs. limit=15.0 2023-09-28 20:35:58,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:36:00,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:00,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=134840.0, ans=0.125 2023-09-28 20:36:02,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.761e+02 2.342e+02 2.586e+02 3.220e+02 5.035e+02, threshold=5.173e+02, percent-clipped=0.0 2023-09-28 20:36:02,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:02,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:36:04,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:04,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 20:36:05,373 INFO [train.py:1039] (3/4) Epoch 4, batch 4300, loss[loss=0.2342, simple_loss=0.308, pruned_loss=0.08014, over 24476.00 frames. ], tot_loss[loss=0.2609, simple_loss=0.3155, pruned_loss=0.1032, over 4716271.69 frames. ], batch size: 63, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:36:05,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:11,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:36:11,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:15,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:36:22,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=134973.33333333334, ans=0.0 2023-09-28 20:36:23,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:36:23,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 20:36:26,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:36:27,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:36:27,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:36:27,913 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 20:36:29,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=134973.33333333334, ans=0.125 2023-09-28 20:36:31,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:36:32,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:36:37,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 20:36:37,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:36:39,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 20:36:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:36:42,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:36:43,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=135040.0, ans=0.0 2023-09-28 20:36:44,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:36:44,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:36:44,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=135040.0, ans=0.1 2023-09-28 20:36:45,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:36:47,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:49,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:36:49,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 20:36:50,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 20:36:53,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:36:56,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:56,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:36:56,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:36:57,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:36:57,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 20:36:57,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 20:36:58,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 20:36:59,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:36:59,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 20:36:59,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 20:37:05,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 20:37:07,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:37:08,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:08,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:37:09,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=135173.33333333334, ans=0.125 2023-09-28 20:37:10,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 20:37:10,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:37:10,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:11,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.65 vs. limit=15.0 2023-09-28 20:37:12,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:12,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:14,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:37:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:37:18,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:20,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:37:20,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:37:28,122 INFO [train.py:1039] (3/4) Epoch 4, batch 4350, loss[loss=0.3521, simple_loss=0.3746, pruned_loss=0.1648, over 19371.00 frames. ], tot_loss[loss=0.2617, simple_loss=0.3165, pruned_loss=0.1035, over 4718994.52 frames. ], batch size: 388, lr: 2.31e-02, grad_scale: 16.0 2023-09-28 20:37:28,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 20:37:28,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 20:37:30,588 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.54 vs. limit=15.0 2023-09-28 20:37:34,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:37:37,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:39,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:37:39,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:37:45,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:37:47,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:37:50,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:37:50,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:37:53,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:37:54,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=135306.66666666666, ans=0.1 2023-09-28 20:37:55,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:37:56,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=135306.66666666666, ans=0.125 2023-09-28 20:37:58,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:38:03,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=135373.33333333334, ans=0.0 2023-09-28 20:38:04,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 20:38:06,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:06,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:12,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:13,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 20:38:16,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:18,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:38:24,898 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 20:38:26,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:38:27,933 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 20:38:29,444 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 20:38:29,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:29,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:38:29,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:38:29,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:38:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:38:32,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:38:35,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 20:38:35,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:35,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:35,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:37,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 20:38:39,150 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 20:38:39,168 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 20:38:39,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 20:38:42,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:38:42,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:38:43,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:38:43,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:38:46,866 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.300e+02 2.585e+02 3.110e+02 4.848e+02, threshold=5.170e+02, percent-clipped=0.0 2023-09-28 20:38:47,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 20:38:48,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 20:38:48,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:50,039 INFO [train.py:1039] (3/4) Epoch 4, batch 4400, loss[loss=0.2936, simple_loss=0.3341, pruned_loss=0.1265, over 23741.00 frames. ], tot_loss[loss=0.2635, simple_loss=0.3179, pruned_loss=0.1045, over 4711971.13 frames. ], batch size: 179, lr: 2.31e-02, grad_scale: 32.0 2023-09-28 20:38:53,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:38:53,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:38:56,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:38:58,927 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.49 vs. limit=12.0 2023-09-28 20:38:59,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 20:38:59,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 20:38:59,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 20:38:59,968 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 20:39:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:39:01,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:39:01,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=135573.33333333334, ans=0.1 2023-09-28 20:39:03,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 20:39:05,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:05,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=135640.0, ans=0.125 2023-09-28 20:39:06,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:06,901 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 20:39:11,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:11,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 20:39:12,056 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 20:39:15,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 20:39:15,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 20:39:15,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 20:39:15,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:17,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:17,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:39:19,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:20,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 20:39:20,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 20:39:21,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.68 vs. limit=22.5 2023-09-28 20:39:22,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:23,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=135706.66666666666, ans=0.125 2023-09-28 20:39:25,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:39:25,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:39:26,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:28,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:39:28,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 20:39:29,629 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 20:39:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:39:39,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:39:41,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 20:39:45,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:39:50,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:39:51,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:39:51,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 20:39:51,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:39:51,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:39:51,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:39:53,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:39:58,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 20:39:58,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=135840.0, ans=0.09899494936611666 2023-09-28 20:40:01,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 20:40:02,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 20:40:02,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:02,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 20:40:02,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=135840.0, ans=0.1 2023-09-28 20:40:04,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:40:05,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:40:08,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 20:40:09,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=135840.0, ans=0.1 2023-09-28 20:40:11,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:40:13,260 INFO [train.py:1039] (3/4) Epoch 4, batch 4450, loss[loss=0.2631, simple_loss=0.3342, pruned_loss=0.09599, over 24448.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3177, pruned_loss=0.1041, over 4718000.89 frames. ], batch size: 69, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:40:16,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:17,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:40:26,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:40:26,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:40:31,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:31,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:40:33,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:40:34,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:36,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 20:40:36,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:38,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:40:38,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:40:38,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 20:40:38,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=135973.33333333334, ans=0.2 2023-09-28 20:40:39,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:40:39,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=135973.33333333334, ans=0.0 2023-09-28 20:40:44,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:40:46,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:40:47,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:40:49,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:40:52,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=136040.0, ans=0.1 2023-09-28 20:40:55,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 20:40:57,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 20:40:57,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 20:40:57,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:41:01,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:03,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 20:41:06,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:41:09,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:09,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 20:41:09,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:09,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:09,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:41:09,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:41:12,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:41:15,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 20:41:17,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 20:41:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:41:20,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:41:22,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:41:23,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.73 vs. limit=22.5 2023-09-28 20:41:24,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:25,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:41:26,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:41:31,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 20:41:32,907 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.860e+02 2.347e+02 2.673e+02 3.318e+02 4.703e+02, threshold=5.347e+02, percent-clipped=0.0 2023-09-28 20:41:33,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:41:35,982 INFO [train.py:1039] (3/4) Epoch 4, batch 4500, loss[loss=0.2729, simple_loss=0.3414, pruned_loss=0.1022, over 24244.00 frames. ], tot_loss[loss=0.2625, simple_loss=0.3175, pruned_loss=0.1038, over 4721414.02 frames. ], batch size: 74, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:41:39,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:40,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 20:41:40,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 20:41:42,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:41:45,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:41:47,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:41:47,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 20:41:48,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:41:48,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:48,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:41:59,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=136306.66666666666, ans=0.0 2023-09-28 20:42:00,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=136306.66666666666, ans=0.0 2023-09-28 20:42:02,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:42:04,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:42:04,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=136306.66666666666, ans=0.0 2023-09-28 20:42:07,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:08,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:42:08,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:42:15,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:42:20,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:42:21,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=136373.33333333334, ans=0.125 2023-09-28 20:42:24,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:42:27,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:42:27,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 20:42:29,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:29,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:42:31,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:42:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:42:34,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 20:42:34,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 20:42:34,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:39,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:42:40,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:42:43,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:42:44,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:42:46,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:42:47,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 20:42:50,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 20:42:50,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 20:42:55,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 20:42:56,812 INFO [train.py:1039] (3/4) Epoch 4, batch 4550, loss[loss=0.275, simple_loss=0.3221, pruned_loss=0.1139, over 23324.00 frames. ], tot_loss[loss=0.2612, simple_loss=0.3156, pruned_loss=0.1034, over 4719915.36 frames. ], batch size: 119, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:42:57,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 20:42:59,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:03,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:05,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:43:09,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:12,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:43:14,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:43:16,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:16,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:43:16,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:20,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:43:20,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:43:23,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:43:26,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 20:43:28,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 20:43:29,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:43:29,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 20:43:32,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 20:43:34,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:37,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 20:43:39,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:43:44,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:44,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:43:46,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 20:43:49,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:43:51,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:43:51,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:43:52,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:43:54,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 20:43:55,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 20:43:56,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:43:57,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 20:44:00,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 20:44:00,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:44:00,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:02,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:02,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:02,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:44:04,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=136840.0, ans=0.09899494936611666 2023-09-28 20:44:05,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:44:05,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 20:44:06,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:44:06,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:44:08,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 20:44:08,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:44:08,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 20:44:10,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=136840.0, ans=0.0 2023-09-28 20:44:11,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:44:11,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:44:15,292 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.851e+02 2.285e+02 2.509e+02 2.934e+02 4.311e+02, threshold=5.019e+02, percent-clipped=0.0 2023-09-28 20:44:16,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:44:16,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:44:16,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=136840.0, ans=0.1 2023-09-28 20:44:17,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 20:44:19,148 INFO [train.py:1039] (3/4) Epoch 4, batch 4600, loss[loss=0.2521, simple_loss=0.2725, pruned_loss=0.1158, over 19182.00 frames. ], tot_loss[loss=0.2602, simple_loss=0.3153, pruned_loss=0.1026, over 4726533.33 frames. ], batch size: 388, lr: 2.30e-02, grad_scale: 32.0 2023-09-28 20:44:19,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:44:20,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:44:22,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:23,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:44:26,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:44:26,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:44:27,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:28,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 20:44:30,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:44:34,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:44:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:37,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:43,064 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=6.32 vs. limit=15.0 2023-09-28 20:44:45,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 20:44:47,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:50,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:44:54,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:44:54,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:44:56,413 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=15.0 2023-09-28 20:44:58,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 20:44:58,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:44:59,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:03,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:03,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:45:04,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=137040.0, ans=0.2 2023-09-28 20:45:04,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=137040.0, ans=0.04949747468305833 2023-09-28 20:45:05,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:45:09,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 20:45:11,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 20:45:14,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:16,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:45:20,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.26 vs. limit=10.0 2023-09-28 20:45:20,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:20,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 20:45:21,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:22,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 20:45:22,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:23,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:25,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:45:25,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:45:27,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:28,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 20:45:28,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 20:45:28,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 20:45:28,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:31,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:33,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:45:39,437 INFO [train.py:1039] (3/4) Epoch 4, batch 4650, loss[loss=0.2631, simple_loss=0.3267, pruned_loss=0.09978, over 24659.00 frames. ], tot_loss[loss=0.2599, simple_loss=0.3153, pruned_loss=0.1022, over 4731111.19 frames. ], batch size: 73, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:45:42,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:45:45,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:45,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:45,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:45:46,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:45:47,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:45:48,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=137240.0, ans=0.125 2023-09-28 20:45:49,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:45:53,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 20:45:57,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:45:57,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=137306.66666666666, ans=0.0 2023-09-28 20:45:59,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 20:45:59,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:45:59,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 20:46:00,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:46:02,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 20:46:02,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 20:46:02,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:02,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:46:08,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:46:08,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:09,933 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 20:46:13,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:13,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 20:46:16,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:16,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:46:17,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 20:46:20,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:46:23,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:46:28,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:29,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=137440.0, ans=0.0 2023-09-28 20:46:32,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:33,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=137440.0, ans=0.125 2023-09-28 20:46:35,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:35,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:46:37,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:46:38,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 20:46:40,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 20:46:40,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 20:46:40,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 20:46:42,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:46:45,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=137506.66666666666, ans=0.125 2023-09-28 20:46:49,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:46:49,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:46:49,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 20:46:49,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:46:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:51,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:46:53,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:46:56,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:46:56,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:46:56,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:46:57,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.03 vs. limit=15.0 2023-09-28 20:46:58,905 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.187e+02 2.531e+02 2.951e+02 4.992e+02, threshold=5.061e+02, percent-clipped=0.0 2023-09-28 20:47:00,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:02,508 INFO [train.py:1039] (3/4) Epoch 4, batch 4700, loss[loss=0.2366, simple_loss=0.2927, pruned_loss=0.09026, over 24463.00 frames. ], tot_loss[loss=0.259, simple_loss=0.3149, pruned_loss=0.1016, over 4735093.55 frames. ], batch size: 58, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:47:02,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:47:02,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:47:02,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 20:47:04,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 20:47:06,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 20:47:10,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.51 vs. limit=22.5 2023-09-28 20:47:13,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=137573.33333333334, ans=0.05 2023-09-28 20:47:14,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:14,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:47:16,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:47:17,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:19,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 20:47:21,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=137640.0, ans=0.0 2023-09-28 20:47:25,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 20:47:25,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 20:47:27,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:28,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:47:28,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:47:32,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:47:34,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=137706.66666666666, ans=0.1 2023-09-28 20:47:39,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:47:42,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 20:47:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:47:49,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=137706.66666666666, ans=0.1 2023-09-28 20:47:51,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 20:47:52,183 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.14 vs. limit=15.0 2023-09-28 20:47:52,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:47:54,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:47:54,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.13 vs. limit=15.0 2023-09-28 20:47:57,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 20:47:58,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:03,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:48:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 20:48:05,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=137773.33333333334, ans=0.0 2023-09-28 20:48:07,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:07,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:10,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:48:12,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:48:12,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 20:48:12,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 20:48:15,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:17,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:17,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 20:48:19,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:48:20,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.13 vs. limit=22.5 2023-09-28 20:48:22,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 20:48:25,432 INFO [train.py:1039] (3/4) Epoch 4, batch 4750, loss[loss=0.2892, simple_loss=0.3268, pruned_loss=0.1258, over 23767.00 frames. ], tot_loss[loss=0.2593, simple_loss=0.3151, pruned_loss=0.1018, over 4729883.90 frames. ], batch size: 232, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:48:25,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:48:27,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:31,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:48:33,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 20:48:33,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:48:36,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 20:48:36,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=137906.66666666666, ans=0.1 2023-09-28 20:48:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:48:39,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:48:39,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:46,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 20:48:51,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:48:52,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=137973.33333333334, ans=0.1 2023-09-28 20:48:54,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 20:48:54,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:48:59,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:48:59,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:48:59,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 20:48:59,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 20:49:02,926 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:49:05,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 20:49:08,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:10,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:11,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:49:11,778 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 20:49:11,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:15,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:49:18,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 20:49:18,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 20:49:20,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 20:49:20,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:49:20,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=138106.66666666666, ans=0.2 2023-09-28 20:49:21,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:49:22,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:49:22,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:49:24,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 20:49:27,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 20:49:29,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:49:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:49:32,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 20:49:33,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:49:34,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:49:37,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 20:49:37,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:37,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=138173.33333333334, ans=0.025 2023-09-28 20:49:38,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 20:49:43,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:43,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 20:49:44,427 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 2.351e+02 2.944e+02 3.482e+02 5.215e+02, threshold=5.888e+02, percent-clipped=1.0 2023-09-28 20:49:44,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 20:49:46,108 INFO [train.py:1039] (3/4) Epoch 4, batch 4800, loss[loss=0.2449, simple_loss=0.2954, pruned_loss=0.09718, over 24460.00 frames. ], tot_loss[loss=0.2605, simple_loss=0.316, pruned_loss=0.1025, over 4730639.59 frames. ], batch size: 58, lr: 2.29e-02, grad_scale: 32.0 2023-09-28 20:49:46,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 20:49:47,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:49:49,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:49:51,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 20:49:56,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:49:58,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:49:59,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.01 vs. limit=12.0 2023-09-28 20:50:05,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 20:50:05,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:05,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:06,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 20:50:08,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:50:08,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:50:09,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 20:50:11,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=138306.66666666666, ans=0.125 2023-09-28 20:50:13,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:14,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:16,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 20:50:19,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:19,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 20:50:19,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:19,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:22,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:50:25,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:50:25,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:50:27,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 20:50:29,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:33,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 20:50:33,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 20:50:33,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=138373.33333333334, ans=0.125 2023-09-28 20:50:34,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:34,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:50:35,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:50:35,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:35,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:50:36,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:50:37,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:50:38,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=138440.0, ans=0.0 2023-09-28 20:50:41,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:44,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:47,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:50:47,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=138440.0, ans=0.0 2023-09-28 20:50:49,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=138440.0, ans=0.1 2023-09-28 20:50:50,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 20:50:51,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:50:52,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:52,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:50:53,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:50:54,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=138506.66666666666, ans=0.125 2023-09-28 20:50:58,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:50:58,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:50:58,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:50:58,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:51:00,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:51:00,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:51:04,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:04,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:04,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:51:06,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 20:51:08,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=138573.33333333334, ans=0.125 2023-09-28 20:51:08,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.97 vs. limit=12.0 2023-09-28 20:51:09,197 INFO [train.py:1039] (3/4) Epoch 4, batch 4850, loss[loss=0.2589, simple_loss=0.3271, pruned_loss=0.09536, over 24535.00 frames. ], tot_loss[loss=0.2601, simple_loss=0.316, pruned_loss=0.1021, over 4735641.06 frames. ], batch size: 71, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:51:09,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 20:51:09,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:09,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:51:10,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:10,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:14,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:51:20,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=138573.33333333334, ans=0.125 2023-09-28 20:51:21,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 20:51:23,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:26,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:28,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 20:51:28,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:51:31,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:51:34,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:51:36,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:51:36,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 20:51:41,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:51:42,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=138706.66666666666, ans=0.05 2023-09-28 20:51:43,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:51:43,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 20:51:44,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 20:51:45,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 20:51:47,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=138706.66666666666, ans=0.1 2023-09-28 20:51:48,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 20:51:48,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:51:53,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 20:51:54,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 20:51:55,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:52:02,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:52:02,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 20:52:03,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:52:04,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:52:06,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:52:08,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 20:52:08,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:09,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 20:52:09,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:09,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:11,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 20:52:14,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=138840.0, ans=0.125 2023-09-28 20:52:16,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.85 vs. limit=15.0 2023-09-28 20:52:20,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:52:27,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:52:27,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:29,918 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.911e+02 2.282e+02 2.559e+02 2.982e+02 4.179e+02, threshold=5.119e+02, percent-clipped=0.0 2023-09-28 20:52:31,427 INFO [train.py:1039] (3/4) Epoch 4, batch 4900, loss[loss=0.2649, simple_loss=0.3366, pruned_loss=0.09664, over 24648.00 frames. ], tot_loss[loss=0.2593, simple_loss=0.3149, pruned_loss=0.1019, over 4725911.65 frames. ], batch size: 68, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:52:33,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 20:52:33,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:52:41,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:52:42,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:42,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:52:45,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 20:52:49,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 20:52:54,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 20:52:56,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 20:52:56,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:52:56,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:52:56,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:52:58,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:52:58,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 20:52:58,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 20:53:01,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 20:53:02,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:53:04,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 20:53:06,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 20:53:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:53:08,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:09,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:09,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 20:53:11,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:53:12,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:53:13,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 20:53:13,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 20:53:17,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 20:53:19,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 20:53:21,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=139106.66666666666, ans=0.025 2023-09-28 20:53:21,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=139106.66666666666, ans=0.5 2023-09-28 20:53:22,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=139106.66666666666, ans=0.0 2023-09-28 20:53:23,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:53:23,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:53:23,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:53:23,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 20:53:25,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:53:25,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 20:53:28,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:29,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:53:32,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:53:32,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=139106.66666666666, ans=0.125 2023-09-28 20:53:35,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 20:53:35,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:53:35,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 20:53:36,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 20:53:42,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:44,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:53:46,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 20:53:46,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:53:47,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:53:49,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:53:52,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:53:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 20:53:52,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:53:54,530 INFO [train.py:1039] (3/4) Epoch 4, batch 4950, loss[loss=0.2298, simple_loss=0.3054, pruned_loss=0.07711, over 24497.00 frames. ], tot_loss[loss=0.2589, simple_loss=0.3145, pruned_loss=0.1017, over 4722501.80 frames. ], batch size: 66, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:53:54,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 20:53:56,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 20:53:59,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:53:59,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 20:54:02,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 20:54:03,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 20:54:03,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:54:06,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 20:54:06,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:06,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:54:07,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 20:54:07,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:09,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:11,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:54:12,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:54:14,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:54:15,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:15,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:54:18,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 20:54:21,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=139306.66666666666, ans=0.1 2023-09-28 20:54:24,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.28 vs. limit=15.0 2023-09-28 20:54:25,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:26,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:54:28,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:28,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:32,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:54:33,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 20:54:35,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 20:54:35,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:39,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:54:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:54:42,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:54:42,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:54:43,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 20:54:44,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=139440.0, ans=0.0 2023-09-28 20:54:45,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:54:46,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 20:54:48,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:54:48,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=139440.0, ans=0.125 2023-09-28 20:54:50,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:54:51,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:54:51,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 20:54:53,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:54:53,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 20:54:57,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:54:58,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:54:58,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:55:00,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:00,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 20:55:00,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 20:55:03,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:55:03,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 20:55:05,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:55:05,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 20:55:10,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:14,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=139506.66666666666, ans=0.2 2023-09-28 20:55:15,180 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.874e+02 2.343e+02 2.664e+02 3.186e+02 5.232e+02, threshold=5.328e+02, percent-clipped=1.0 2023-09-28 20:55:16,775 INFO [train.py:1039] (3/4) Epoch 4, batch 5000, loss[loss=0.268, simple_loss=0.2942, pruned_loss=0.1208, over 19295.00 frames. ], tot_loss[loss=0.2575, simple_loss=0.3127, pruned_loss=0.1012, over 4703655.76 frames. ], batch size: 388, lr: 2.28e-02, grad_scale: 32.0 2023-09-28 20:55:16,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 20:55:16,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 20:55:21,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:55:21,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:23,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 20:55:24,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 20:55:26,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:55:27,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.08 vs. limit=15.0 2023-09-28 20:55:28,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 20:55:28,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 20:55:28,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 20:55:30,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 20:55:30,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:31,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:55:33,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 20:55:33,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:33,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:55:33,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.66 vs. limit=22.5 2023-09-28 20:55:35,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 20:55:36,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 20:55:36,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:55:36,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 20:55:36,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:55:39,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:39,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:55:39,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 20:55:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 20:55:40,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 20:55:40,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:55:42,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:42,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 20:55:43,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:55:45,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:55:45,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:55:48,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 20:55:49,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 20:55:51,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:55:54,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:55:57,609 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 20:55:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 20:56:02,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:56:02,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:04,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 20:56:05,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:56:06,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:06,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:07,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 20:56:09,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 20:56:13,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:20,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 20:56:24,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:27,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.13 vs. limit=15.0 2023-09-28 20:56:32,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:56:34,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:34,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 20:56:34,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:36,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 20:56:36,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 20:56:37,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:38,970 INFO [train.py:1039] (3/4) Epoch 4, batch 5050, loss[loss=0.2192, simple_loss=0.2928, pruned_loss=0.07283, over 24484.00 frames. ], tot_loss[loss=0.2589, simple_loss=0.3139, pruned_loss=0.1019, over 4698518.97 frames. ], batch size: 66, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:56:42,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:56:42,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 20:56:45,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 20:56:48,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:56:49,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:56:51,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 20:56:53,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:56:53,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:56:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 20:56:56,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 20:56:57,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 20:57:05,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 20:57:05,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 20:57:07,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:07,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 20:57:07,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:10,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:10,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:57:10,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:57:10,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 20:57:12,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 20:57:14,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:19,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:22,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:57:22,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 20:57:24,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:28,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 20:57:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:57:30,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 20:57:30,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:30,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:57:31,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:57:34,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:57:34,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:57:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:57:35,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=140106.66666666666, ans=0.0 2023-09-28 20:57:36,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 20:57:36,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 20:57:38,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=140106.66666666666, ans=0.125 2023-09-28 20:57:39,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 20:57:40,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.57 vs. limit=15.0 2023-09-28 20:57:43,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:57:43,865 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 20:57:43,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 20:57:45,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:57:45,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:45,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 20:57:49,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:57:49,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 20:57:49,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:57:54,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:57:54,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 20:57:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 20:57:57,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.89 vs. limit=15.0 2023-09-28 20:57:59,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:00,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:00,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 20:58:02,028 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.333e+02 2.668e+02 3.236e+02 5.838e+02, threshold=5.336e+02, percent-clipped=1.0 2023-09-28 20:58:03,594 INFO [train.py:1039] (3/4) Epoch 4, batch 5100, loss[loss=0.298, simple_loss=0.3338, pruned_loss=0.1311, over 22819.00 frames. ], tot_loss[loss=0.2597, simple_loss=0.3149, pruned_loss=0.1023, over 4700227.42 frames. ], batch size: 322, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:58:05,188 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 20:58:08,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 20:58:11,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 20:58:11,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 20:58:12,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:13,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=140240.0, ans=0.125 2023-09-28 20:58:14,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 20:58:18,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.96 vs. limit=6.0 2023-09-28 20:58:18,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 20:58:18,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 20:58:20,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 20:58:25,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 20:58:25,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 20:58:28,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:58:34,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 20:58:34,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:36,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:58:36,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 20:58:37,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:39,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 20:58:41,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 20:58:41,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:58:42,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 20:58:43,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 20:58:46,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 20:58:55,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:58:58,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 20:58:58,204 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 20:58:58,231 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 20:59:01,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 20:59:01,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 20:59:06,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 20:59:10,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 20:59:12,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 20:59:14,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 20:59:16,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 20:59:17,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 20:59:19,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 20:59:24,972 INFO [train.py:1039] (3/4) Epoch 4, batch 5150, loss[loss=0.2307, simple_loss=0.2863, pruned_loss=0.08759, over 23323.00 frames. ], tot_loss[loss=0.2608, simple_loss=0.3164, pruned_loss=0.1026, over 4711484.37 frames. ], batch size: 119, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 20:59:25,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 20:59:25,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 20:59:25,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 20:59:26,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 20:59:26,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 20:59:28,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 20:59:28,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 20:59:28,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 20:59:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 20:59:29,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 20:59:29,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 20:59:31,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:32,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 20:59:34,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:36,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 20:59:41,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 20:59:41,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 20:59:43,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 20:59:43,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 20:59:44,345 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 20:59:45,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 20:59:45,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 20:59:46,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 20:59:47,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 20:59:47,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 20:59:47,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 20:59:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 20:59:50,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 20:59:52,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 20:59:54,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 20:59:55,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:00:01,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:00:04,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 21:00:07,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:00:14,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:18,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:20,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=140773.33333333334, ans=0.125 2023-09-28 21:00:21,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=140773.33333333334, ans=10.0 2023-09-28 21:00:23,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:23,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:27,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 21:00:27,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.48 vs. limit=15.0 2023-09-28 21:00:31,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:00:31,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:00:31,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=140840.0, ans=0.125 2023-09-28 21:00:33,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:00:36,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:00:37,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:00:37,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 21:00:42,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:00:42,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:00:45,324 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.271e+02 2.546e+02 2.924e+02 4.595e+02, threshold=5.092e+02, percent-clipped=0.0 2023-09-28 21:00:45,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:00:45,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:00:47,426 INFO [train.py:1039] (3/4) Epoch 4, batch 5200, loss[loss=0.269, simple_loss=0.3155, pruned_loss=0.1113, over 23368.00 frames. ], tot_loss[loss=0.2608, simple_loss=0.3165, pruned_loss=0.1025, over 4719540.51 frames. ], batch size: 119, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:00:47,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:00:47,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:00:49,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:00:49,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:00:51,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:00:54,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:00:57,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:01,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=140906.66666666666, ans=0.2 2023-09-28 21:01:02,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 21:01:02,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:01:02,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:04,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=140973.33333333334, ans=0.0 2023-09-28 21:01:05,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:07,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:01:07,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:07,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=140973.33333333334, ans=0.125 2023-09-28 21:01:08,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 21:01:12,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:01:12,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:15,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 21:01:16,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:01:17,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=140973.33333333334, ans=0.2 2023-09-28 21:01:18,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:01:18,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=141040.0, ans=0.125 2023-09-28 21:01:19,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 21:01:21,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 21:01:23,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 21:01:23,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:23,513 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 21:01:24,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:01:25,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:25,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:01:27,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 21:01:28,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:01:30,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:33,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 21:01:33,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 21:01:35,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 21:01:35,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.74 vs. limit=15.0 2023-09-28 21:01:38,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 21:01:38,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:01:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:01:43,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:01:45,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 21:01:45,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:01:46,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:01:46,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:01:46,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:01:50,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:01:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:01:55,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:01:55,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-09-28 21:01:57,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:01:57,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:04,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 21:02:05,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:02:05,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:02:08,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:09,662 INFO [train.py:1039] (3/4) Epoch 4, batch 5250, loss[loss=0.2404, simple_loss=0.282, pruned_loss=0.09941, over 23445.00 frames. ], tot_loss[loss=0.2596, simple_loss=0.3151, pruned_loss=0.1021, over 4706389.19 frames. ], batch size: 285, lr: 2.27e-02, grad_scale: 32.0 2023-09-28 21:02:09,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:02:09,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:02:11,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:02:16,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:17,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:02:18,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:02:23,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:02:25,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:02:28,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:02:30,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:02:33,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 21:02:33,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:02:34,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:02:49,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-09-28 21:03:01,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=141440.0, ans=0.125 2023-09-28 21:03:08,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=141506.66666666666, ans=0.07 2023-09-28 21:03:21,503 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.854e+02 2.354e+02 2.746e+02 3.335e+02 6.410e+02, threshold=5.493e+02, percent-clipped=2.0 2023-09-28 21:03:22,901 INFO [train.py:1039] (3/4) Epoch 4, batch 5300, loss[loss=0.2371, simple_loss=0.303, pruned_loss=0.08559, over 24472.00 frames. ], tot_loss[loss=0.2587, simple_loss=0.3143, pruned_loss=0.1016, over 4711332.24 frames. ], batch size: 63, lr: 2.26e-02, grad_scale: 32.0 2023-09-28 21:03:27,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=141573.33333333334, ans=0.125 2023-09-28 21:03:38,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:03:38,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 21:03:38,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 21:03:38,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:39,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:39,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:39,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:39,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:03:39,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:39,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:03:40,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:03:40,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 21:03:40,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 21:03:40,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 21:03:40,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:03:40,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 21:03:40,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 21:03:40,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:41,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:41,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:41,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:41,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:03:42,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:42,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:03:42,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:42,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:03:42,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:03:42,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:03:42,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:42,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:03:43,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 21:03:43,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:03:44,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:03:44,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 21:03:44,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 21:03:44,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:03:44,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:03:44,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 21:03:45,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 21:03:45,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:45,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:03:46,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:03:46,657 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 21:03:46,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 21:03:46,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:03:46,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:03:47,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 21:03:47,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 21:03:47,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 21:03:47,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:03:55,471 INFO [train.py:1039] (3/4) Epoch 5, batch 0, loss[loss=0.2629, simple_loss=0.3201, pruned_loss=0.1028, over 24685.00 frames. ], tot_loss[loss=0.2629, simple_loss=0.3201, pruned_loss=0.1028, over 24685.00 frames. ], batch size: 65, lr: 2.11e-02, grad_scale: 32.0 2023-09-28 21:03:55,472 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 21:04:10,259 INFO [train.py:1071] (3/4) Epoch 5, validation: loss=0.3547, simple_loss=0.3281, pruned_loss=0.1907, over 1125622.00 frames. 2023-09-28 21:04:10,260 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 21:04:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 21:04:12,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:04:14,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:04:19,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:19,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:04:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:21,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.11 vs. limit=15.0 2023-09-28 21:04:21,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 21:04:23,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 21:04:25,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:26,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:04:31,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:32,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:04:32,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:32,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 21:04:35,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:04:43,821 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.18 vs. limit=15.0 2023-09-28 21:04:46,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:04:46,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:04:48,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 21:04:52,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:04:52,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:04:53,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:04:57,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.37 vs. limit=15.0 2023-09-28 21:04:58,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:04:58,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=141853.33333333334, ans=0.05 2023-09-28 21:04:59,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=141853.33333333334, ans=0.1 2023-09-28 21:05:02,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:07,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 21:05:10,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 21:05:10,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:10,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:11,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:05:11,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:05:13,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 21:05:17,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:19,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:05:23,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:05:27,438 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 21:05:28,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:05:31,980 INFO [train.py:1039] (3/4) Epoch 5, batch 50, loss[loss=0.2329, simple_loss=0.2984, pruned_loss=0.08371, over 24291.00 frames. ], tot_loss[loss=0.2596, simple_loss=0.3172, pruned_loss=0.101, over 1065985.85 frames. ], batch size: 61, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:05:32,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:35,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:05:35,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=141986.66666666666, ans=0.0 2023-09-28 21:05:36,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 21:05:36,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:05:36,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:05:39,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:42,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:05:45,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:05:48,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 21:05:48,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:05:55,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:05:57,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 21:05:59,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 21:06:02,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:06:02,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:02,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:02,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:05,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:06:05,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:06:05,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:06:12,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:13,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:14,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:06:15,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 21:06:18,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:06:19,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:06:19,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 21:06:20,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:22,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 21:06:26,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=142186.66666666666, ans=0.05 2023-09-28 21:06:29,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:06:29,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:06:31,353 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.771e+02 2.197e+02 2.413e+02 2.834e+02 4.473e+02, threshold=4.826e+02, percent-clipped=0.0 2023-09-28 21:06:31,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:32,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=142186.66666666666, ans=0.1 2023-09-28 21:06:33,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:06:33,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:36,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 21:06:36,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 21:06:38,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:06:38,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=142253.33333333334, ans=0.025 2023-09-28 21:06:39,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:06:41,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:06:42,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:06:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 21:06:44,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 21:06:44,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 21:06:47,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:47,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:06:47,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 21:06:48,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 21:06:50,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:06:50,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:06:52,030 INFO [train.py:1039] (3/4) Epoch 5, batch 100, loss[loss=0.2389, simple_loss=0.3092, pruned_loss=0.08435, over 24448.00 frames. ], tot_loss[loss=0.2613, simple_loss=0.3186, pruned_loss=0.102, over 1872865.19 frames. ], batch size: 63, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:06:52,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:06:53,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:06:55,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:06:57,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:06:57,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=142320.0, ans=0.07 2023-09-28 21:07:01,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:05,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 21:07:05,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:07:12,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:07:12,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:12,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:07:12,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:07:12,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:07:15,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 21:07:17,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:07:17,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:17,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:17,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:07:22,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 21:07:22,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:22,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=142386.66666666666, ans=0.0 2023-09-28 21:07:23,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:07:23,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:07:26,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:07:27,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=142453.33333333334, ans=0.025 2023-09-28 21:07:30,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 21:07:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 21:07:31,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:07:31,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:07:36,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:07:39,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:07:39,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:47,475 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 21:07:49,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:07:49,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.41 vs. limit=22.5 2023-09-28 21:07:52,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:07:52,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:07:53,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:07:58,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:01,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:03,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:08:06,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:06,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:09,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:09,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:08:09,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:09,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 21:08:11,209 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 21:08:11,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:11,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:08:12,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:12,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:12,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 21:08:12,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:08:14,817 INFO [train.py:1039] (3/4) Epoch 5, batch 150, loss[loss=0.2306, simple_loss=0.294, pruned_loss=0.0836, over 24590.00 frames. ], tot_loss[loss=0.2593, simple_loss=0.317, pruned_loss=0.1008, over 2507018.44 frames. ], batch size: 60, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:08:14,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:08:14,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:16,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:08:18,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:08:19,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:08:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:08:23,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:23,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:26,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:08:28,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:29,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:08:31,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:34,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=142720.0, ans=0.0 2023-09-28 21:08:35,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 21:08:35,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 21:08:35,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 21:08:38,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:08:38,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:08:40,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:08:42,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:08:42,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:08:43,961 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 21:08:47,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:08:47,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=142786.66666666666, ans=0.0 2023-09-28 21:08:52,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:08:56,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:08:57,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.10 vs. limit=15.0 2023-09-28 21:08:57,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 21:09:00,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:09:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:09:02,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:05,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:09:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:09:08,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:09:08,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:08,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 21:09:14,557 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.253e+02 2.610e+02 3.187e+02 7.657e+02, threshold=5.219e+02, percent-clipped=8.0 2023-09-28 21:09:14,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:14,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:14,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:09:14,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:09:18,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:18,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=142920.0, ans=0.125 2023-09-28 21:09:20,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 21:09:22,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:09:23,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:09:25,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:27,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:09:27,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 21:09:29,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:09:29,082 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 21:09:32,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:36,803 INFO [train.py:1039] (3/4) Epoch 5, batch 200, loss[loss=0.2764, simple_loss=0.3313, pruned_loss=0.1107, over 23373.00 frames. ], tot_loss[loss=0.2606, simple_loss=0.3178, pruned_loss=0.1017, over 3001526.47 frames. ], batch size: 93, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:09:36,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:09:38,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:09:40,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 21:09:41,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:09:41,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:43,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 21:09:43,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=142986.66666666666, ans=0.125 2023-09-28 21:09:46,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:09:48,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:48,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:09:53,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:09:53,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:09:53,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:09:57,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=143053.33333333334, ans=0.1 2023-09-28 21:09:59,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=143053.33333333334, ans=0.0 2023-09-28 21:10:07,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=143053.33333333334, ans=0.125 2023-09-28 21:10:09,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=143120.0, ans=0.125 2023-09-28 21:10:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:10:13,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:10:13,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=143120.0, ans=0.125 2023-09-28 21:10:15,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:10:16,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:10:17,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:10:17,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:10:20,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:22,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:10:22,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:22,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:24,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 21:10:25,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 21:10:25,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:31,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:10:34,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:10:37,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=143186.66666666666, ans=0.125 2023-09-28 21:10:43,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:43,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:10:52,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:10:52,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=143253.33333333334, ans=0.125 2023-09-28 21:10:54,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 21:10:55,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:10:55,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:10:55,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:10:57,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:10:59,100 INFO [train.py:1039] (3/4) Epoch 5, batch 250, loss[loss=0.2639, simple_loss=0.3232, pruned_loss=0.1023, over 24662.00 frames. ], tot_loss[loss=0.2568, simple_loss=0.3153, pruned_loss=0.09909, over 3391259.44 frames. ], batch size: 73, lr: 2.10e-02, grad_scale: 32.0 2023-09-28 21:10:59,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 21:11:00,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:02,069 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 21:11:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:07,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:11:07,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:08,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:11:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:11:10,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:11:11,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:11:16,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:11:29,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:31,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:11:32,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:11:38,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:11:39,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:11:39,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:11:39,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:41,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:11:41,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:11:41,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:11:44,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:11:47,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 21:11:47,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:11:49,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:11:49,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:11:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:11:51,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:11:53,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:11:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:11:54,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=143520.0, ans=0.0 2023-09-28 21:11:56,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:11:57,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:11:57,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:11:59,959 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.236e+02 2.772e+02 3.274e+02 8.100e+02, threshold=5.544e+02, percent-clipped=4.0 2023-09-28 21:12:01,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:12:04,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:07,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.71 vs. limit=12.0 2023-09-28 21:12:08,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:12:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:16,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:12:19,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 21:12:21,376 INFO [train.py:1039] (3/4) Epoch 5, batch 300, loss[loss=0.2377, simple_loss=0.2986, pruned_loss=0.08838, over 24569.00 frames. ], tot_loss[loss=0.2548, simple_loss=0.3135, pruned_loss=0.09799, over 3694397.32 frames. ], batch size: 60, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:12:21,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:12:21,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:12:24,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 21:12:24,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:12:26,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:12:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 21:12:30,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:12:33,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:12:36,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:12:38,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 21:12:39,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:12:41,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:12:41,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 21:12:41,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:45,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:12:52,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:12:52,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 21:12:55,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 21:12:55,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:58,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:12:59,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:12:59,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 21:12:59,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:13:02,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:13:03,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:13:05,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:10,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:13:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 21:13:11,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:13:14,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:14,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 21:13:16,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:20,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:13:23,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:13:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 21:13:28,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:28,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:13:31,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:31,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:13:33,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 21:13:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:13:33,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:13:34,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 21:13:36,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:13:38,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:38,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:40,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:44,697 INFO [train.py:1039] (3/4) Epoch 5, batch 350, loss[loss=0.2382, simple_loss=0.2629, pruned_loss=0.1067, over 19025.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3108, pruned_loss=0.09737, over 3913030.57 frames. ], batch size: 389, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:13:44,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:13:44,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 21:13:50,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:13:53,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=143986.66666666666, ans=0.125 2023-09-28 21:13:56,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:13:59,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:13:59,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:04,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 21:14:05,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:05,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 21:14:08,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:08,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 21:14:10,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:14,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 21:14:15,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:14:16,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=144120.0, ans=0.04949747468305833 2023-09-28 21:14:17,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:14:18,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:14:20,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:20,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:21,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:21,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:21,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:14:23,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:14:23,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:30,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:14:30,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:14:32,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:14:32,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:38,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 21:14:38,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:14:44,464 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.897e+02 2.143e+02 2.367e+02 2.704e+02 4.411e+02, threshold=4.734e+02, percent-clipped=0.0 2023-09-28 21:14:44,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:14:44,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:14:44,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:14:48,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 21:14:48,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:14:49,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=144253.33333333334, ans=0.125 2023-09-28 21:14:50,764 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 21:14:52,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 21:14:52,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:14:55,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:14:55,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 21:14:57,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:02,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:15:04,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:04,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:04,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:05,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:15:07,256 INFO [train.py:1039] (3/4) Epoch 5, batch 400, loss[loss=0.2594, simple_loss=0.3069, pruned_loss=0.106, over 23712.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3098, pruned_loss=0.09682, over 4093483.70 frames. ], batch size: 164, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:15:08,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:15:11,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.72 vs. limit=22.5 2023-09-28 21:15:11,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:15:13,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 21:15:13,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:14,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:14,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:15:16,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:18,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:20,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:23,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 21:15:27,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 21:15:27,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:27,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 21:15:28,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:15:33,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:33,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 21:15:33,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:15:33,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:15:33,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:15:35,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:15:37,173 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 21:15:38,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 21:15:41,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.83 vs. limit=12.0 2023-09-28 21:15:43,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:15:43,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:15:45,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 21:15:46,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 21:15:49,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:15:52,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:15:53,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=144453.33333333334, ans=0.125 2023-09-28 21:16:00,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 21:16:04,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:16:06,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 21:16:08,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:16:09,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:16:09,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 21:16:13,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:16:16,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:16:18,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:16:20,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 21:16:24,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:16:24,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 21:16:25,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:16:25,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:16:28,817 INFO [train.py:1039] (3/4) Epoch 5, batch 450, loss[loss=0.222, simple_loss=0.2899, pruned_loss=0.07711, over 24603.00 frames. ], tot_loss[loss=0.2525, simple_loss=0.3103, pruned_loss=0.09732, over 4238530.46 frames. ], batch size: 60, lr: 2.09e-02, grad_scale: 32.0 2023-09-28 21:16:28,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 21:16:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:16:32,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:16:34,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:16:34,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=144653.33333333334, ans=0.0 2023-09-28 21:16:36,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 21:16:36,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:16:37,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:16:39,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:16:39,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 21:16:39,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:16:40,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:16:44,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:16:46,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=144720.0, ans=0.0 2023-09-28 21:16:53,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:16:54,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:16:55,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 21:16:56,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 21:16:56,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=144720.0, ans=0.2 2023-09-28 21:16:59,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:17:02,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:05,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:09,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:11,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:17:14,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 21:17:14,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=144786.66666666666, ans=0.0 2023-09-28 21:17:15,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 21:17:17,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 21:17:17,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:17:19,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:17:19,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:17:21,639 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 21:17:22,935 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 21:17:22,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:17:25,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:17:25,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:17:30,463 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.848e+02 2.241e+02 2.627e+02 3.194e+02 6.560e+02, threshold=5.254e+02, percent-clipped=4.0 2023-09-28 21:17:30,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:17:30,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:17:32,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:17:32,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 21:17:35,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:35,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=144920.0, ans=0.1 2023-09-28 21:17:36,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:17:36,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:17:37,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=144920.0, ans=0.2 2023-09-28 21:17:38,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 21:17:43,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:17:43,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 21:17:45,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 21:17:47,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:17:51,518 INFO [train.py:1039] (3/4) Epoch 5, batch 500, loss[loss=0.269, simple_loss=0.3389, pruned_loss=0.09958, over 24334.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3117, pruned_loss=0.09766, over 4357032.78 frames. ], batch size: 74, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:17:52,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=144986.66666666666, ans=0.125 2023-09-28 21:17:53,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:17:54,070 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.29 vs. limit=15.0 2023-09-28 21:17:55,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:17:57,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:17:57,057 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 21:18:01,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:01,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:18:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 21:18:02,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=144986.66666666666, ans=0.2 2023-09-28 21:18:03,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 21:18:03,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:06,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:18:11,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 21:18:11,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:18:12,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:18:13,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=145053.33333333334, ans=22.5 2023-09-28 21:18:14,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:18:14,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:25,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:25,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:18:26,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:18:26,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:27,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 21:18:27,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:18:32,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:18:33,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:18:33,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:18:33,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:18:33,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 21:18:35,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=145120.0, ans=0.2 2023-09-28 21:18:38,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 21:18:39,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:18:41,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:18:43,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:18:46,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 21:18:48,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:18:51,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:18:55,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:18:58,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:19:04,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=145253.33333333334, ans=0.125 2023-09-28 21:19:05,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:07,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 21:19:09,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:09,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:19:11,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=145253.33333333334, ans=0.0 2023-09-28 21:19:12,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 21:19:12,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:19:14,998 INFO [train.py:1039] (3/4) Epoch 5, batch 550, loss[loss=0.2675, simple_loss=0.3087, pruned_loss=0.1131, over 23815.00 frames. ], tot_loss[loss=0.2556, simple_loss=0.3134, pruned_loss=0.09889, over 4442878.06 frames. ], batch size: 164, lr: 2.08e-02, grad_scale: 32.0 2023-09-28 21:19:15,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:20,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 21:19:21,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 21:19:23,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:23,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 21:19:23,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:19:23,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:19:24,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:24,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:19:26,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:19:28,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:19:30,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 21:19:30,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:19:35,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:19:35,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:35,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=145386.66666666666, ans=0.1 2023-09-28 21:19:38,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:19:40,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:19:44,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 21:19:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 21:19:48,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:19:49,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=145453.33333333334, ans=0.125 2023-09-28 21:19:52,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:19:52,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:19:54,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:20:01,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:01,042 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 21:20:01,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:20:03,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:20:06,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:20:06,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:20:06,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:20:08,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:09,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 21:20:11,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 21:20:11,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:11,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:20:13,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:20:13,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:20:16,497 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.773e+02 2.228e+02 2.515e+02 3.038e+02 5.618e+02, threshold=5.030e+02, percent-clipped=1.0 2023-09-28 21:20:16,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:20:16,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:20:19,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:20:21,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:21,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 21:20:22,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:20:24,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:24,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:20:26,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:20:26,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:20:27,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 21:20:33,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 21:20:38,085 INFO [train.py:1039] (3/4) Epoch 5, batch 600, loss[loss=0.2486, simple_loss=0.2995, pruned_loss=0.09886, over 23676.00 frames. ], tot_loss[loss=0.2563, simple_loss=0.3138, pruned_loss=0.09946, over 4501907.80 frames. ], batch size: 135, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:20:38,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 21:20:39,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:20:39,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:20:41,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:20:48,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:20:50,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:20:52,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 21:20:55,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:20:55,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:20:58,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:01,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 21:21:02,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:21:07,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 21:21:11,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:21:11,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:13,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:21:15,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=145786.66666666666, ans=0.0 2023-09-28 21:21:19,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:21:19,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:21:19,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:19,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=145786.66666666666, ans=0.125 2023-09-28 21:21:26,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:21:27,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=145853.33333333334, ans=0.125 2023-09-28 21:21:31,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:21:31,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:21:31,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:21:34,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=145853.33333333334, ans=0.05 2023-09-28 21:21:39,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 21:21:44,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:21:44,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:21:44,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=145920.0, ans=0.2 2023-09-28 21:21:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 21:21:49,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:21:51,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 21:21:51,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:21:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:21:58,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:21:59,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:22:00,847 INFO [train.py:1039] (3/4) Epoch 5, batch 650, loss[loss=0.2477, simple_loss=0.3134, pruned_loss=0.091, over 23653.00 frames. ], tot_loss[loss=0.2546, simple_loss=0.3118, pruned_loss=0.09869, over 4534287.14 frames. ], batch size: 85, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:22:01,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:02,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:22:05,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:07,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 21:22:08,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:22:10,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=145986.66666666666, ans=0.125 2023-09-28 21:22:13,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:22:13,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:17,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:19,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.28 vs. limit=10.0 2023-09-28 21:22:20,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 21:22:22,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=146053.33333333334, ans=0.0 2023-09-28 21:22:22,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=146053.33333333334, ans=0.125 2023-09-28 21:22:24,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:22:25,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:22:30,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:22:30,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:22:33,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:33,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:34,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:22:35,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=146120.0, ans=0.0 2023-09-28 21:22:36,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:38,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:22:39,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=146120.0, ans=0.2 2023-09-28 21:22:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:22:41,017 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 21:22:41,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:22:41,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:44,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:45,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:46,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:22:46,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:22:47,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 21:22:50,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:22:51,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:22:53,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:22:53,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:22:53,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:22:55,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 21:22:56,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 21:22:57,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:22:58,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:22:58,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:22:58,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:22:58,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:23:00,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=146186.66666666666, ans=0.125 2023-09-28 21:23:01,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:23:04,874 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.282e+02 2.474e+02 2.887e+02 4.172e+02, threshold=4.947e+02, percent-clipped=0.0 2023-09-28 21:23:06,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:06,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:08,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:23:11,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:11,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:23:12,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.39 vs. limit=22.5 2023-09-28 21:23:12,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:23:14,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=146253.33333333334, ans=0.0 2023-09-28 21:23:19,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:23:19,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:19,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:19,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:23:24,859 INFO [train.py:1039] (3/4) Epoch 5, batch 700, loss[loss=0.2776, simple_loss=0.3215, pruned_loss=0.1169, over 23454.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3097, pruned_loss=0.09795, over 4568129.44 frames. ], batch size: 285, lr: 2.08e-02, grad_scale: 16.0 2023-09-28 21:23:27,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 21:23:27,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 21:23:27,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=146320.0, ans=0.1 2023-09-28 21:23:30,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 21:23:31,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:33,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:23:33,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 21:23:39,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:23:42,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:23:44,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:46,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:23:46,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:23:48,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=15.0 2023-09-28 21:23:49,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:23:52,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:23:52,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:23:55,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 21:24:00,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 21:24:05,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:24:05,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:24:07,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:24:10,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:24:12,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 21:24:15,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:15,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:24:15,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 21:24:21,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:24:23,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:24:25,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:24:27,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=17.05 vs. limit=15.0 2023-09-28 21:24:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:24:31,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 21:24:31,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=146586.66666666666, ans=0.04949747468305833 2023-09-28 21:24:37,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 21:24:38,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 21:24:38,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:41,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:24:42,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:24:44,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:44,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 21:24:47,202 INFO [train.py:1039] (3/4) Epoch 5, batch 750, loss[loss=0.2487, simple_loss=0.2993, pruned_loss=0.09907, over 23229.00 frames. ], tot_loss[loss=0.2525, simple_loss=0.3095, pruned_loss=0.09777, over 4602352.44 frames. ], batch size: 119, lr: 2.07e-02, grad_scale: 16.0 2023-09-28 21:24:48,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 21:24:48,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 21:24:49,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 21:24:50,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 21:24:50,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=146653.33333333334, ans=0.2 2023-09-28 21:24:52,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 21:24:52,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:24:52,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=146653.33333333334, ans=0.0 2023-09-28 21:24:53,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 21:24:55,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:24:55,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:24:58,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:00,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:02,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:25:02,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:04,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:25:04,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=146720.0, ans=0.125 2023-09-28 21:25:05,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:25:08,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:25:12,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:14,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:14,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 21:25:14,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:25:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:25:19,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:25:21,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 21:25:21,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:25:23,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 21:25:24,866 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 21:25:24,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 21:25:25,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:25:25,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 21:25:28,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:25:34,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:25:34,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:34,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:25:37,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:25:39,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:25:39,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 21:25:40,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:25:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 21:25:43,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:25:47,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:25:47,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 21:25:47,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:25:51,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.300e+02 2.781e+02 3.196e+02 5.681e+02, threshold=5.563e+02, percent-clipped=1.0 2023-09-28 21:25:52,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:25:55,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:25:55,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:25:58,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:25:59,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=146920.0, ans=0.0 2023-09-28 21:26:01,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 21:26:01,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:02,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:07,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:10,188 INFO [train.py:1039] (3/4) Epoch 5, batch 800, loss[loss=0.2547, simple_loss=0.3195, pruned_loss=0.09491, over 24362.00 frames. ], tot_loss[loss=0.2523, simple_loss=0.3098, pruned_loss=0.0974, over 4619466.50 frames. ], batch size: 77, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:26:10,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:12,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:26:18,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:26:18,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:22,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:26:22,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:23,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:23,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:24,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:29,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:31,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:26:33,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=147053.33333333334, ans=0.0 2023-09-28 21:26:34,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 21:26:35,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:35,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=147053.33333333334, ans=0.025 2023-09-28 21:26:37,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:26:37,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:26:37,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:38,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 21:26:38,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:38,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 21:26:42,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:26:43,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:26:45,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:26:47,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:26:50,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:26:50,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=147120.0, ans=0.0 2023-09-28 21:26:56,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:26:56,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:26:56,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 21:26:58,627 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 21:26:58,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 21:26:58,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:27:00,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:01,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:01,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:07,099 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 21:27:07,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 21:27:08,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:27:10,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:27:13,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:27:18,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:18,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 21:27:20,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:27:22,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 21:27:29,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.77 vs. limit=6.0 2023-09-28 21:27:31,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:32,892 INFO [train.py:1039] (3/4) Epoch 5, batch 850, loss[loss=0.2166, simple_loss=0.2888, pruned_loss=0.07222, over 24502.00 frames. ], tot_loss[loss=0.254, simple_loss=0.3114, pruned_loss=0.0983, over 4637786.32 frames. ], batch size: 66, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:27:33,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:27:34,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 21:27:34,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:27:36,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:38,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 21:27:38,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:40,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:27:42,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:43,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:27:45,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:27:46,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 21:27:46,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 21:27:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 21:27:48,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:27:48,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:27:51,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:27:51,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:27:52,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:27:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:27:58,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:27:58,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 21:28:03,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 21:28:06,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:28:07,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 21:28:11,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 21:28:12,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 21:28:14,774 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 21:28:14,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:14,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:28:14,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:28:15,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.48 vs. limit=15.0 2023-09-28 21:28:18,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:19,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 21:28:23,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:28:23,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=147520.0, ans=0.1 2023-09-28 21:28:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:24,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:28:24,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:28:26,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:28:28,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 21:28:28,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 21:28:34,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:28:34,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:35,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:28:36,311 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.721e+02 2.245e+02 2.598e+02 3.142e+02 5.686e+02, threshold=5.195e+02, percent-clipped=1.0 2023-09-28 21:28:36,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:36,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:28:38,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:28:40,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:28:41,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:28:42,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:28:43,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:28:51,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:28:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:28:53,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 21:28:54,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:28:54,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:28:56,163 INFO [train.py:1039] (3/4) Epoch 5, batch 900, loss[loss=0.2387, simple_loss=0.3014, pruned_loss=0.088, over 24643.00 frames. ], tot_loss[loss=0.255, simple_loss=0.3117, pruned_loss=0.09915, over 4648175.49 frames. ], batch size: 65, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:28:57,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 21:29:05,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:29:07,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:07,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 21:29:10,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:29:12,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 21:29:12,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 21:29:14,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:29:14,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:14,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:29:15,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:29:19,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=147720.0, ans=0.0 2023-09-28 21:29:26,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:29:26,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:29:28,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:29:31,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:29:37,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 21:29:38,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:29:42,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:29:42,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:29:42,274 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 21:29:43,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 21:29:44,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=147853.33333333334, ans=0.0 2023-09-28 21:29:52,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:29:52,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:29:52,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:29:54,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=147853.33333333334, ans=10.0 2023-09-28 21:30:00,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:00,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:01,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 21:30:01,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:30:03,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=147920.0, ans=0.05 2023-09-28 21:30:05,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 21:30:08,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:30:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:10,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:30:10,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:15,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 21:30:15,077 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 21:30:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:30:16,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 21:30:18,137 INFO [train.py:1039] (3/4) Epoch 5, batch 950, loss[loss=0.2751, simple_loss=0.3295, pruned_loss=0.1104, over 23430.00 frames. ], tot_loss[loss=0.256, simple_loss=0.3127, pruned_loss=0.0996, over 4667561.87 frames. ], batch size: 105, lr: 2.07e-02, grad_scale: 32.0 2023-09-28 21:30:19,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:25,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 21:30:31,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:31,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=147986.66666666666, ans=0.125 2023-09-28 21:30:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:33,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:35,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:30:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 21:30:38,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:30:40,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:30:40,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:30:40,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:30:42,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 21:30:44,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:30:45,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:47,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 21:30:47,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:50,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:30:50,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:30:51,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:30:53,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 21:30:56,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 21:30:57,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:31:00,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:31:04,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:04,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:31:07,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 21:31:08,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:31:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:31:10,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:11,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:11,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:31:17,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 21:31:18,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=148186.66666666666, ans=0.125 2023-09-28 21:31:19,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:31:21,889 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.107e+02 2.418e+02 2.816e+02 4.980e+02, threshold=4.836e+02, percent-clipped=0.0 2023-09-28 21:31:22,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:31:22,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:22,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 21:31:22,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:22,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:31:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 21:31:27,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=148253.33333333334, ans=0.125 2023-09-28 21:31:28,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:31:29,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:31:32,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=148253.33333333334, ans=0.125 2023-09-28 21:31:33,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=148253.33333333334, ans=0.125 2023-09-28 21:31:35,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:31:36,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 21:31:36,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 21:31:38,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=148253.33333333334, ans=0.1 2023-09-28 21:31:40,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=148320.0, ans=0.0 2023-09-28 21:31:41,334 INFO [train.py:1039] (3/4) Epoch 5, batch 1000, loss[loss=0.2074, simple_loss=0.2745, pruned_loss=0.07013, over 24340.00 frames. ], tot_loss[loss=0.2547, simple_loss=0.311, pruned_loss=0.09925, over 4674912.64 frames. ], batch size: 56, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:31:41,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:31:45,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 21:31:45,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:31:50,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:31:52,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 21:31:52,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 21:32:00,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:00,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:32:02,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:04,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 21:32:04,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=148386.66666666666, ans=0.05 2023-09-28 21:32:08,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 21:32:11,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 21:32:11,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:12,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 21:32:14,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 21:32:14,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 21:32:14,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:15,487 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.95 vs. limit=15.0 2023-09-28 21:32:16,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:23,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:23,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=148453.33333333334, ans=0.5 2023-09-28 21:32:24,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:32:26,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:26,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:26,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 21:32:28,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:32:28,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:32:29,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:32:29,771 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 21:32:34,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 21:32:34,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 21:32:34,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=148520.0, ans=0.0 2023-09-28 21:32:37,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 21:32:39,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:32:39,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=148520.0, ans=0.125 2023-09-28 21:32:47,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:47,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:32:47,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:32:49,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:32:51,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-28 21:32:51,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=148586.66666666666, ans=0.125 2023-09-28 21:32:53,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:32:53,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-28 21:32:54,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-28 21:32:56,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:32:56,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:32:59,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:33:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:33:03,583 INFO [train.py:1039] (3/4) Epoch 5, batch 1050, loss[loss=0.2682, simple_loss=0.3269, pruned_loss=0.1048, over 23333.00 frames. ], tot_loss[loss=0.2535, simple_loss=0.3094, pruned_loss=0.09884, over 4673343.20 frames. ], batch size: 93, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:33:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:07,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:33:09,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:33:10,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:33:12,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:15,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:16,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:33:18,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:33:21,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:33:21,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:33:21,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:33:24,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:33:25,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-28 21:33:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:28,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-28 21:33:29,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:33:29,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-28 21:33:29,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:33:34,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:33:36,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:33:36,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:33:39,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-28 21:33:39,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-28 21:33:39,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:33:43,820 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.07 vs. limit=10.0 2023-09-28 21:33:45,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-28 21:33:48,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-28 21:33:49,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:33:53,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 21:33:55,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:33:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:33:56,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:34:01,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:34:04,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-28 21:34:06,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-28 21:34:06,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-28 21:34:07,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.205e+02 2.391e+02 2.864e+02 4.460e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-28 21:34:07,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:07,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:34:11,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-28 21:34:14,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:34:17,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:34:17,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:17,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:17,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:34:21,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-28 21:34:23,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:34:23,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-28 21:34:23,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-28 21:34:23,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=148920.0, ans=0.125 2023-09-28 21:34:24,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:34:27,612 INFO [train.py:1039] (3/4) Epoch 5, batch 1100, loss[loss=0.2637, simple_loss=0.3339, pruned_loss=0.09676, over 24559.00 frames. ], tot_loss[loss=0.2521, simple_loss=0.3085, pruned_loss=0.09788, over 4674630.21 frames. ], batch size: 71, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:34:29,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:34:34,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:34:40,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:34:40,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=148986.66666666666, ans=0.0 2023-09-28 21:34:41,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:34:41,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:43,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-28 21:34:44,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:34:47,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-28 21:34:48,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=149053.33333333334, ans=0.125 2023-09-28 21:34:49,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:34:51,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:34:52,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-28 21:34:54,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:34:56,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:34:56,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:34:58,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:34:59,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-28 21:35:05,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:35:07,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=149120.0, ans=0.2 2023-09-28 21:35:08,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-28 21:35:11,144 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-28 21:35:11,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:15,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:35:15,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=149186.66666666666, ans=0.125 2023-09-28 21:35:17,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:35:17,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-28 21:35:18,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:35:18,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:35:18,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:35:20,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:20,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-28 21:35:20,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=149186.66666666666, ans=0.125 2023-09-28 21:35:20,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=149186.66666666666, ans=0.125 2023-09-28 21:35:26,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:35:26,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-28 21:35:28,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:35:31,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=149253.33333333334, ans=0.0 2023-09-28 21:35:33,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:35:35,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-28 21:35:35,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-28 21:35:36,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=149253.33333333334, ans=0.125 2023-09-28 21:35:36,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=149253.33333333334, ans=0.0 2023-09-28 21:35:37,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:35:40,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:35:41,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:41,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-28 21:35:43,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.83 vs. limit=5.0 2023-09-28 21:35:43,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:35:45,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:35:45,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-28 21:35:45,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:35:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-28 21:35:48,570 INFO [train.py:1039] (3/4) Epoch 5, batch 1150, loss[loss=0.2707, simple_loss=0.3152, pruned_loss=0.1131, over 23347.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3086, pruned_loss=0.09735, over 4685633.52 frames. ], batch size: 285, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:35:48,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:35:48,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:35:50,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:35:56,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:35:56,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=149320.0, ans=0.0 2023-09-28 21:35:57,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:35:59,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:01,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:36:01,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-28 21:36:01,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:05,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-28 21:36:05,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:05,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:36:08,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=149386.66666666666, ans=0.95 2023-09-28 21:36:12,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-28 21:36:14,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:17,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:36:19,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:19,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-28 21:36:19,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:36:21,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:36:24,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-28 21:36:26,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:36:28,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:36:34,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=149453.33333333334, ans=0.125 2023-09-28 21:36:41,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:36:47,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-28 21:36:48,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:48,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:36:50,956 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.787e+02 2.166e+02 2.435e+02 2.809e+02 4.003e+02, threshold=4.871e+02, percent-clipped=0.0 2023-09-28 21:36:52,867 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-28 21:36:53,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=149586.66666666666, ans=0.125 2023-09-28 21:36:54,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:04,059 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-28 21:37:07,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:07,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:37:09,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:37:09,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:37:09,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=149653.33333333334, ans=0.125 2023-09-28 21:37:10,883 INFO [train.py:1039] (3/4) Epoch 5, batch 1200, loss[loss=0.2733, simple_loss=0.322, pruned_loss=0.1123, over 23550.00 frames. ], tot_loss[loss=0.2513, simple_loss=0.309, pruned_loss=0.09681, over 4704334.93 frames. ], batch size: 256, lr: 2.06e-02, grad_scale: 32.0 2023-09-28 21:37:11,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=149653.33333333334, ans=0.0 2023-09-28 21:37:13,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:20,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:37:20,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:37:22,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:22,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:22,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:37:25,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:37:28,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:37:30,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:37:30,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:37:31,984 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-28 21:37:35,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-28 21:37:38,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:37:41,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:37:43,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:37:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:37:47,365 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-28 21:37:48,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:37:55,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-28 21:37:55,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:37:55,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-28 21:37:57,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:38:00,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-28 21:38:00,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=149853.33333333334, ans=0.0 2023-09-28 21:38:04,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-28 21:38:04,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:38:06,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:38:09,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:09,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:38:11,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:38:11,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:38:12,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:38:13,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-28 21:38:14,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:38:14,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:14,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:38:18,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:18,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:38:23,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:38:24,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:38:28,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-28 21:38:31,827 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-28 21:38:33,133 INFO [train.py:1039] (3/4) Epoch 5, batch 1250, loss[loss=0.2565, simple_loss=0.3254, pruned_loss=0.0938, over 24465.00 frames. ], tot_loss[loss=0.2515, simple_loss=0.3092, pruned_loss=0.09687, over 4705345.52 frames. ], batch size: 66, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:38:34,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:38:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:38:37,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:38:39,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:38:41,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-28 21:38:42,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=149986.66666666666, ans=0.125 2023-09-28 21:38:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:38:46,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:47,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-28 21:38:48,430 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.52 vs. limit=22.5 2023-09-28 21:38:49,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:38:51,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:38:56,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 21:38:56,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:38:57,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:38:57,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:00,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:39:03,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 21:39:03,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:03,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:04,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:39:06,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:09,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:12,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:39:12,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=150120.0, ans=0.2 2023-09-28 21:39:16,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-28 21:39:17,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:39:20,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-28 21:39:20,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:39:20,951 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-28 21:39:22,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:22,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:26,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:39:29,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:39:31,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-28 21:39:31,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-28 21:39:32,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-28 21:39:36,086 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.883e+02 2.272e+02 2.528e+02 2.863e+02 4.623e+02, threshold=5.057e+02, percent-clipped=0.0 2023-09-28 21:39:36,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:39:37,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-28 21:39:37,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:39:39,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:39:40,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:39:41,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-28 21:39:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-28 21:39:42,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:39:42,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-28 21:39:42,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:39:46,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-28 21:39:47,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:50,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:39:52,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:39:53,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:39:55,815 INFO [train.py:1039] (3/4) Epoch 5, batch 1300, loss[loss=0.2782, simple_loss=0.3337, pruned_loss=0.1113, over 24049.00 frames. ], tot_loss[loss=0.2544, simple_loss=0.3115, pruned_loss=0.09862, over 4695841.60 frames. ], batch size: 80, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:39:57,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:39:57,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-28 21:40:03,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:05,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=150320.0, ans=0.125 2023-09-28 21:40:06,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-28 21:40:06,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:07,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=150320.0, ans=0.125 2023-09-28 21:40:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:40:10,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:40:10,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-28 21:40:16,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:40:17,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:40:20,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-28 21:40:22,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:40:26,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:27,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:30,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:40:30,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:40:32,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:40:32,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-28 21:40:34,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-28 21:40:39,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:40:39,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:40:41,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=150453.33333333334, ans=0.125 2023-09-28 21:40:42,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-28 21:40:42,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 21:40:45,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:40:48,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:40:48,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-28 21:40:48,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:48,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-28 21:40:51,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:40:53,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:40:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:40:58,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-28 21:40:59,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-28 21:41:01,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-28 21:41:06,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:41:09,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-28 21:41:11,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:13,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=150586.66666666666, ans=0.0 2023-09-28 21:41:18,065 INFO [train.py:1039] (3/4) Epoch 5, batch 1350, loss[loss=0.2566, simple_loss=0.2993, pruned_loss=0.1069, over 23766.00 frames. ], tot_loss[loss=0.2528, simple_loss=0.3105, pruned_loss=0.0976, over 4705055.56 frames. ], batch size: 212, lr: 2.05e-02, grad_scale: 32.0 2023-09-28 21:41:18,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-28 21:41:21,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:24,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:27,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:41:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:41:31,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:41:31,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:35,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:41:38,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-28 21:41:41,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:41:41,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:41:43,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-28 21:41:43,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:41:46,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:41:46,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-28 21:41:47,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-28 21:41:50,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.48 vs. limit=12.0 2023-09-28 21:41:51,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-28 21:41:52,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:41:52,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-28 21:42:04,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:12,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=150853.33333333334, ans=0.125 2023-09-28 21:42:15,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:42:15,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:16,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-28 21:42:16,238 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:42:19,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:20,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-28 21:42:20,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-28 21:42:22,128 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.340e+02 2.561e+02 2.889e+02 4.488e+02, threshold=5.123e+02, percent-clipped=0.0 2023-09-28 21:42:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:42:25,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:42:27,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-28 21:42:30,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:42:35,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-28 21:42:36,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-28 21:42:40,569 INFO [train.py:1039] (3/4) Epoch 5, batch 1400, loss[loss=0.2361, simple_loss=0.3097, pruned_loss=0.08128, over 24674.00 frames. ], tot_loss[loss=0.2516, simple_loss=0.3095, pruned_loss=0.09684, over 4705914.96 frames. ], batch size: 65, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:42:43,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-28 21:42:45,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:42:47,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:42:49,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:42:55,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-28 21:42:57,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-28 21:43:05,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=151053.33333333334, ans=0.0 2023-09-28 21:43:08,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:43:09,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:11,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:43:11,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-28 21:43:16,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:43:16,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 21:43:22,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=151120.0, ans=0.2 2023-09-28 21:43:27,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:27,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:27,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=151120.0, ans=0.0 2023-09-28 21:43:32,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-28 21:43:32,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:43:32,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:43:33,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:43:35,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:43:35,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:43:37,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:43:37,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:43:38,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-28 21:43:38,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:43:39,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=151186.66666666666, ans=0.125 2023-09-28 21:43:43,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:43:48,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:43:51,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=151253.33333333334, ans=0.125 2023-09-28 21:43:55,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=151253.33333333334, ans=0.1 2023-09-28 21:43:56,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-28 21:43:58,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 21:43:58,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:43:59,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.50 vs. limit=22.5 2023-09-28 21:44:01,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 21:44:02,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:03,461 INFO [train.py:1039] (3/4) Epoch 5, batch 1450, loss[loss=0.248, simple_loss=0.308, pruned_loss=0.09403, over 23635.00 frames. ], tot_loss[loss=0.2493, simple_loss=0.3072, pruned_loss=0.09572, over 4689071.50 frames. ], batch size: 149, lr: 2.05e-02, grad_scale: 16.0 2023-09-28 21:44:05,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:44:08,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:44:10,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:44:10,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:10,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-28 21:44:14,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:16,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 21:44:16,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:44:17,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-28 21:44:19,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:44:19,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-28 21:44:19,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:20,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:20,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-28 21:44:24,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:24,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:44:25,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 21:44:26,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:26,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:44:29,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:31,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:34,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:44:34,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:44:35,079 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:44:37,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:44:37,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:38,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.99 vs. limit=22.5 2023-09-28 21:44:40,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:44:41,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:44:41,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:44:41,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:44:44,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-28 21:44:45,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=12.0 2023-09-28 21:44:46,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=151453.33333333334, ans=0.125 2023-09-28 21:44:47,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:44:48,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_ff2.min_abs, batch_count=151453.33333333334, ans=0.1 2023-09-28 21:44:52,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-28 21:44:53,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:44:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:44:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:44:59,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-28 21:45:04,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:06,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-28 21:45:07,707 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.862e+02 2.279e+02 2.648e+02 3.024e+02 3.849e+02, threshold=5.296e+02, percent-clipped=0.0 2023-09-28 21:45:07,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-28 21:45:09,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:11,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:12,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:45:14,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-28 21:45:17,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-28 21:45:17,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-28 21:45:19,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:20,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 21:45:24,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=151653.33333333334, ans=0.0 2023-09-28 21:45:25,567 INFO [train.py:1039] (3/4) Epoch 5, batch 1500, loss[loss=0.2472, simple_loss=0.3138, pruned_loss=0.09032, over 24373.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3076, pruned_loss=0.09554, over 4708516.73 frames. ], batch size: 77, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:45:31,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-28 21:45:31,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:45:31,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:45:31,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:45:32,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=151653.33333333334, ans=0.0 2023-09-28 21:45:33,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:35,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:45:37,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-28 21:45:38,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=13.80 vs. limit=15.0 2023-09-28 21:45:39,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:45:40,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:45:40,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:45:42,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:45:43,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:45:44,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:45:50,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-28 21:45:51,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:45:51,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:45:53,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:45:56,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-28 21:46:00,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-28 21:46:01,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:01,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-28 21:46:04,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:46:06,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:08,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:46:08,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:09,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-28 21:46:09,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:46:09,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:10,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=151786.66666666666, ans=0.0 2023-09-28 21:46:11,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-28 21:46:11,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:46:14,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=151853.33333333334, ans=0.0 2023-09-28 21:46:16,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=151853.33333333334, ans=0.125 2023-09-28 21:46:18,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:46:18,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-28 21:46:22,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 21:46:24,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:46:25,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=22.5 2023-09-28 21:46:29,583 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-28 21:46:30,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:30,918 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-28 21:46:32,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:46:34,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:46:34,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-28 21:46:35,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-28 21:46:39,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-28 21:46:40,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:44,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:44,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:46:44,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=151920.0, ans=0.125 2023-09-28 21:46:45,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:46:45,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:46:47,872 INFO [train.py:1039] (3/4) Epoch 5, batch 1550, loss[loss=0.2433, simple_loss=0.316, pruned_loss=0.08529, over 24646.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3095, pruned_loss=0.09689, over 4709582.43 frames. ], batch size: 68, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:46:48,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-28 21:46:49,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-28 21:46:49,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:46:51,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-28 21:46:52,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-28 21:46:54,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:46:55,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:56,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:46:56,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:46:57,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:46:57,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=151986.66666666666, ans=0.125 2023-09-28 21:47:02,075 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-28 21:47:02,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:02,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:47:02,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 21:47:05,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:47:05,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-28 21:47:07,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:47:08,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-28 21:47:08,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-28 21:47:08,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-28 21:47:08,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:12,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:17,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:47:19,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=152120.0, ans=0.0 2023-09-28 21:47:20,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-28 21:47:20,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-28 21:47:27,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:30,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:47:31,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-28 21:47:32,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:47:32,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-28 21:47:32,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=152120.0, ans=0.5 2023-09-28 21:47:38,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 21:47:39,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:42,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:47:45,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:47:47,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:47:47,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-28 21:47:47,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:47:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:47:50,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:47:51,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-28 21:47:52,253 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.772e+02 2.346e+02 2.949e+02 3.489e+02 5.626e+02, threshold=5.898e+02, percent-clipped=1.0 2023-09-28 21:47:52,395 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-28 21:47:54,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:47:59,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-28 21:48:05,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:06,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:08,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-28 21:48:09,967 INFO [train.py:1039] (3/4) Epoch 5, batch 1600, loss[loss=0.2377, simple_loss=0.2964, pruned_loss=0.08952, over 22696.00 frames. ], tot_loss[loss=0.2524, simple_loss=0.3101, pruned_loss=0.09736, over 4710062.69 frames. ], batch size: 50, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:48:10,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:48:12,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:48:12,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:48:12,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:48:13,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:48:18,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:19,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-28 21:48:19,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-28 21:48:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-28 21:48:25,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:48:26,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-28 21:48:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:48:30,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:48:35,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:48:38,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-28 21:48:43,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:48:44,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-28 21:48:44,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:48:44,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-28 21:48:49,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-28 21:48:51,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=152453.33333333334, ans=0.0 2023-09-28 21:48:58,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:48:59,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-28 21:49:00,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:49:00,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:00,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:49:01,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-28 21:49:07,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 21:49:08,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:49:08,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:09,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-28 21:49:13,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:49:13,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:49:16,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:49:19,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=152586.66666666666, ans=0.5 2023-09-28 21:49:22,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:23,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:49:27,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-28 21:49:27,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:49:28,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.23 vs. limit=10.0 2023-09-28 21:49:29,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-28 21:49:32,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.27 vs. limit=15.0 2023-09-28 21:49:32,723 INFO [train.py:1039] (3/4) Epoch 5, batch 1650, loss[loss=0.3295, simple_loss=0.3588, pruned_loss=0.1501, over 19639.00 frames. ], tot_loss[loss=0.254, simple_loss=0.3119, pruned_loss=0.0981, over 4714181.38 frames. ], batch size: 388, lr: 2.04e-02, grad_scale: 32.0 2023-09-28 21:49:33,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=152653.33333333334, ans=0.125 2023-09-28 21:49:34,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:35,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:49:37,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:49:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-28 21:49:37,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-28 21:49:37,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-28 21:49:37,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-28 21:49:40,420 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.84 vs. limit=22.5 2023-09-28 21:49:43,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:49:44,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:44,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:49:45,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:49:46,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:49:46,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=152653.33333333334, ans=0.125 2023-09-28 21:49:50,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-28 21:49:52,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:49:52,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:49:52,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:49:52,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:49:53,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-28 21:49:53,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-28 21:49:54,718 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.91 vs. limit=15.0 2023-09-28 21:49:58,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:50:02,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-28 21:50:10,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-28 21:50:12,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:12,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=152786.66666666666, ans=0.1 2023-09-28 21:50:16,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-28 21:50:19,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:22,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:50:22,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:50:22,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:23,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:50:23,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:27,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:27,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:28,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:28,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:30,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:30,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 21:50:33,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:50:35,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-28 21:50:36,173 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.210e+02 2.496e+02 2.822e+02 4.651e+02, threshold=4.993e+02, percent-clipped=0.0 2023-09-28 21:50:38,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:50:38,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-28 21:50:38,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-28 21:50:40,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-28 21:50:40,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:50:40,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=152920.0, ans=0.125 2023-09-28 21:50:41,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:50:41,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:41,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:50:41,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-28 21:50:45,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:50:46,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:50:47,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:50,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-28 21:50:54,699 INFO [train.py:1039] (3/4) Epoch 5, batch 1700, loss[loss=0.2241, simple_loss=0.2998, pruned_loss=0.07419, over 24639.00 frames. ], tot_loss[loss=0.2527, simple_loss=0.3101, pruned_loss=0.09761, over 4710249.77 frames. ], batch size: 68, lr: 2.04e-02, grad_scale: 16.0 2023-09-28 21:50:56,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:50:56,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:50:56,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-28 21:50:56,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=152986.66666666666, ans=0.05 2023-09-28 21:50:57,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:50:58,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:50:58,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:50:59,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:50:59,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:51:01,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-28 21:51:04,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 21:51:09,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=153053.33333333334, ans=0.1 2023-09-28 21:51:13,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:51:15,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:51:19,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=153053.33333333334, ans=0.125 2023-09-28 21:51:19,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=153053.33333333334, ans=0.0 2023-09-28 21:51:22,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:51:22,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:24,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:51:24,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:26,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-28 21:51:26,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=153120.0, ans=0.1 2023-09-28 21:51:28,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:51:28,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:29,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-28 21:51:31,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-28 21:51:34,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-28 21:51:34,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-28 21:51:35,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:39,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-28 21:51:39,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:51:40,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.23 vs. limit=15.0 2023-09-28 21:51:44,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=153186.66666666666, ans=0.125 2023-09-28 21:51:48,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:51:49,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:51:50,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:51:53,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-28 21:51:54,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-28 21:51:54,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:51:57,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:57,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-28 21:51:58,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:51:58,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:51:58,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:51:59,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:51:59,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=153253.33333333334, ans=0.0 2023-09-28 21:52:00,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:00,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:52:02,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:02,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:52:02,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:05,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:05,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-28 21:52:09,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:52:10,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:52:11,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=153253.33333333334, ans=0.0 2023-09-28 21:52:12,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-28 21:52:15,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=153320.0, ans=0.0 2023-09-28 21:52:16,903 INFO [train.py:1039] (3/4) Epoch 5, batch 1750, loss[loss=0.2309, simple_loss=0.2894, pruned_loss=0.08617, over 24305.00 frames. ], tot_loss[loss=0.2511, simple_loss=0.3076, pruned_loss=0.09726, over 4697132.70 frames. ], batch size: 56, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:52:20,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:22,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:22,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-28 21:52:25,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-28 21:52:25,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:52:27,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:52:27,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:52:30,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-28 21:52:34,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:52:36,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-28 21:52:36,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:52:38,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:52:42,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:52:44,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-28 21:52:44,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:52:45,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-28 21:52:48,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.32 vs. limit=15.0 2023-09-28 21:52:54,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:52:57,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:52:57,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:02,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:02,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:53:04,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-09-28 21:53:05,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:05,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:09,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:09,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:53:10,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-28 21:53:13,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-09-28 21:53:13,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:53:17,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-28 21:53:18,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:20,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:21,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:53:23,251 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.199e+02 2.496e+02 2.934e+02 4.192e+02, threshold=4.992e+02, percent-clipped=0.0 2023-09-28 21:53:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:53:25,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-28 21:53:26,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:26,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=153586.66666666666, ans=0.125 2023-09-28 21:53:28,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:53:31,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:53:34,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:53:36,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:53:36,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-28 21:53:36,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:38,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-28 21:53:38,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:53:38,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-28 21:53:38,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:53:39,951 INFO [train.py:1039] (3/4) Epoch 5, batch 1800, loss[loss=0.2234, simple_loss=0.2898, pruned_loss=0.07854, over 24438.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3059, pruned_loss=0.09621, over 4690149.62 frames. ], batch size: 58, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:53:40,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:53:42,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 21:53:43,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:53:45,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 21:53:45,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=153653.33333333334, ans=0.125 2023-09-28 21:53:48,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:53:52,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 21:53:53,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:53:56,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:01,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:01,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:02,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:54:06,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-28 21:54:06,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-28 21:54:06,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:08,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:13,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-28 21:54:16,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-28 21:54:16,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-28 21:54:18,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:18,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:54:18,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:54:19,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-28 21:54:26,502 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-28 21:54:26,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=153786.66666666666, ans=0.1 2023-09-28 21:54:28,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:54:28,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:54:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-28 21:54:31,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-28 21:54:32,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-28 21:54:32,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:54:33,598 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.29 vs. limit=15.0 2023-09-28 21:54:34,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 21:54:39,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-28 21:54:44,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:54:46,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-28 21:54:46,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:54:46,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:47,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:54:47,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-28 21:54:51,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:54:51,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:54:54,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=153920.0, ans=0.04949747468305833 2023-09-28 21:54:55,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-28 21:54:55,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:54:58,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:54:59,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:54:59,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:55:01,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 21:55:02,710 INFO [train.py:1039] (3/4) Epoch 5, batch 1850, loss[loss=0.2484, simple_loss=0.3168, pruned_loss=0.09001, over 24648.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.307, pruned_loss=0.09589, over 4709804.01 frames. ], batch size: 73, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:55:04,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:55:04,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:07,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:55:07,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:15,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:55:15,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-28 21:55:19,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-28 21:55:21,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-28 21:55:21,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=12.0 2023-09-28 21:55:26,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:55:26,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-28 21:55:26,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 21:55:31,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=154053.33333333334, ans=0.0 2023-09-28 21:55:35,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.39 vs. limit=6.0 2023-09-28 21:55:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-28 21:55:37,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-28 21:55:40,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:55:40,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:55:40,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=154120.0, ans=0.125 2023-09-28 21:55:41,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=154120.0, ans=0.125 2023-09-28 21:55:44,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-28 21:55:44,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:46,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:55:47,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:55:49,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 21:55:52,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:55:54,418 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-28 21:55:57,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-28 21:55:57,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:55:58,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 21:55:58,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:00,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:02,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 21:56:06,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-28 21:56:08,701 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.275e+02 2.646e+02 3.136e+02 5.874e+02, threshold=5.291e+02, percent-clipped=3.0 2023-09-28 21:56:08,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:56:13,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-28 21:56:13,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 21:56:13,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-28 21:56:13,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-28 21:56:16,375 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-28 21:56:16,508 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-28 21:56:18,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:56:20,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:56:20,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:20,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:21,547 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-28 21:56:22,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:56:23,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:24,334 INFO [train.py:1039] (3/4) Epoch 5, batch 1900, loss[loss=0.247, simple_loss=0.3029, pruned_loss=0.09557, over 23780.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.308, pruned_loss=0.09582, over 4715356.00 frames. ], batch size: 179, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:56:24,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-28 21:56:26,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:56:27,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:56:28,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-28 21:56:31,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:56:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-28 21:56:31,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 21:56:32,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:35,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:56:39,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 21:56:41,068 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-28 21:56:42,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-28 21:56:44,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-28 21:56:46,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:56:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-28 21:56:46,208 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-28 21:56:48,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=154386.66666666666, ans=0.125 2023-09-28 21:56:49,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-28 21:56:50,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:56:55,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-28 21:56:55,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=154453.33333333334, ans=0.0 2023-09-28 21:56:58,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-28 21:57:05,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-28 21:57:07,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=154453.33333333334, ans=0.1 2023-09-28 21:57:08,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-28 21:57:08,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:10,079 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-28 21:57:10,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-28 21:57:11,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-28 21:57:12,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-28 21:57:12,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:57:15,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-28 21:57:20,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 21:57:22,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:22,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-28 21:57:25,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 21:57:26,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-28 21:57:28,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:28,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=154586.66666666666, ans=0.125 2023-09-28 21:57:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 21:57:35,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:57:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:57:36,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:57:38,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 21:57:38,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-28 21:57:39,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-28 21:57:42,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:42,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:57:46,299 INFO [train.py:1039] (3/4) Epoch 5, batch 1950, loss[loss=0.2676, simple_loss=0.3079, pruned_loss=0.1137, over 23772.00 frames. ], tot_loss[loss=0.251, simple_loss=0.3094, pruned_loss=0.09636, over 4725870.90 frames. ], batch size: 164, lr: 2.03e-02, grad_scale: 16.0 2023-09-28 21:57:46,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 21:57:46,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:57:46,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-28 21:57:48,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:57:48,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=154653.33333333334, ans=0.125 2023-09-28 21:57:51,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:57:53,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-28 21:57:55,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:55,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:57:58,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-28 21:57:58,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 21:57:59,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:57:59,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:01,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:58:03,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:03,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:06,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:09,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 21:58:09,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 21:58:09,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 21:58:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:14,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:17,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-28 21:58:17,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:17,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-28 21:58:17,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-28 21:58:19,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:58:19,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:58:21,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:58:24,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:58:29,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:58:32,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 21:58:32,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=154786.66666666666, ans=0.0 2023-09-28 21:58:35,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 21:58:35,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:58:37,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-28 21:58:37,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:58:40,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=154853.33333333334, ans=0.04949747468305833 2023-09-28 21:58:40,391 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.11 vs. limit=22.5 2023-09-28 21:58:41,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:58:41,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-28 21:58:42,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:58:50,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:52,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.827e+02 2.269e+02 2.574e+02 2.905e+02 4.607e+02, threshold=5.149e+02, percent-clipped=0.0 2023-09-28 21:58:52,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:55,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:58:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:00,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-28 21:59:01,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 21:59:02,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-28 21:59:02,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 21:59:03,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:04,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-28 21:59:06,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:08,890 INFO [train.py:1039] (3/4) Epoch 5, batch 2000, loss[loss=0.2712, simple_loss=0.3267, pruned_loss=0.1078, over 23339.00 frames. ], tot_loss[loss=0.2516, simple_loss=0.31, pruned_loss=0.09664, over 4733846.88 frames. ], batch size: 93, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 21:59:10,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-28 21:59:12,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 21:59:12,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 21:59:15,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 21:59:17,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:18,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-28 21:59:20,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-28 21:59:23,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-28 21:59:25,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-28 21:59:27,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 21:59:27,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 21:59:30,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-28 21:59:30,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-28 21:59:33,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:35,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:37,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-28 21:59:37,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 21:59:38,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-28 21:59:38,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:40,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=155120.0, ans=0.0 2023-09-28 21:59:41,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 21:59:43,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-28 21:59:43,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 21:59:45,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-28 21:59:46,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-28 21:59:46,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-28 21:59:48,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-28 21:59:48,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-28 21:59:48,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-28 21:59:55,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 21:59:58,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 21:59:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 21:59:58,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:00:01,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:01,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:01,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:00:01,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:03,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:05,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:00:05,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=155186.66666666666, ans=0.125 2023-09-28 22:00:06,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-28 22:00:09,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=155186.66666666666, ans=0.125 2023-09-28 22:00:13,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:00:14,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:15,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=155253.33333333334, ans=0.2 2023-09-28 22:00:18,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:18,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:00:21,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:21,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=155253.33333333334, ans=0.09899494936611666 2023-09-28 22:00:23,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:23,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:24,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:00:24,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:00:26,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=155253.33333333334, ans=0.0 2023-09-28 22:00:27,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:29,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:31,034 INFO [train.py:1039] (3/4) Epoch 5, batch 2050, loss[loss=0.2513, simple_loss=0.3037, pruned_loss=0.0995, over 23602.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.3094, pruned_loss=0.09622, over 4731694.93 frames. ], batch size: 149, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:00:31,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:00:32,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:37,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:00:40,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:00:40,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:00:41,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:00:43,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-28 22:00:43,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:00:43,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=155320.0, ans=0.0 2023-09-28 22:00:44,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:00:44,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:00:48,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=155386.66666666666, ans=0.95 2023-09-28 22:00:54,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:00:54,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:00:58,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=155386.66666666666, ans=0.0 2023-09-28 22:00:59,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-28 22:01:02,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:01:04,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-28 22:01:04,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:01:08,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:10,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:11,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:01:13,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:01:14,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:01:16,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:01:16,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:01:19,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:21,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:01:24,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:01:24,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:01:28,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:32,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=155520.0, ans=0.2 2023-09-28 22:01:33,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:01:34,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-28 22:01:36,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.377e+02 2.692e+02 3.102e+02 5.014e+02, threshold=5.385e+02, percent-clipped=0.0 2023-09-28 22:01:41,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:42,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:01:45,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:01:47,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-28 22:01:49,244 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-28 22:01:49,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:01:49,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:01:51,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:01:52,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:01:52,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-28 22:01:52,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-28 22:01:53,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=4.15 vs. limit=12.0 2023-09-28 22:01:54,314 INFO [train.py:1039] (3/4) Epoch 5, batch 2100, loss[loss=0.2243, simple_loss=0.2477, pruned_loss=0.1004, over 19189.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3077, pruned_loss=0.09535, over 4733135.29 frames. ], batch size: 388, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:01:54,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:01:57,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:01:59,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:02:01,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:01,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:02:01,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-28 22:02:04,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:02:04,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-28 22:02:04,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-28 22:02:05,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:05,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:05,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=155653.33333333334, ans=0.0 2023-09-28 22:02:07,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-28 22:02:07,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:02:15,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-28 22:02:15,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:02:18,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:02:18,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:02:18,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=155720.0, ans=0.1 2023-09-28 22:02:23,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:02:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-28 22:02:24,501 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-09-28 22:02:25,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:25,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 22:02:25,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.19 vs. limit=6.0 2023-09-28 22:02:28,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-28 22:02:28,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:28,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-28 22:02:28,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-28 22:02:28,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-28 22:02:31,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:02:33,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:02:35,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:35,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:02:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:39,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:39,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-28 22:02:39,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:02:39,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:02:41,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:02:43,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-28 22:02:44,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-28 22:02:46,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-28 22:02:50,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:02:53,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=155853.33333333334, ans=0.125 2023-09-28 22:02:53,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=155853.33333333334, ans=0.2 2023-09-28 22:02:54,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:02:55,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-28 22:03:01,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:04,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:03:04,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:05,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-28 22:03:05,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:07,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:07,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:03:09,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:03:10,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:12,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-28 22:03:13,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-28 22:03:15,135 INFO [train.py:1039] (3/4) Epoch 5, batch 2150, loss[loss=0.2368, simple_loss=0.2892, pruned_loss=0.0922, over 23692.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.3072, pruned_loss=0.09501, over 4738324.61 frames. ], batch size: 232, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:03:15,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:18,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:03:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:03:18,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:03:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:03:22,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=155986.66666666666, ans=0.1 2023-09-28 22:03:25,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 22:03:27,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:28,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:30,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:03:30,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:30,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=156053.33333333334, ans=0.2 2023-09-28 22:03:31,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:03:36,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:36,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:03:36,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:03:41,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:41,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-28 22:03:48,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:48,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:03:49,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:03:49,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:03:49,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:03:51,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:03:51,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:03:51,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:03:53,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-28 22:03:54,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:03:55,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:03:56,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:03:57,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:03:58,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:04:01,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:01,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:04:01,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=156120.0, ans=0.1 2023-09-28 22:04:03,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:03,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-28 22:04:03,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:04:06,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:06,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.57 vs. limit=15.0 2023-09-28 22:04:07,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:09,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:04:10,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=156186.66666666666, ans=0.04949747468305833 2023-09-28 22:04:11,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:04:12,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:12,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:12,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-28 22:04:14,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-28 22:04:15,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:04:15,837 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-28 22:04:15,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:17,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:04:19,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-28 22:04:19,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:04:19,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-28 22:04:19,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-28 22:04:19,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-28 22:04:19,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-28 22:04:19,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=156253.33333333334, ans=0.1 2023-09-28 22:04:21,001 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 2.209e+02 2.562e+02 3.022e+02 4.431e+02, threshold=5.124e+02, percent-clipped=0.0 2023-09-28 22:04:22,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:22,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:04:22,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:04:24,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:24,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:04:27,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:27,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:28,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.45 vs. limit=12.0 2023-09-28 22:04:29,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=156253.33333333334, ans=0.1 2023-09-28 22:04:36,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=156320.0, ans=0.0 2023-09-28 22:04:37,520 INFO [train.py:1039] (3/4) Epoch 5, batch 2200, loss[loss=0.2494, simple_loss=0.2995, pruned_loss=0.09964, over 23194.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.3069, pruned_loss=0.0948, over 4734699.92 frames. ], batch size: 105, lr: 2.02e-02, grad_scale: 32.0 2023-09-28 22:04:37,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:04:37,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-28 22:04:38,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=156320.0, ans=0.1 2023-09-28 22:04:42,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:04:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:04:49,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:04:49,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:04:49,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=156320.0, ans=0.0 2023-09-28 22:04:50,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:04:54,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:04:54,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:04:54,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-28 22:05:00,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-28 22:05:01,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:05:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-28 22:05:10,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:11,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:11,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:05:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:05:17,659 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.16 vs. limit=15.0 2023-09-28 22:05:18,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-28 22:05:22,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:05:22,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:05:22,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:05:25,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:05:27,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:28,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:05:30,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:32,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-28 22:05:34,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:36,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-28 22:05:37,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:37,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:05:37,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:05:39,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:05:40,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:05:40,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:40,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:05:42,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:05:42,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:05:42,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=156586.66666666666, ans=0.125 2023-09-28 22:05:46,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:05:49,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 22:05:50,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:05:54,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:05:54,848 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-28 22:05:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:05:57,816 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-28 22:05:59,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:05:59,485 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-28 22:06:00,778 INFO [train.py:1039] (3/4) Epoch 5, batch 2250, loss[loss=0.2637, simple_loss=0.3127, pruned_loss=0.1073, over 23597.00 frames. ], tot_loss[loss=0.2498, simple_loss=0.3082, pruned_loss=0.09572, over 4720365.36 frames. ], batch size: 256, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:06:02,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:02,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:06:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:05,567 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-28 22:06:05,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=156653.33333333334, ans=0.125 2023-09-28 22:06:07,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:06:09,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:11,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=156653.33333333334, ans=10.0 2023-09-28 22:06:14,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:06:16,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:06:17,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:19,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:06:22,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-28 22:06:22,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:22,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:06:26,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-28 22:06:26,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:06:26,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:28,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:06:31,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=156720.0, ans=0.125 2023-09-28 22:06:32,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:34,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:06:34,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:06:35,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-28 22:06:37,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:06:41,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:06:44,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:47,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:06:48,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:06:50,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:06:52,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:06:53,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:06:58,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:07:00,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:07:05,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:07:05,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:07:05,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=156920.0, ans=0.2 2023-09-28 22:07:06,667 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.161e+02 2.544e+02 3.130e+02 4.790e+02, threshold=5.087e+02, percent-clipped=0.0 2023-09-28 22:07:06,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:07:13,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:07:13,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=156920.0, ans=0.0 2023-09-28 22:07:16,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:07:16,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-28 22:07:16,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:18,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:07:20,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=156920.0, ans=0.125 2023-09-28 22:07:21,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-28 22:07:22,984 INFO [train.py:1039] (3/4) Epoch 5, batch 2300, loss[loss=0.2454, simple_loss=0.3133, pruned_loss=0.08874, over 24431.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.3096, pruned_loss=0.0966, over 4727296.69 frames. ], batch size: 69, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:07:24,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:07:24,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:07:31,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:07:35,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-28 22:07:38,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:41,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=157053.33333333334, ans=0.0 2023-09-28 22:07:44,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:07:44,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:07:44,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:07:44,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:07:44,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-28 22:07:47,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:07:50,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:07:50,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:07:55,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:07:58,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:08:01,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:06,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:08:06,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:08:10,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:08:13,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:08:17,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:08:17,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=157186.66666666666, ans=0.0 2023-09-28 22:08:18,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:08:18,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:08:18,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-28 22:08:23,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:08:23,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:24,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:08:24,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:08:26,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:26,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:08:26,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:08:28,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-28 22:08:28,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:08:28,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:08:29,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-28 22:08:30,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=157253.33333333334, ans=0.035 2023-09-28 22:08:31,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=157253.33333333334, ans=0.2 2023-09-28 22:08:33,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=157253.33333333334, ans=0.025 2023-09-28 22:08:35,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:08:39,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:08:45,645 INFO [train.py:1039] (3/4) Epoch 5, batch 2350, loss[loss=0.2599, simple_loss=0.3262, pruned_loss=0.0968, over 24673.00 frames. ], tot_loss[loss=0.2517, simple_loss=0.3105, pruned_loss=0.09651, over 4719634.67 frames. ], batch size: 68, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:08:45,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:08:45,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:08:45,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:08:49,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:08:49,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:08:49,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:08:50,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-28 22:08:57,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:08:57,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-28 22:09:02,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-28 22:09:06,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:09:08,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:08,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:09,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:09,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:10,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-28 22:09:15,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:09:20,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-28 22:09:21,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:09:23,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:09:24,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:09:27,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:09:29,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-28 22:09:31,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:09:33,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:09:33,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:33,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:09:37,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:09:39,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-28 22:09:40,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:09:43,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:09:43,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:09:44,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-28 22:09:44,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:09:49,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-28 22:09:49,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:09:51,234 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.798e+02 2.164e+02 2.489e+02 2.830e+02 4.285e+02, threshold=4.978e+02, percent-clipped=0.0 2023-09-28 22:09:55,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-28 22:09:58,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-28 22:09:59,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:09:59,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:09:59,875 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-28 22:09:59,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-28 22:10:02,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-28 22:10:04,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:10:08,327 INFO [train.py:1039] (3/4) Epoch 5, batch 2400, loss[loss=0.2515, simple_loss=0.3052, pruned_loss=0.09885, over 23412.00 frames. ], tot_loss[loss=0.2514, simple_loss=0.3095, pruned_loss=0.09667, over 4709661.87 frames. ], batch size: 105, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:10:10,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:10:13,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:10:15,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:10:15,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-28 22:10:15,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-28 22:10:15,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=157653.33333333334, ans=0.2 2023-09-28 22:10:24,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:10:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:10:27,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-28 22:10:29,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:10:30,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:31,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-28 22:10:33,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=157720.0, ans=0.0 2023-09-28 22:10:36,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:38,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-28 22:10:43,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:10:48,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-28 22:10:50,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:10:51,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=157786.66666666666, ans=0.0 2023-09-28 22:10:52,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:10:56,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:10:56,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-28 22:10:58,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:11:04,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=157853.33333333334, ans=0.0 2023-09-28 22:11:06,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:08,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:11,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:13,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:11:13,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:11:13,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:11:13,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:14,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:14,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:11:19,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:11:19,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:11:19,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-28 22:11:21,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-28 22:11:23,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:11:23,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=157920.0, ans=0.125 2023-09-28 22:11:24,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:11:25,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-28 22:11:25,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-28 22:11:25,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-28 22:11:25,170 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-28 22:11:26,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-28 22:11:26,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:11:28,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:28,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:31,690 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-28 22:11:31,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:11:33,743 INFO [train.py:1039] (3/4) Epoch 5, batch 2450, loss[loss=0.2287, simple_loss=0.2576, pruned_loss=0.09992, over 19122.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3064, pruned_loss=0.09594, over 4694361.58 frames. ], batch size: 388, lr: 2.01e-02, grad_scale: 32.0 2023-09-28 22:11:33,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:11:37,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:11:37,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:11:40,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:40,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:11:41,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-28 22:11:43,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=157986.66666666666, ans=0.2 2023-09-28 22:11:46,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=157986.66666666666, ans=0.125 2023-09-28 22:11:47,407 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.21 vs. limit=15.0 2023-09-28 22:11:48,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:11:48,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:11:51,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:11:51,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:11:51,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-28 22:11:54,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=158053.33333333334, ans=0.1 2023-09-28 22:11:58,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:11:59,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:11:59,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:12:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:12:05,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:06,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:12:10,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-28 22:12:10,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:12:12,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:15,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=158120.0, ans=0.125 2023-09-28 22:12:18,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:19,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:12:19,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:19,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:12:21,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:21,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:12:21,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=158186.66666666666, ans=0.1 2023-09-28 22:12:23,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-28 22:12:23,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=158186.66666666666, ans=0.125 2023-09-28 22:12:26,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:12:28,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:12:31,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:12:31,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:12:37,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:12:37,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-28 22:12:38,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:12:39,978 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.190e+02 2.639e+02 3.152e+02 5.360e+02, threshold=5.279e+02, percent-clipped=2.0 2023-09-28 22:12:40,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:12:40,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-28 22:12:40,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:12:40,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:12:45,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:12:48,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:12:48,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:12:51,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-28 22:12:53,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:12:55,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.34 vs. limit=15.0 2023-09-28 22:12:55,803 INFO [train.py:1039] (3/4) Epoch 5, batch 2500, loss[loss=0.2668, simple_loss=0.3141, pruned_loss=0.1097, over 23334.00 frames. ], tot_loss[loss=0.2486, simple_loss=0.306, pruned_loss=0.0956, over 4702252.23 frames. ], batch size: 119, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:13:01,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:11,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:13:11,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:13:13,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:13:13,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-28 22:13:20,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:13:21,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:21,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:13:21,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:13:23,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-28 22:13:23,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:24,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:25,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-28 22:13:25,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:26,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-28 22:13:26,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:31,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:13:31,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:13:35,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:13:36,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-28 22:13:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:13:39,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:13:43,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:48,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:13:51,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:13:56,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:13:58,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-28 22:13:58,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:13:58,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:01,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:14:01,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:14:01,754 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-28 22:14:01,755 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-28 22:14:01,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-28 22:14:03,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=158586.66666666666, ans=0.0 2023-09-28 22:14:05,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=158586.66666666666, ans=0.125 2023-09-28 22:14:06,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:09,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-28 22:14:09,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-28 22:14:11,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:14:11,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-28 22:14:16,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-28 22:14:20,167 INFO [train.py:1039] (3/4) Epoch 5, batch 2550, loss[loss=0.2331, simple_loss=0.3073, pruned_loss=0.07948, over 24669.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.306, pruned_loss=0.09568, over 4700123.05 frames. ], batch size: 68, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:14:20,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:21,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:14:23,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:14:25,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:14:26,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-28 22:14:26,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=158653.33333333334, ans=0.1 2023-09-28 22:14:28,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:14:31,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-28 22:14:32,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:14:34,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:37,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:14:37,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 22:14:37,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:14:38,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:14:38,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:14:43,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:14:43,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-28 22:14:43,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:14:43,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:14:43,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-28 22:14:59,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:15:04,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:04,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:04,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:15:07,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:15:09,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=158853.33333333334, ans=0.0 2023-09-28 22:15:12,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:15:14,799 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.77 vs. limit=15.0 2023-09-28 22:15:16,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.90 vs. limit=15.0 2023-09-28 22:15:16,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:15:17,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:15:17,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:15:17,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:15:17,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:15:17,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=158853.33333333334, ans=0.0 2023-09-28 22:15:21,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:21,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:24,943 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.838e+02 2.422e+02 2.826e+02 3.525e+02 6.917e+02, threshold=5.653e+02, percent-clipped=3.0 2023-09-28 22:15:29,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:15:30,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-28 22:15:30,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:15:30,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:15:30,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:15:33,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:15:34,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:39,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:15:42,660 INFO [train.py:1039] (3/4) Epoch 5, batch 2600, loss[loss=0.2552, simple_loss=0.3115, pruned_loss=0.09945, over 23315.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.3077, pruned_loss=0.09638, over 4690984.18 frames. ], batch size: 105, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:15:42,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:15:45,777 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-28 22:15:47,475 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-28 22:15:47,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:15:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-28 22:15:48,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-28 22:15:49,012 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-28 22:15:52,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:15:52,098 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-28 22:15:53,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-28 22:15:55,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-28 22:15:56,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:15:57,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.19 vs. limit=6.0 2023-09-28 22:15:58,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-28 22:16:01,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-28 22:16:02,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.76 vs. limit=10.0 2023-09-28 22:16:02,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:16:02,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-28 22:16:04,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-28 22:16:04,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-28 22:16:14,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=159120.0, ans=0.2 2023-09-28 22:16:15,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:15,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:16,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-28 22:16:19,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:16:20,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=159120.0, ans=0.125 2023-09-28 22:16:23,908 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-28 22:16:28,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:28,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:16:30,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-28 22:16:31,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:31,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:16:33,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-28 22:16:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:16:37,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:16:38,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,545 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-28 22:16:42,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:16:42,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:16:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:16:48,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:16:50,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-28 22:16:50,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:16:52,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.58 vs. limit=22.5 2023-09-28 22:16:53,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:16:53,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:16:59,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-28 22:17:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:02,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:17:04,013 INFO [train.py:1039] (3/4) Epoch 5, batch 2650, loss[loss=0.2447, simple_loss=0.2981, pruned_loss=0.09567, over 24441.00 frames. ], tot_loss[loss=0.2507, simple_loss=0.308, pruned_loss=0.0967, over 4702115.46 frames. ], batch size: 58, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:17:09,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-28 22:17:09,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:09,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=159320.0, ans=0.1 2023-09-28 22:17:10,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:17:12,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-28 22:17:12,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:15,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:17,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:17:17,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=159320.0, ans=0.0 2023-09-28 22:17:18,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:17:21,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:17:22,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-28 22:17:22,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:17:22,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:17:25,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-28 22:17:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-28 22:17:31,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:32,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-28 22:17:32,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:17:32,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-28 22:17:38,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:38,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:17:38,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:39,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:42,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-28 22:17:42,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-28 22:17:45,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=159453.33333333334, ans=0.1 2023-09-28 22:17:47,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:17:50,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-28 22:17:52,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:17:52,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:17:52,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:17:53,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:53,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:17:55,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:17:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:17:59,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:17:59,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:18:02,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:18:03,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=159520.0, ans=0.0 2023-09-28 22:18:03,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=159520.0, ans=0.125 2023-09-28 22:18:04,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:05,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:18:05,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:08,808 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.817e+02 2.225e+02 2.541e+02 3.251e+02 5.495e+02, threshold=5.083e+02, percent-clipped=0.0 2023-09-28 22:18:08,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:18:08,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:18:12,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:12,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=159586.66666666666, ans=0.125 2023-09-28 22:18:13,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:18:13,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:13,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-28 22:18:17,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:18:20,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:22,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:24,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:24,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:18:25,832 INFO [train.py:1039] (3/4) Epoch 5, batch 2700, loss[loss=0.2605, simple_loss=0.33, pruned_loss=0.09553, over 24465.00 frames. ], tot_loss[loss=0.251, simple_loss=0.3086, pruned_loss=0.09672, over 4719993.91 frames. ], batch size: 69, lr: 2.00e-02, grad_scale: 32.0 2023-09-28 22:18:25,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:28,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:18:28,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-28 22:18:31,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:18:32,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:18:33,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.87 vs. limit=15.0 2023-09-28 22:18:35,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:18:35,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:35,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:18:38,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:18:38,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:18:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:18:38,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-28 22:18:38,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-28 22:18:40,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:18:41,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:18:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:18:43,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:18:43,922 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:18:43,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=159720.0, ans=0.0 2023-09-28 22:18:46,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:18:49,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-28 22:18:49,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:18:54,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:18:54,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:18:54,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=159720.0, ans=0.2 2023-09-28 22:19:01,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:19:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:19:01,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=159786.66666666666, ans=0.125 2023-09-28 22:19:01,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=159786.66666666666, ans=0.5 2023-09-28 22:19:01,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=159786.66666666666, ans=0.125 2023-09-28 22:19:02,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:19:02,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:19:03,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.84 vs. limit=15.0 2023-09-28 22:19:06,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:07,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:07,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:19:07,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:19:12,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:12,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:19:20,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:19:21,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:19:24,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:19:24,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:30,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:30,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:31,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:19:33,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:35,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:19:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:19:37,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:19:38,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:38,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:19:41,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-28 22:19:41,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:44,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:19:44,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-28 22:19:47,726 INFO [train.py:1039] (3/4) Epoch 5, batch 2750, loss[loss=0.2446, simple_loss=0.3162, pruned_loss=0.0865, over 24569.00 frames. ], tot_loss[loss=0.2519, simple_loss=0.3098, pruned_loss=0.09697, over 4715781.49 frames. ], batch size: 71, lr: 1.99e-02, grad_scale: 16.0 2023-09-28 22:19:47,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-28 22:19:47,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:19:54,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:19:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:19:57,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:19:57,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:19:59,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:00,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:00,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:20:03,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:20:03,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:03,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-28 22:20:03,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:20:03,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:20:08,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=160053.33333333334, ans=0.125 2023-09-28 22:20:11,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-28 22:20:12,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:20:12,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:14,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:14,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:20:15,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:20:17,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:20:17,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:23,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:20:23,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=160120.0, ans=0.1 2023-09-28 22:20:25,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:20:25,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:20:26,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.07 vs. limit=22.5 2023-09-28 22:20:26,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:28,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:20:36,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:20:38,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:20:38,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:42,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:20:42,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:20:42,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:20:49,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:20:50,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:20:50,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-28 22:20:54,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:20:55,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=160253.33333333334, ans=0.05 2023-09-28 22:20:56,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-28 22:20:57,661 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.252e+02 2.534e+02 3.037e+02 4.293e+02, threshold=5.069e+02, percent-clipped=0.0 2023-09-28 22:20:58,309 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:21:01,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=160253.33333333334, ans=0.07 2023-09-28 22:21:02,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:21:04,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:21:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-28 22:21:05,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:21:08,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:21:08,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-28 22:21:08,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:21:12,431 INFO [train.py:1039] (3/4) Epoch 5, batch 2800, loss[loss=0.2361, simple_loss=0.3125, pruned_loss=0.07982, over 23933.00 frames. ], tot_loss[loss=0.2502, simple_loss=0.309, pruned_loss=0.09569, over 4716167.40 frames. ], batch size: 80, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:21:12,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:21:12,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:14,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:21:14,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-28 22:21:14,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:16,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:18,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:21:18,661 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-28 22:21:18,662 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-28 22:21:21,092 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.88 vs. limit=15.0 2023-09-28 22:21:21,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:25,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:21:25,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:21:28,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:21:30,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=160386.66666666666, ans=0.07 2023-09-28 22:21:31,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-28 22:21:34,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:21:35,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-28 22:21:37,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:37,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:21:37,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:21:40,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:21:41,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=160386.66666666666, ans=0.125 2023-09-28 22:21:42,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:21:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:21:44,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:21:54,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:21:56,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:21:58,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:01,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:22:01,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:04,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:04,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-28 22:22:05,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:06,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:06,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:22:07,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.25 vs. limit=22.5 2023-09-28 22:22:10,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:10,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:13,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:22:15,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:22:15,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:22:17,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:22:17,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:22:17,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:22:19,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-28 22:22:19,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:21,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:22:22,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:25,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-28 22:22:25,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:22:26,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:22:28,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-28 22:22:33,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:22:34,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:22:34,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:22:35,967 INFO [train.py:1039] (3/4) Epoch 5, batch 2850, loss[loss=0.2486, simple_loss=0.269, pruned_loss=0.1141, over 19036.00 frames. ], tot_loss[loss=0.249, simple_loss=0.3078, pruned_loss=0.09511, over 4711820.43 frames. ], batch size: 389, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:22:37,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:42,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:22:42,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:22:43,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:22:44,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=160653.33333333334, ans=0.0 2023-09-28 22:22:45,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:22:45,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:22:47,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:22:47,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-28 22:22:53,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-28 22:22:53,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:22:55,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-28 22:22:55,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=160720.0, ans=0.0 2023-09-28 22:22:56,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:22:59,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-28 22:23:01,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-28 22:23:03,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:15,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:16,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:18,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:23:18,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:23:18,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:23:19,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:23:19,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=160786.66666666666, ans=0.05 2023-09-28 22:23:21,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:23:22,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-28 22:23:25,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:23:25,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:23:27,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:23:27,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:30,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:30,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:23:33,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:34,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:23:37,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:23:38,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:23:40,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:41,936 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.779e+02 2.140e+02 2.376e+02 2.803e+02 4.746e+02, threshold=4.753e+02, percent-clipped=0.0 2023-09-28 22:23:42,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:23:44,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.67 vs. limit=8.0 2023-09-28 22:23:46,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:23:48,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-28 22:23:48,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-28 22:23:49,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:23:50,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:50,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-28 22:23:51,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:23:51,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:23:51,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:53,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:23:53,194 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-28 22:23:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-28 22:23:53,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:23:54,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:23:56,232 INFO [train.py:1039] (3/4) Epoch 5, batch 2900, loss[loss=0.2919, simple_loss=0.3206, pruned_loss=0.1316, over 19321.00 frames. ], tot_loss[loss=0.2483, simple_loss=0.3071, pruned_loss=0.09469, over 4703174.64 frames. ], batch size: 388, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:23:57,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.34 vs. limit=22.5 2023-09-28 22:23:57,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=160986.66666666666, ans=15.0 2023-09-28 22:23:58,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:23:58,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:23:58,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:00,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-28 22:24:06,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:06,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-28 22:24:08,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-28 22:24:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:24:09,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:24:12,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:12,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:24:15,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:24:15,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:24:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:24:18,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-28 22:24:19,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:24:20,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:23,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-28 22:24:24,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-28 22:24:27,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:24:27,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-28 22:24:28,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:24:30,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=161120.0, ans=0.125 2023-09-28 22:24:31,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:24:31,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-28 22:24:34,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:24:34,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:39,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:24:42,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:24:43,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-28 22:24:45,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-28 22:24:45,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:24:48,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:24:51,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-28 22:24:51,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:24:57,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:24:59,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=161186.66666666666, ans=0.0 2023-09-28 22:25:03,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=161253.33333333334, ans=0.1 2023-09-28 22:25:07,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:25:07,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:25:09,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-28 22:25:13,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:13,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-28 22:25:15,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:15,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:25:20,022 INFO [train.py:1039] (3/4) Epoch 5, batch 2950, loss[loss=0.3367, simple_loss=0.362, pruned_loss=0.1557, over 19378.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3075, pruned_loss=0.09536, over 4700294.20 frames. ], batch size: 388, lr: 1.99e-02, grad_scale: 32.0 2023-09-28 22:25:20,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=161320.0, ans=0.0 2023-09-28 22:25:21,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:25:23,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-28 22:25:25,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:25,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:25:26,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer_na.min_abs, batch_count=161320.0, ans=0.02 2023-09-28 22:25:28,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:25:29,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-28 22:25:29,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-28 22:25:31,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:25:31,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:25:34,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.29 vs. limit=15.0 2023-09-28 22:25:38,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:25:40,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:25:45,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:25:45,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:25:45,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=161386.66666666666, ans=0.2 2023-09-28 22:25:49,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:25:49,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:25:49,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:25:51,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:25:56,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-28 22:25:57,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-28 22:25:59,152 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-28 22:25:59,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:26:02,260 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-28 22:26:03,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-28 22:26:03,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:26:03,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:26:03,907 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-28 22:26:05,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:26:07,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-28 22:26:08,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:26:09,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:26:11,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:11,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=161520.0, ans=0.1 2023-09-28 22:26:14,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:26:14,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:14,581 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-28 22:26:14,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:26:14,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-28 22:26:21,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:24,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:26:24,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-28 22:26:24,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:26:25,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-28 22:26:27,624 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.929e+02 2.317e+02 2.702e+02 3.273e+02 4.611e+02, threshold=5.405e+02, percent-clipped=0.0 2023-09-28 22:26:29,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:32,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:26:32,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:26:34,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:26:34,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:26:34,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:26:35,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:26:37,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:26:38,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:26:39,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:26:40,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:40,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-28 22:26:41,942 INFO [train.py:1039] (3/4) Epoch 5, batch 3000, loss[loss=0.2166, simple_loss=0.2966, pruned_loss=0.0683, over 24329.00 frames. ], tot_loss[loss=0.2493, simple_loss=0.3079, pruned_loss=0.09538, over 4711384.96 frames. ], batch size: 74, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:26:41,942 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 22:26:57,277 INFO [train.py:1071] (3/4) Epoch 5, validation: loss=0.3788, simple_loss=0.3301, pruned_loss=0.2137, over 1125622.00 frames. 2023-09-28 22:26:57,278 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 22:26:57,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:26:59,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:00,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:27:02,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=161653.33333333334, ans=0.125 2023-09-28 22:27:04,958 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-28 22:27:05,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-28 22:27:05,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-09-28 22:27:07,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:27:07,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:27:08,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-28 22:27:08,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:12,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:27:22,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:27:30,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-28 22:27:30,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=161786.66666666666, ans=0.125 2023-09-28 22:27:32,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:27:35,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:27:35,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:27:37,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:27:39,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:39,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-28 22:27:42,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-28 22:27:43,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:27:45,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:27:46,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:27:46,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:48,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:27:48,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:27:51,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.81 vs. limit=15.0 2023-09-28 22:27:52,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:27:52,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:27:52,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:27:55,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:27:57,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-28 22:27:59,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:28:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:01,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:28:04,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:04,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:04,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=161920.0, ans=0.0 2023-09-28 22:28:06,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-28 22:28:08,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-28 22:28:08,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:08,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-28 22:28:08,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=161920.0, ans=0.05 2023-09-28 22:28:09,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:28:11,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-28 22:28:14,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:17,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:28:17,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-28 22:28:18,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=161986.66666666666, ans=0.125 2023-09-28 22:28:19,145 INFO [train.py:1039] (3/4) Epoch 5, batch 3050, loss[loss=0.2534, simple_loss=0.3167, pruned_loss=0.09504, over 23741.00 frames. ], tot_loss[loss=0.2504, simple_loss=0.3089, pruned_loss=0.09589, over 4719392.81 frames. ], batch size: 85, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:28:19,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-28 22:28:19,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:28:19,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=161986.66666666666, ans=0.2 2023-09-28 22:28:20,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:28:20,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:28:20,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:28:21,587 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.65 vs. limit=15.0 2023-09-28 22:28:22,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:22,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:28:25,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-28 22:28:27,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:28:28,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:30,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:28:32,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.49 vs. limit=10.0 2023-09-28 22:28:33,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:38,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-28 22:28:45,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-28 22:28:45,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-28 22:28:45,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:28:51,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:28:52,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:28:52,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:54,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:28:57,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:28:57,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:28:57,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:28:58,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:28:58,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:00,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:00,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=162120.0, ans=0.0 2023-09-28 22:29:02,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:03,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:05,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-28 22:29:07,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:29:07,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:29:10,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:29:11,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:29:11,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:11,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:14,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=162186.66666666666, ans=0.0 2023-09-28 22:29:18,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:29:18,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:25,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:25,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:29:25,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:29:26,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.41 vs. limit=6.0 2023-09-28 22:29:27,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:27,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:29:28,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.152e+02 2.476e+02 2.829e+02 3.891e+02, threshold=4.952e+02, percent-clipped=0.0 2023-09-28 22:29:28,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:29:30,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-28 22:29:31,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:29:31,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:29:33,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-28 22:29:34,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:40,978 INFO [train.py:1039] (3/4) Epoch 5, batch 3100, loss[loss=0.24, simple_loss=0.2806, pruned_loss=0.09968, over 23500.00 frames. ], tot_loss[loss=0.2509, simple_loss=0.309, pruned_loss=0.0964, over 4712645.91 frames. ], batch size: 256, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:29:43,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:29:44,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:29:47,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 22:29:49,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-28 22:29:51,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-28 22:29:53,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-28 22:29:53,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:29:58,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:29:58,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:00,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:30:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:07,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-28 22:30:12,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:30:13,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:15,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:15,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:30:15,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:30:18,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:30:18,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-28 22:30:18,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:30:20,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:20,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-28 22:30:22,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:30:25,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:30:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-28 22:30:27,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-28 22:30:31,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:31,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:30:34,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:34,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:35,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:30:37,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:30:37,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:30:37,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=162520.0, ans=0.1 2023-09-28 22:30:38,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:30:38,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:30:38,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:38,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 22:30:43,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=162520.0, ans=0.125 2023-09-28 22:30:44,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:30:46,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-28 22:30:49,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:30:49,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-28 22:30:51,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:30:51,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:30:53,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-28 22:31:04,535 INFO [train.py:1039] (3/4) Epoch 5, batch 3150, loss[loss=0.232, simple_loss=0.2973, pruned_loss=0.08336, over 24671.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3073, pruned_loss=0.09542, over 4716866.45 frames. ], batch size: 65, lr: 1.98e-02, grad_scale: 16.0 2023-09-28 22:31:04,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-28 22:31:08,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:08,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:10,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:31:10,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:31:10,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-28 22:31:11,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:11,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:31:13,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-28 22:31:15,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:16,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-28 22:31:18,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-28 22:31:18,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:31:20,106 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-28 22:31:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-28 22:31:24,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-28 22:31:25,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-28 22:31:25,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-28 22:31:25,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:26,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:31:30,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-28 22:31:31,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:31,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:31:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:36,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:31:40,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-28 22:31:41,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.59 vs. limit=22.5 2023-09-28 22:31:42,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:31:43,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:31:45,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:31:45,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-28 22:31:48,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-28 22:31:49,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:31:50,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:31:51,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:31:51,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:31:51,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:31:53,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:31:53,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:31:54,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-28 22:31:54,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:31:54,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:31:57,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:31:57,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:31:57,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-28 22:31:59,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:01,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-28 22:32:01,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:03,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-28 22:32:05,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-28 22:32:06,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:32:07,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=162853.33333333334, ans=0.125 2023-09-28 22:32:08,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:09,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-28 22:32:09,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 22:32:11,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:32:13,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:32:15,468 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.257e+02 2.534e+02 2.930e+02 4.234e+02, threshold=5.067e+02, percent-clipped=0.0 2023-09-28 22:32:15,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:32:21,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:32:21,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:22,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.78 vs. limit=10.0 2023-09-28 22:32:24,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-28 22:32:27,684 INFO [train.py:1039] (3/4) Epoch 5, batch 3200, loss[loss=0.2476, simple_loss=0.3162, pruned_loss=0.08953, over 24470.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3069, pruned_loss=0.09527, over 4713780.34 frames. ], batch size: 66, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:32:30,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:32:30,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-28 22:32:34,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:36,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:32:36,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-28 22:32:39,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:32:44,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:32:44,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=163053.33333333334, ans=0.125 2023-09-28 22:32:48,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:32:58,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:32:58,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=163053.33333333334, ans=0.125 2023-09-28 22:33:00,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=163120.0, ans=0.0 2023-09-28 22:33:05,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.04 vs. limit=22.5 2023-09-28 22:33:08,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-28 22:33:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:33:11,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-28 22:33:11,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=163120.0, ans=0.2 2023-09-28 22:33:13,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:33:15,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.55 vs. limit=15.0 2023-09-28 22:33:16,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:33:16,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:33:17,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:33:22,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-28 22:33:24,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-28 22:33:26,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-28 22:33:30,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-28 22:33:33,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:33:39,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:39,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 22:33:39,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:33:40,054 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-28 22:33:40,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:33:45,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:33:47,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-28 22:33:48,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-28 22:33:48,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-28 22:33:49,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-28 22:33:50,384 INFO [train.py:1039] (3/4) Epoch 5, batch 3250, loss[loss=0.2467, simple_loss=0.3021, pruned_loss=0.0957, over 23577.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3073, pruned_loss=0.09555, over 4712785.05 frames. ], batch size: 256, lr: 1.98e-02, grad_scale: 32.0 2023-09-28 22:33:52,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:33:53,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:33:53,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-28 22:33:55,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:33:55,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:33:56,800 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-28 22:34:02,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:34:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:34:13,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-28 22:34:13,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:14,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:34:14,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:16,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:17,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:34:20,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:20,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:34:20,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:22,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:22,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:34:22,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=163453.33333333334, ans=0.09899494936611666 2023-09-28 22:34:25,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:26,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:34:28,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:29,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:34:30,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:34:32,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:34:32,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:34:35,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=163453.33333333334, ans=0.2 2023-09-28 22:34:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-28 22:34:40,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:34:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:34:41,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:34:43,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:34:48,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:34:54,184 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.95 vs. limit=15.0 2023-09-28 22:34:54,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:34:55,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:55,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-28 22:34:55,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:34:55,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:34:57,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:34:59,772 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.181e+02 2.539e+02 2.910e+02 4.275e+02, threshold=5.078e+02, percent-clipped=0.0 2023-09-28 22:34:59,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-28 22:35:00,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-28 22:35:00,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:35:01,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:01,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:03,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-28 22:35:03,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:35:08,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:08,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:11,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-28 22:35:11,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:13,131 INFO [train.py:1039] (3/4) Epoch 5, batch 3300, loss[loss=0.2556, simple_loss=0.3022, pruned_loss=0.1046, over 23761.00 frames. ], tot_loss[loss=0.2489, simple_loss=0.3072, pruned_loss=0.09526, over 4719626.54 frames. ], batch size: 212, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:35:13,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:35:13,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-28 22:35:15,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=163653.33333333334, ans=0.015 2023-09-28 22:35:16,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:35:16,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-28 22:35:19,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-28 22:35:19,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-28 22:35:19,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:22,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:35:24,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:35:24,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:27,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 22:35:27,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:35:31,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:33,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:35:35,471 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.47 vs. limit=22.5 2023-09-28 22:35:36,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-28 22:35:37,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:35:37,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:35:40,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:40,263 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-28 22:35:40,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=163720.0, ans=0.0 2023-09-28 22:35:41,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:35:41,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:35:43,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:35:43,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:35:43,485 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-28 22:35:47,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:35:47,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:35:50,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:50,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-28 22:35:50,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-28 22:35:51,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:35:51,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:35:55,058 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-28 22:35:57,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-28 22:35:58,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:00,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-28 22:36:01,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:03,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-28 22:36:05,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:07,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=163853.33333333334, ans=0.125 2023-09-28 22:36:08,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:08,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:08,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:36:09,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:36:11,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:36:11,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:13,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:36:15,335 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-28 22:36:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-28 22:36:17,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:36:18,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:36:18,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:20,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:36:20,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:21,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=163920.0, ans=0.125 2023-09-28 22:36:21,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=163920.0, ans=0.125 2023-09-28 22:36:22,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:36:23,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:23,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:36:23,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:36:26,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:36:28,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-28 22:36:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:30,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:33,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:36:33,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:36:35,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:36,541 INFO [train.py:1039] (3/4) Epoch 5, batch 3350, loss[loss=0.284, simple_loss=0.3436, pruned_loss=0.1122, over 23975.00 frames. ], tot_loss[loss=0.2487, simple_loss=0.3074, pruned_loss=0.09504, over 4723741.59 frames. ], batch size: 86, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:36:36,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:36:36,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:37,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=163986.66666666666, ans=0.0 2023-09-28 22:36:41,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:36:41,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=163986.66666666666, ans=0.125 2023-09-28 22:36:43,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:36:44,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:36:47,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:36:50,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:36:51,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:51,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:36:54,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-28 22:36:56,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=164053.33333333334, ans=0.125 2023-09-28 22:36:58,146 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-28 22:36:58,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:36:58,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=164053.33333333334, ans=0.125 2023-09-28 22:37:01,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-28 22:37:01,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-28 22:37:01,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:37:01,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:37:03,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:04,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-28 22:37:04,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:04,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:37:06,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:08,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:08,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:37:14,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:17,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:22,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:37:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:37:24,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:24,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:30,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-28 22:37:30,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=164186.66666666666, ans=0.125 2023-09-28 22:37:31,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:37:31,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-28 22:37:32,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:37:34,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-28 22:37:34,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:35,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:37:39,426 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:37:42,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:42,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-28 22:37:44,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:37:46,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.765e+02 2.271e+02 2.616e+02 3.188e+02 4.875e+02, threshold=5.232e+02, percent-clipped=0.0 2023-09-28 22:37:46,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:37:47,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:37:50,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:37:53,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-28 22:37:54,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:37:54,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164253.33333333334, ans=0.1 2023-09-28 22:37:55,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:37:56,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:37:57,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-28 22:37:58,969 INFO [train.py:1039] (3/4) Epoch 5, batch 3400, loss[loss=0.2697, simple_loss=0.3415, pruned_loss=0.09895, over 24563.00 frames. ], tot_loss[loss=0.2508, simple_loss=0.3093, pruned_loss=0.09618, over 4721375.28 frames. ], batch size: 71, lr: 1.97e-02, grad_scale: 32.0 2023-09-28 22:37:59,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:37:59,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-28 22:38:02,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:38:02,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-28 22:38:03,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.90 vs. limit=22.5 2023-09-28 22:38:03,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:38:05,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-28 22:38:07,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=164320.0, ans=0.125 2023-09-28 22:38:10,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-28 22:38:10,479 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-28 22:38:10,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:15,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:38:15,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:38:15,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:17,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:38:22,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:22,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-28 22:38:27,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:38:31,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:38:32,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:32,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-28 22:38:37,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=164453.33333333334, ans=0.125 2023-09-28 22:38:38,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:38:43,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-28 22:38:49,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:38:51,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-28 22:38:52,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:38:52,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=164520.0, ans=0.125 2023-09-28 22:38:53,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:38:53,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:38:53,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:38:58,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:39:01,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:39:01,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:39:07,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:10,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-28 22:39:16,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:39:21,482 INFO [train.py:1039] (3/4) Epoch 5, batch 3450, loss[loss=0.2421, simple_loss=0.3118, pruned_loss=0.08615, over 24431.00 frames. ], tot_loss[loss=0.2495, simple_loss=0.3089, pruned_loss=0.09507, over 4733021.67 frames. ], batch size: 69, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:39:21,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-28 22:39:25,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-28 22:39:27,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:39:28,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:39:28,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-28 22:39:31,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:39:34,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=164653.33333333334, ans=0.04949747468305833 2023-09-28 22:39:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:39:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:39:39,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:39:41,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:39:41,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:39:43,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.04 vs. limit=12.0 2023-09-28 22:39:45,632 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.11 vs. limit=15.0 2023-09-28 22:39:48,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-28 22:39:54,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-28 22:39:54,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 22:39:54,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:39:57,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:00,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=164786.66666666666, ans=15.0 2023-09-28 22:40:03,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-28 22:40:03,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=164786.66666666666, ans=0.04949747468305833 2023-09-28 22:40:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:40:09,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:09,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:40:11,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-28 22:40:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:40:15,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-28 22:40:15,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:16,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:40:18,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:40:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-28 22:40:21,706 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 22:40:25,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:40:26,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=164920.0, ans=0.0 2023-09-28 22:40:28,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.89 vs. limit=15.0 2023-09-28 22:40:29,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:40:31,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:33,101 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.242e+02 2.572e+02 2.948e+02 4.937e+02, threshold=5.144e+02, percent-clipped=0.0 2023-09-28 22:40:34,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:39,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:40:40,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:40:40,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:40:40,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:40:45,246 INFO [train.py:1039] (3/4) Epoch 5, batch 3500, loss[loss=0.2637, simple_loss=0.3104, pruned_loss=0.1085, over 23705.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3071, pruned_loss=0.09465, over 4708901.27 frames. ], batch size: 149, lr: 1.97e-02, grad_scale: 16.0 2023-09-28 22:40:45,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:40:49,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=164986.66666666666, ans=0.1 2023-09-28 22:40:50,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:40:50,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-28 22:40:50,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=164986.66666666666, ans=0.05 2023-09-28 22:40:52,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 22:40:52,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=164986.66666666666, ans=0.0 2023-09-28 22:40:56,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:41:00,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:41:01,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-28 22:41:06,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:41:07,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:41:10,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:41:10,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:10,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:41:11,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:11,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:13,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-28 22:41:16,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:16,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:41:19,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:19,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=165120.0, ans=0.0 2023-09-28 22:41:21,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=165120.0, ans=0.125 2023-09-28 22:41:22,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-28 22:41:22,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:41:25,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=165120.0, ans=0.125 2023-09-28 22:41:26,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:41:29,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:41:29,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:31,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:41:31,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:32,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-28 22:41:34,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-28 22:41:34,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-28 22:41:35,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:41:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:39,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:41:39,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:41:44,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:41:44,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:41:51,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:41:53,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-28 22:41:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-28 22:41:53,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:41:56,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:41:56,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:41:56,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:41:59,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-28 22:42:01,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:42:03,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:42:04,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-28 22:42:06,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-28 22:42:07,725 INFO [train.py:1039] (3/4) Epoch 5, batch 3550, loss[loss=0.2371, simple_loss=0.311, pruned_loss=0.08156, over 24679.00 frames. ], tot_loss[loss=0.2461, simple_loss=0.305, pruned_loss=0.09365, over 4708383.09 frames. ], batch size: 73, lr: 1.96e-02, grad_scale: 16.0 2023-09-28 22:42:08,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:09,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:42:09,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:11,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:14,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:42:22,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.17 vs. limit=22.5 2023-09-28 22:42:25,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:26,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 22:42:27,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=165386.66666666666, ans=0.1 2023-09-28 22:42:28,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:30,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:42:32,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:33,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:42:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:42:35,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=165386.66666666666, ans=0.125 2023-09-28 22:42:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:36,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:42:37,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:37,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:42:38,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:42:43,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:42:44,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:42:46,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:42:46,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:42:46,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:42:46,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-28 22:42:46,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:50,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:42:52,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-28 22:42:54,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=165453.33333333334, ans=0.2 2023-09-28 22:42:57,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:42:57,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:42:59,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:02,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-28 22:43:02,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:43:04,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-28 22:43:04,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:43:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:43:07,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:43:11,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-28 22:43:12,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:43:17,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-28 22:43:19,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:20,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.780e+02 2.191e+02 2.567e+02 2.914e+02 4.741e+02, threshold=5.134e+02, percent-clipped=0.0 2023-09-28 22:43:21,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=165586.66666666666, ans=0.2 2023-09-28 22:43:24,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:43:28,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-28 22:43:30,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=165586.66666666666, ans=0.125 2023-09-28 22:43:32,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=165653.33333333334, ans=0.0 2023-09-28 22:43:33,470 INFO [train.py:1039] (3/4) Epoch 5, batch 3600, loss[loss=0.2441, simple_loss=0.2992, pruned_loss=0.09453, over 23513.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.305, pruned_loss=0.09404, over 4707881.28 frames. ], batch size: 134, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:43:35,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-28 22:43:35,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:43:36,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:43:36,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:43:38,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:43:43,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:43:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:45,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=165653.33333333334, ans=0.0 2023-09-28 22:43:46,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:43:46,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:43:48,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:48,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-28 22:43:51,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:43:54,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:43:56,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:43:59,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:01,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:44:01,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:44:03,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-28 22:44:04,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:44:08,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:44:08,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:44:11,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:14,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:44:14,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:14,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-28 22:44:22,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:24,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:44:25,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-28 22:44:30,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:44:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:37,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:45,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:44:45,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:44:45,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-28 22:44:46,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-28 22:44:47,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-28 22:44:50,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:44:50,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:44:51,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-28 22:44:51,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:44:53,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:44:53,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:44:55,053 INFO [train.py:1039] (3/4) Epoch 5, batch 3650, loss[loss=0.2441, simple_loss=0.2995, pruned_loss=0.09437, over 23733.00 frames. ], tot_loss[loss=0.2471, simple_loss=0.3055, pruned_loss=0.09433, over 4715146.64 frames. ], batch size: 164, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:44:55,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-28 22:44:55,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-28 22:44:58,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:44:59,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-28 22:45:03,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=165986.66666666666, ans=0.2 2023-09-28 22:45:04,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-28 22:45:06,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:45:09,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-28 22:45:10,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-28 22:45:15,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:15,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-28 22:45:15,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:45:18,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-28 22:45:20,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:45:20,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-28 22:45:21,388 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.91 vs. limit=15.0 2023-09-28 22:45:21,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-28 22:45:21,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:23,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-28 22:45:25,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:45:25,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:45:25,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:28,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:45:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-28 22:45:33,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-28 22:45:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:45:34,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-28 22:45:36,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:36,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:45:36,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=166120.0, ans=0.125 2023-09-28 22:45:38,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.81 vs. limit=15.0 2023-09-28 22:45:42,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:45:43,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:44,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:45:46,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:45:46,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:45:50,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:45:53,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:45:54,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.13 vs. limit=15.0 2023-09-28 22:45:54,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:45:54,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:45:56,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:45:56,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:45:58,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:02,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=166253.33333333334, ans=0.1 2023-09-28 22:46:03,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=166253.33333333334, ans=0.1 2023-09-28 22:46:06,460 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.698e+02 2.312e+02 2.641e+02 2.987e+02 4.263e+02, threshold=5.283e+02, percent-clipped=0.0 2023-09-28 22:46:06,566 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-28 22:46:11,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:11,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:12,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:46:12,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:12,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-28 22:46:13,705 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.87 vs. limit=10.0 2023-09-28 22:46:14,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:17,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-28 22:46:17,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:18,620 INFO [train.py:1039] (3/4) Epoch 5, batch 3700, loss[loss=0.2539, simple_loss=0.3096, pruned_loss=0.09909, over 23289.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3066, pruned_loss=0.0949, over 4709468.83 frames. ], batch size: 105, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:46:18,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:46:21,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:46:21,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:46:26,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:26,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-28 22:46:26,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:46:27,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 22:46:28,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 22:46:32,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 22:46:35,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:46:35,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 22:46:37,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:46:38,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:46:40,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:46:41,981 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-28 22:46:49,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=166386.66666666666, ans=0.125 2023-09-28 22:46:50,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:46:51,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 22:46:51,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:46:53,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-28 22:46:53,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:46:58,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:46:58,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-28 22:47:00,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:01,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:47:03,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:04,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:47:08,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:47:10,218 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.17 vs. limit=12.0 2023-09-28 22:47:13,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:47:13,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-28 22:47:14,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:47:14,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-28 22:47:19,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:47:19,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:47:22,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:23,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-28 22:47:26,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:47:26,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-28 22:47:26,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:26,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:47:30,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:47:32,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-28 22:47:33,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-28 22:47:33,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:47:33,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:34,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=166586.66666666666, ans=0.125 2023-09-28 22:47:37,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:47:37,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:47:40,415 INFO [train.py:1039] (3/4) Epoch 5, batch 3750, loss[loss=0.2527, simple_loss=0.3016, pruned_loss=0.1019, over 23521.00 frames. ], tot_loss[loss=0.2501, simple_loss=0.3085, pruned_loss=0.09586, over 4702993.50 frames. ], batch size: 134, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:47:40,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:47:42,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:47:42,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=166653.33333333334, ans=0.0 2023-09-28 22:47:44,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=166653.33333333334, ans=0.5 2023-09-28 22:47:45,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:47:47,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-28 22:47:47,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:47:49,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=166653.33333333334, ans=0.0 2023-09-28 22:47:50,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-28 22:47:50,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-28 22:47:52,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:47:53,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:47:55,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:47:59,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:02,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-28 22:48:03,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:48:05,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:48:10,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:11,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-28 22:48:12,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:15,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:16,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:48:20,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-28 22:48:22,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-28 22:48:23,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:48:25,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:48:25,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:48:32,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:32,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-28 22:48:37,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-28 22:48:40,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:48:41,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:48:42,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:48:46,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:48:48,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=166920.0, ans=0.125 2023-09-28 22:48:49,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 22:48:51,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-28 22:48:52,782 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.387e+02 2.666e+02 3.325e+02 5.060e+02, threshold=5.333e+02, percent-clipped=0.0 2023-09-28 22:48:53,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:48:54,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:48:58,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-28 22:48:58,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=166920.0, ans=0.0 2023-09-28 22:49:04,269 INFO [train.py:1039] (3/4) Epoch 5, batch 3800, loss[loss=0.2519, simple_loss=0.3232, pruned_loss=0.09031, over 24439.00 frames. ], tot_loss[loss=0.252, simple_loss=0.3092, pruned_loss=0.09737, over 4679311.99 frames. ], batch size: 69, lr: 1.96e-02, grad_scale: 32.0 2023-09-28 22:49:07,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:49:11,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:13,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 22:49:13,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-28 22:49:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:16,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:18,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:49:22,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 22:49:22,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:22,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:49:25,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:49:27,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:49:27,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:27,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-28 22:49:31,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-28 22:49:32,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:49:33,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:49:35,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:49:37,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:49:38,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:49:38,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:41,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:49:41,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:49:48,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:49:48,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-28 22:49:50,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:49:58,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:04,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:50:06,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-28 22:50:10,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-28 22:50:10,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:10,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=167253.33333333334, ans=0.125 2023-09-28 22:50:13,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:50:13,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:14,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-28 22:50:19,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-28 22:50:19,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-28 22:50:19,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=167253.33333333334, ans=0.2 2023-09-28 22:50:20,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:20,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:50:25,455 INFO [train.py:1039] (3/4) Epoch 5, batch 3850, loss[loss=0.2257, simple_loss=0.2916, pruned_loss=0.0799, over 24334.00 frames. ], tot_loss[loss=0.2492, simple_loss=0.3071, pruned_loss=0.09564, over 4690851.82 frames. ], batch size: 61, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:50:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:50:25,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:50:30,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=167320.0, ans=0.125 2023-09-28 22:50:31,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:50:32,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-28 22:50:34,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:50:34,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:37,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 22:50:38,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.46 vs. limit=22.5 2023-09-28 22:50:39,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:50:41,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-28 22:50:43,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-28 22:50:49,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:49,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=167386.66666666666, ans=0.0 2023-09-28 22:50:52,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:50:54,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:50:54,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:50:59,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:50:59,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:51:01,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:01,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:51:01,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:04,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:06,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:06,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:51:07,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-28 22:51:07,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-28 22:51:09,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:09,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:11,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:12,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:12,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-28 22:51:17,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-28 22:51:19,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:20,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-28 22:51:21,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=167520.0, ans=0.125 2023-09-28 22:51:22,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-28 22:51:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:30,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:51:31,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=167586.66666666666, ans=0.125 2023-09-28 22:51:35,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:35,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-28 22:51:37,561 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.762e+02 2.127e+02 2.578e+02 3.001e+02 5.626e+02, threshold=5.156e+02, percent-clipped=1.0 2023-09-28 22:51:37,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-28 22:51:40,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:40,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:45,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:51:45,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 22:51:45,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:46,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:51:46,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-28 22:51:48,210 INFO [train.py:1039] (3/4) Epoch 5, batch 3900, loss[loss=0.2521, simple_loss=0.3145, pruned_loss=0.09481, over 23398.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3061, pruned_loss=0.09455, over 4706131.51 frames. ], batch size: 93, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:51:48,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:51:48,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-28 22:51:50,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:50,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:52,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:51:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:53,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:51:55,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:51:55,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:51:56,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:51:56,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-28 22:51:56,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:51:59,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:01,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:01,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:52:02,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:52:03,949 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.68 vs. limit=15.0 2023-09-28 22:52:04,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:52:04,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:04,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=167720.0, ans=0.0 2023-09-28 22:52:08,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:52:09,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-28 22:52:09,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:11,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-28 22:52:13,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:52:13,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-28 22:52:16,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-28 22:52:21,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:21,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:52:21,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:52:22,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:52:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:52:26,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=167786.66666666666, ans=0.0 2023-09-28 22:52:29,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:52:31,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:52:31,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:52:32,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:52:36,600 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.91 vs. limit=22.5 2023-09-28 22:52:38,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:52:38,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:52:47,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 22:52:49,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:52:59,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:02,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:02,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-28 22:53:02,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-28 22:53:02,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-28 22:53:05,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-28 22:53:07,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:53:08,225 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.90 vs. limit=6.0 2023-09-28 22:53:08,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-28 22:53:10,458 INFO [train.py:1039] (3/4) Epoch 5, batch 3950, loss[loss=0.2568, simple_loss=0.3109, pruned_loss=0.1013, over 23792.00 frames. ], tot_loss[loss=0.2475, simple_loss=0.3058, pruned_loss=0.09458, over 4709643.49 frames. ], batch size: 149, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:53:16,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:53:17,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-28 22:53:17,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:53:19,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:53:21,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:53:27,999 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-28 22:53:28,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:28,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-28 22:53:28,243 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-28 22:53:29,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:53:33,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:34,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.36 vs. limit=15.0 2023-09-28 22:53:34,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:53:34,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:53:35,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff3.min_abs, batch_count=168053.33333333334, ans=0.2 2023-09-28 22:53:36,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-28 22:53:39,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:53:39,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 22:53:39,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 22:53:41,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 22:53:41,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=168053.33333333334, ans=0.07 2023-09-28 22:53:42,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-28 22:53:45,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=168120.0, ans=0.125 2023-09-28 22:53:56,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:53:56,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:53:58,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=168120.0, ans=0.125 2023-09-28 22:54:00,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-28 22:54:07,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-28 22:54:07,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-28 22:54:07,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=168186.66666666666, ans=0.1 2023-09-28 22:54:08,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:54:08,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:54:17,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-28 22:54:17,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-28 22:54:19,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:54:20,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:54:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-28 22:54:21,767 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.253e+02 2.651e+02 3.133e+02 5.052e+02, threshold=5.303e+02, percent-clipped=0.0 2023-09-28 22:54:22,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.76 vs. limit=15.0 2023-09-28 22:54:24,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:54:25,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:54:30,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-28 22:54:33,571 INFO [train.py:1039] (3/4) Epoch 5, batch 4000, loss[loss=0.2246, simple_loss=0.2905, pruned_loss=0.07936, over 24290.00 frames. ], tot_loss[loss=0.2476, simple_loss=0.3063, pruned_loss=0.0945, over 4711931.94 frames. ], batch size: 61, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:54:40,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:47,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:51,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:54:53,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:54:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:54:54,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-28 22:54:54,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:54:56,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-28 22:54:56,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:54:56,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-28 22:54:58,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:01,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 22:55:02,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=168386.66666666666, ans=0.0 2023-09-28 22:55:03,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:03,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:55:03,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:03,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-28 22:55:04,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:55:06,471 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-28 22:55:07,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:55:09,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:13,013 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-28 22:55:13,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 22:55:13,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:22,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-28 22:55:22,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:55:23,792 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.97 vs. limit=6.0 2023-09-28 22:55:24,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=168520.0, ans=0.125 2023-09-28 22:55:25,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:55:25,818 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-28 22:55:27,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:55:27,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-28 22:55:27,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:55:28,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:30,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-28 22:55:32,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-28 22:55:32,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-28 22:55:32,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:55:34,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-28 22:55:34,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:55:35,982 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-28 22:55:36,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=168520.0, ans=0.1 2023-09-28 22:55:40,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=168586.66666666666, ans=0.1 2023-09-28 22:55:41,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 22:55:43,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=168586.66666666666, ans=0.0 2023-09-28 22:55:44,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-28 22:55:46,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=168586.66666666666, ans=0.2 2023-09-28 22:55:49,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 22:55:49,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:55:51,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:55:56,172 INFO [train.py:1039] (3/4) Epoch 5, batch 4050, loss[loss=0.2592, simple_loss=0.325, pruned_loss=0.0967, over 24002.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3079, pruned_loss=0.09547, over 4710013.09 frames. ], batch size: 80, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:55:56,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:55:57,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-28 22:55:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-28 22:55:59,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:55:59,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=168653.33333333334, ans=0.0 2023-09-28 22:56:01,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:02,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-28 22:56:04,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:06,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:09,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 22:56:13,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:13,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-28 22:56:16,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 22:56:16,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:56:20,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:21,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-28 22:56:24,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 22:56:27,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-28 22:56:27,373 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-28 22:56:30,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:56:36,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-28 22:56:38,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:56:41,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:44,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:56:46,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:56:46,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:56:48,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=168853.33333333334, ans=0.0 2023-09-28 22:56:50,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-28 22:56:54,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-28 22:56:54,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 22:56:56,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:56:56,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-28 22:57:02,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:57:08,225 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.148e+02 2.565e+02 3.123e+02 5.245e+02, threshold=5.130e+02, percent-clipped=0.0 2023-09-28 22:57:08,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-28 22:57:08,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:08,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 22:57:12,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-28 22:57:12,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-28 22:57:12,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:15,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:15,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=168920.0, ans=0.125 2023-09-28 22:57:16,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:16,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 22:57:18,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=168986.66666666666, ans=0.125 2023-09-28 22:57:20,083 INFO [train.py:1039] (3/4) Epoch 5, batch 4100, loss[loss=0.2042, simple_loss=0.2721, pruned_loss=0.0681, over 24335.00 frames. ], tot_loss[loss=0.2494, simple_loss=0.3077, pruned_loss=0.09554, over 4703484.36 frames. ], batch size: 61, lr: 1.95e-02, grad_scale: 32.0 2023-09-28 22:57:23,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-28 22:57:23,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=168986.66666666666, ans=0.09899494936611666 2023-09-28 22:57:23,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=168986.66666666666, ans=0.1 2023-09-28 22:57:24,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-28 22:57:27,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-28 22:57:27,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-28 22:57:29,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:29,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:30,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:57:31,555 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-28 22:57:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:35,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 22:57:35,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:57:37,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 22:57:40,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 22:57:41,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:57:41,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:57:41,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-28 22:57:43,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:57:43,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-28 22:57:43,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:45,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-28 22:57:45,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-28 22:57:48,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:57:51,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-28 22:57:53,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:57:55,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:57:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-28 22:57:57,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-28 22:57:58,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-28 22:57:58,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-28 22:57:58,763 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=10.23 vs. limit=15.0 2023-09-28 22:58:01,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-28 22:58:02,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-28 22:58:04,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 22:58:07,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-28 22:58:08,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:58:09,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:11,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:17,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:20,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:22,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 22:58:25,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=15.0 2023-09-28 22:58:31,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:58:31,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 22:58:34,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 22:58:34,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=169253.33333333334, ans=0.125 2023-09-28 22:58:37,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 22:58:42,146 INFO [train.py:1039] (3/4) Epoch 5, batch 4150, loss[loss=0.2269, simple_loss=0.2892, pruned_loss=0.08234, over 24472.00 frames. ], tot_loss[loss=0.2484, simple_loss=0.307, pruned_loss=0.09487, over 4716770.88 frames. ], batch size: 63, lr: 1.94e-02, grad_scale: 32.0 2023-09-28 22:58:43,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-28 22:58:43,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 22:58:46,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-28 22:58:47,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:58:49,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-28 22:58:50,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:50,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-28 22:58:51,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-28 22:58:52,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-28 22:58:53,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 22:58:54,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=169320.0, ans=0.125 2023-09-28 22:58:57,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 22:58:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:01,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:03,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:03,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-28 22:59:06,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 22:59:06,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 22:59:07,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-28 22:59:13,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:17,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:19,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-28 22:59:22,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-28 22:59:22,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 22:59:23,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-28 22:59:23,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-28 22:59:23,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:24,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=169453.33333333334, ans=0.0 2023-09-28 22:59:26,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:28,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:30,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-28 22:59:33,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:59:34,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 22:59:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-28 22:59:36,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-28 22:59:37,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-28 22:59:40,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 22:59:41,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-28 22:59:44,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:44,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-28 22:59:44,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 22:59:45,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-28 22:59:47,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 22:59:51,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-28 22:59:51,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:51,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 22:59:51,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 22:59:53,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-28 22:59:54,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.840e+02 2.448e+02 2.857e+02 3.478e+02 5.752e+02, threshold=5.715e+02, percent-clipped=2.0 2023-09-28 22:59:54,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 22:59:54,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-28 22:59:54,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 22:59:55,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=169586.66666666666, ans=0.125 2023-09-28 22:59:56,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-28 22:59:57,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-28 22:59:57,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-28 22:59:59,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=169586.66666666666, ans=0.125 2023-09-28 23:00:01,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:00:04,581 INFO [train.py:1039] (3/4) Epoch 5, batch 4200, loss[loss=0.2362, simple_loss=0.2803, pruned_loss=0.0961, over 23689.00 frames. ], tot_loss[loss=0.2479, simple_loss=0.3057, pruned_loss=0.09508, over 4699022.13 frames. ], batch size: 232, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:00:04,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-28 23:00:06,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:00:09,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:11,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:00:11,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:11,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:00:14,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-28 23:00:17,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-28 23:00:17,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:21,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:23,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:00:26,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:00:28,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:00:28,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:28,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-28 23:00:28,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:00:28,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:29,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:00:30,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:00:32,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:00:34,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-28 23:00:34,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:00:39,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:00:41,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:00:44,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:00:46,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:00:47,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:00:47,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-28 23:00:47,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:00:50,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:00:57,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:00:59,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:03,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:01:07,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-28 23:01:11,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:01:15,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:01:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:16,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=169920.0, ans=0.125 2023-09-28 23:01:19,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-28 23:01:22,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=169920.0, ans=0.0 2023-09-28 23:01:26,900 INFO [train.py:1039] (3/4) Epoch 5, batch 4250, loss[loss=0.2451, simple_loss=0.3225, pruned_loss=0.08381, over 24565.00 frames. ], tot_loss[loss=0.2457, simple_loss=0.3039, pruned_loss=0.09377, over 4708966.69 frames. ], batch size: 71, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:01:26,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:01:28,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:01:28,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:01:32,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:01:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-28 23:01:38,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:01:43,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:01:43,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=170053.33333333334, ans=0.1 2023-09-28 23:01:45,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:01:50,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:50,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:53,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:01:53,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:01:53,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=170053.33333333334, ans=0.125 2023-09-28 23:01:54,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:01:58,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:01:59,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.35 vs. limit=22.5 2023-09-28 23:02:01,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:02:01,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:03,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-28 23:02:07,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-28 23:02:07,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:08,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:08,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:02:10,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:02:10,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:02:12,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=170120.0, ans=0.125 2023-09-28 23:02:15,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:02:16,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:02:20,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:20,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=170186.66666666666, ans=0.0 2023-09-28 23:02:23,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:23,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-28 23:02:23,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:02:23,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=170186.66666666666, ans=0.125 2023-09-28 23:02:25,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-28 23:02:26,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:02:26,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:02:29,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:29,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:02:33,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-28 23:02:35,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:02:35,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:02:35,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-28 23:02:37,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=170253.33333333334, ans=0.125 2023-09-28 23:02:39,735 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.222e+02 2.521e+02 2.962e+02 6.093e+02, threshold=5.043e+02, percent-clipped=1.0 2023-09-28 23:02:40,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:02:42,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:02:43,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:02:45,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:02:46,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:48,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:02:49,897 INFO [train.py:1039] (3/4) Epoch 5, batch 4300, loss[loss=0.224, simple_loss=0.2829, pruned_loss=0.0825, over 24434.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3032, pruned_loss=0.09272, over 4703809.03 frames. ], batch size: 58, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:02:49,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:02:49,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-28 23:02:51,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:02:54,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=170320.0, ans=0.5 2023-09-28 23:02:58,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:02:58,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:01,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:03:08,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:03:08,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-28 23:03:09,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:03:12,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:03:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:03:12,152 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-28 23:03:15,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:03:18,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:20,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-28 23:03:21,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:03:21,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-28 23:03:22,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=170453.33333333334, ans=0.0 2023-09-28 23:03:24,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:03:26,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:03:28,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:03:28,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:03:29,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=170453.33333333334, ans=0.0 2023-09-28 23:03:30,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:03:31,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:31,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:03:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-28 23:03:33,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-28 23:03:35,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:03:37,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:37,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=170453.33333333334, ans=0.125 2023-09-28 23:03:38,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:03:38,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:38,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:03:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-28 23:03:38,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-28 23:03:40,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-28 23:03:41,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:03:41,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-28 23:03:42,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-28 23:03:42,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=170520.0, ans=0.125 2023-09-28 23:03:44,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=170520.0, ans=0.1 2023-09-28 23:03:48,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:50,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-28 23:03:52,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:03:52,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:03:52,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:03:55,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-28 23:03:57,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:03:57,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:03:57,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:03:57,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:03:57,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:04:00,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:04:03,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:03,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:05,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:04:07,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=170586.66666666666, ans=0.0 2023-09-28 23:04:10,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-28 23:04:10,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:04:13,723 INFO [train.py:1039] (3/4) Epoch 5, batch 4350, loss[loss=0.2254, simple_loss=0.2837, pruned_loss=0.08351, over 20077.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3051, pruned_loss=0.09416, over 4689206.39 frames. ], batch size: 44, lr: 1.94e-02, grad_scale: 16.0 2023-09-28 23:04:15,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:18,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:22,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:04:22,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:04:25,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=170653.33333333334, ans=0.125 2023-09-28 23:04:27,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:04:29,653 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:04:30,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:04:34,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:04:34,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:04:37,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:04:39,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:04:39,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=170720.0, ans=0.05 2023-09-28 23:04:40,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:04:45,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-28 23:04:48,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:04:50,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:54,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=170786.66666666666, ans=0.0 2023-09-28 23:04:55,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:04:58,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-28 23:05:00,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:05:07,287 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-28 23:05:10,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:10,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:05:12,316 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-28 23:05:12,439 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-28 23:05:12,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:12,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:13,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:05:15,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:05:15,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:05:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:05:18,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-28 23:05:18,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:18,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:18,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:20,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-28 23:05:20,720 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-28 23:05:22,119 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-28 23:05:22,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-28 23:05:25,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:05:25,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:05:25,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:25,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:05:27,193 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 2.177e+02 2.511e+02 2.905e+02 5.033e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-28 23:05:28,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-28 23:05:32,005 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-28 23:05:32,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:35,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:35,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:35,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=170986.66666666666, ans=0.125 2023-09-28 23:05:36,622 INFO [train.py:1039] (3/4) Epoch 5, batch 4400, loss[loss=0.2174, simple_loss=0.2843, pruned_loss=0.07528, over 21589.00 frames. ], tot_loss[loss=0.2484, simple_loss=0.3065, pruned_loss=0.09514, over 4676489.10 frames. ], batch size: 47, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:05:36,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:05:40,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-28 23:05:40,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-28 23:05:41,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-28 23:05:41,993 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-28 23:05:43,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:05:43,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:05:45,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-28 23:05:48,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:05:50,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:05:50,125 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-28 23:05:54,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:05:54,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-28 23:05:56,773 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-28 23:05:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-28 23:06:02,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-28 23:06:02,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-28 23:06:02,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:03,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:03,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:06:05,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:06,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-28 23:06:06,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-28 23:06:07,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:08,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:06:08,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:10,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:11,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:06:11,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-28 23:06:11,989 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-28 23:06:16,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:06:25,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:06:26,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-28 23:06:29,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:06:33,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:06:35,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:06:35,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-28 23:06:37,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:06:37,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:06:37,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:06:37,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:06:42,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-28 23:06:46,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-28 23:06:48,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-28 23:06:48,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:06:48,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-28 23:06:48,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=171253.33333333334, ans=0.125 2023-09-28 23:06:49,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:06:55,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:06:55,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=171253.33333333334, ans=0.0 2023-09-28 23:06:57,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-28 23:07:00,194 INFO [train.py:1039] (3/4) Epoch 5, batch 4450, loss[loss=0.2315, simple_loss=0.2878, pruned_loss=0.08763, over 21241.00 frames. ], tot_loss[loss=0.2482, simple_loss=0.3071, pruned_loss=0.09472, over 4692958.25 frames. ], batch size: 46, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:07:01,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:07:03,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:05,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:07:11,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:11,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:07:15,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:18,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:07:23,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:07:23,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:24,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-28 23:07:24,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:24,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:24,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:07:24,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:07:27,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:07:33,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:35,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:07:36,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:07:37,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:07:40,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:07:42,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-28 23:07:42,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-28 23:07:42,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:07:44,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=171453.33333333334, ans=0.125 2023-09-28 23:07:46,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:46,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-28 23:07:51,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:07:54,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:07:54,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-28 23:07:54,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:07:54,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:07:54,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:07:55,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:07:58,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:08:03,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:08:03,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-28 23:08:05,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:08:08,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:10,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:08:11,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:12,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:08:13,315 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.679e+02 2.374e+02 2.783e+02 3.317e+02 5.756e+02, threshold=5.567e+02, percent-clipped=2.0 2023-09-28 23:08:14,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.42 vs. limit=15.0 2023-09-28 23:08:15,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:08:16,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=171586.66666666666, ans=0.125 2023-09-28 23:08:18,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-28 23:08:20,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:08:23,521 INFO [train.py:1039] (3/4) Epoch 5, batch 4500, loss[loss=0.2285, simple_loss=0.3023, pruned_loss=0.07737, over 24485.00 frames. ], tot_loss[loss=0.2491, simple_loss=0.3075, pruned_loss=0.09529, over 4690752.94 frames. ], batch size: 66, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:08:25,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-28 23:08:26,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-28 23:08:28,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:33,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:08:33,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:08:33,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:08:34,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:08:34,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:34,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:08:45,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=171720.0, ans=0.0 2023-09-28 23:08:47,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:08:47,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=171720.0, ans=0.05 2023-09-28 23:08:48,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:08:48,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=171720.0, ans=0.5 2023-09-28 23:08:52,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:08:52,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:08:55,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:08:57,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=171786.66666666666, ans=0.1 2023-09-28 23:09:01,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:09:06,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:09:11,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:09:13,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.91 vs. limit=10.0 2023-09-28 23:09:14,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:09:14,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-28 23:09:14,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:16,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:09:18,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:09:21,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:09:21,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-28 23:09:21,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:09:21,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:28,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:09:28,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:09:31,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:09:33,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:09:34,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:09:35,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-28 23:09:37,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-28 23:09:38,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-28 23:09:41,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-28 23:09:44,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-28 23:09:46,079 INFO [train.py:1039] (3/4) Epoch 5, batch 4550, loss[loss=0.2386, simple_loss=0.3049, pruned_loss=0.08613, over 24327.00 frames. ], tot_loss[loss=0.2473, simple_loss=0.3055, pruned_loss=0.09455, over 4689799.52 frames. ], batch size: 61, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:09:46,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:09:49,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:09:54,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:09:59,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:10:02,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:10:02,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:02,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:10:02,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:06,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:07,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:10:10,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:12,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-28 23:10:14,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-28 23:10:14,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:10:16,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-28 23:10:18,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=172120.0, ans=0.0 2023-09-28 23:10:20,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-28 23:10:21,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:24,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-28 23:10:25,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=172120.0, ans=0.125 2023-09-28 23:10:27,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:10:29,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=172120.0, ans=0.125 2023-09-28 23:10:30,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:30,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:10:33,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-28 23:10:37,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:40,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:40,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:10:42,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:44,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-28 23:10:44,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-28 23:10:44,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:10:45,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-28 23:10:46,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=172186.66666666666, ans=0.2 2023-09-28 23:10:48,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-28 23:10:48,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:10:49,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:10:49,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:10:51,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:10:51,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:10:52,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:10:54,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-28 23:10:55,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:10:55,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:10:56,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-28 23:10:57,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:10:57,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-28 23:10:58,979 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.022e+02 2.307e+02 2.730e+02 4.696e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-28 23:10:59,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=172253.33333333334, ans=0.1 2023-09-28 23:11:00,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:11:00,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:11:04,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:11:04,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:11:05,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:11:06,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:11:07,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:11:08,879 INFO [train.py:1039] (3/4) Epoch 5, batch 4600, loss[loss=0.2521, simple_loss=0.3106, pruned_loss=0.09679, over 23951.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3048, pruned_loss=0.09428, over 4691576.89 frames. ], batch size: 164, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:11:11,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:12,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:11:16,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:11:16,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:11:16,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:19,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-28 23:11:21,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:11:25,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:11:25,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:27,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:29,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=172386.66666666666, ans=0.125 2023-09-28 23:11:34,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-28 23:11:35,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.68 vs. limit=22.5 2023-09-28 23:11:36,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:39,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:43,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:11:45,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:11:50,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-09-28 23:11:51,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-28 23:11:51,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:11:53,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:11:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:11:58,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:12:00,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=172520.0, ans=0.0 2023-09-28 23:12:02,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:12:02,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=172520.0, ans=0.1 2023-09-28 23:12:05,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-28 23:12:05,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:12:10,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:11,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:12:13,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:13,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-28 23:12:14,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:14,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=172520.0, ans=0.0 2023-09-28 23:12:15,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-28 23:12:15,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:15,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=172586.66666666666, ans=0.2 2023-09-28 23:12:16,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:17,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.67 vs. limit=8.0 2023-09-28 23:12:18,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:18,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:12:20,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:22,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-28 23:12:23,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-28 23:12:23,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-28 23:12:23,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:25,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:25,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:27,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:12:33,419 INFO [train.py:1039] (3/4) Epoch 5, batch 4650, loss[loss=0.2297, simple_loss=0.2936, pruned_loss=0.08296, over 24337.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3042, pruned_loss=0.0935, over 4704144.24 frames. ], batch size: 61, lr: 1.93e-02, grad_scale: 32.0 2023-09-28 23:12:38,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:12:39,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=172653.33333333334, ans=0.0 2023-09-28 23:12:41,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:41,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:43,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:12:43,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:12:43,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:12:45,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:12:48,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-28 23:12:54,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:12:55,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-28 23:12:56,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:12:58,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-28 23:12:58,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:12:58,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-28 23:12:58,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-28 23:12:58,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:12:59,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=15.0 2023-09-28 23:12:59,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:13:03,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:13:03,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:03,737 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-28 23:13:05,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=172786.66666666666, ans=0.125 2023-09-28 23:13:06,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:08,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-28 23:13:09,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:09,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:13:12,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-28 23:13:13,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:13:18,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:13:21,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:29,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:30,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=172853.33333333334, ans=0.1 2023-09-28 23:13:31,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:32,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:13:32,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:13:36,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-28 23:13:36,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-28 23:13:36,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-28 23:13:36,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-28 23:13:39,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:40,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.06 vs. limit=15.0 2023-09-28 23:13:45,877 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.666e+02 2.231e+02 2.488e+02 2.964e+02 5.544e+02, threshold=4.977e+02, percent-clipped=2.0 2023-09-28 23:13:48,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:13:48,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:13:48,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-28 23:13:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:13:49,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:49,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:13:51,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:13:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:13:53,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:13:54,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:13:56,014 INFO [train.py:1039] (3/4) Epoch 5, batch 4700, loss[loss=0.2679, simple_loss=0.3179, pruned_loss=0.1089, over 23807.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3044, pruned_loss=0.09381, over 4698462.02 frames. ], batch size: 179, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:13:59,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:13:59,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:13:59,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:14:01,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:14:01,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:14:01,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=172986.66666666666, ans=0.125 2023-09-28 23:14:02,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-28 23:14:11,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:11,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:14:13,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:14:13,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:15,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:14:16,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=173053.33333333334, ans=0.1 2023-09-28 23:14:19,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-28 23:14:19,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-28 23:14:23,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:23,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:14:24,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:14:26,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:14:34,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:14:36,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-28 23:14:40,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:14:46,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-28 23:14:48,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:14:51,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:14:54,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-28 23:14:54,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:14:59,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:15:01,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-28 23:15:01,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:02,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:06,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:15:06,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:15:08,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-28 23:15:08,152 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-28 23:15:11,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:12,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:12,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-28 23:15:14,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:15:16,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=173253.33333333334, ans=0.0 2023-09-28 23:15:19,184 INFO [train.py:1039] (3/4) Epoch 5, batch 4750, loss[loss=0.2509, simple_loss=0.3198, pruned_loss=0.09104, over 24641.00 frames. ], tot_loss[loss=0.2467, simple_loss=0.3058, pruned_loss=0.09385, over 4700877.49 frames. ], batch size: 68, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:15:19,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-28 23:15:21,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:15:23,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:27,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:15:28,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.62 vs. limit=12.0 2023-09-28 23:15:29,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-28 23:15:29,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:15:34,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-28 23:15:37,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:15:37,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:15:39,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:44,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-28 23:15:48,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.66 vs. limit=15.0 2023-09-28 23:15:49,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:15:51,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-28 23:15:53,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:15:55,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:15:55,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:15:57,736 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-28 23:15:57,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-28 23:16:02,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-28 23:16:05,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:06,377 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.40 vs. limit=15.0 2023-09-28 23:16:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:09,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:16:09,102 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-28 23:16:09,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:12,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:16:14,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:16:14,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=173520.0, ans=0.2 2023-09-28 23:16:17,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-28 23:16:17,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-28 23:16:19,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:16:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:16:19,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:16:20,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:16:20,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-28 23:16:24,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-28 23:16:25,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:27,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:16:27,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-28 23:16:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:16:31,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:16:32,599 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.125e+02 2.370e+02 2.784e+02 4.798e+02, threshold=4.741e+02, percent-clipped=0.0 2023-09-28 23:16:32,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:16:34,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:34,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:16:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:39,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-28 23:16:41,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-28 23:16:42,602 INFO [train.py:1039] (3/4) Epoch 5, batch 4800, loss[loss=0.2509, simple_loss=0.3182, pruned_loss=0.09174, over 24407.00 frames. ], tot_loss[loss=0.2479, simple_loss=0.3066, pruned_loss=0.09455, over 4700473.29 frames. ], batch size: 77, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:16:42,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-28 23:16:45,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:16:45,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:16:48,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-28 23:16:53,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:16:55,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:16:59,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:16:59,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=173720.0, ans=0.0 2023-09-28 23:17:02,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:02,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:02,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-28 23:17:03,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:17:03,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:17:05,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:17:12,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:12,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:12,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:17:16,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:16,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-28 23:17:16,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:17,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:20,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:17:23,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:17:25,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:17:25,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=173786.66666666666, ans=0.125 2023-09-28 23:17:26,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:17:29,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:30,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=173853.33333333334, ans=0.0 2023-09-28 23:17:32,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-28 23:17:32,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-28 23:17:32,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:32,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:17:33,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:17:33,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:33,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:17:34,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:17:35,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:17:38,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:41,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:42,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:17:48,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-28 23:17:48,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:17:48,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:49,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:17:49,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:17:53,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:17:54,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:17:54,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:17:54,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:17:54,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:17:56,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:18:00,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:00,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:00,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:18:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-28 23:18:04,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-28 23:18:04,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:04,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:18:06,204 INFO [train.py:1039] (3/4) Epoch 5, batch 4850, loss[loss=0.2558, simple_loss=0.3019, pruned_loss=0.1048, over 23811.00 frames. ], tot_loss[loss=0.2478, simple_loss=0.3067, pruned_loss=0.0945, over 4704426.10 frames. ], batch size: 164, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:18:06,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:06,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:18:19,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-28 23:18:20,131 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:18:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:27,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:29,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:18:29,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:18:29,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=174053.33333333334, ans=0.0 2023-09-28 23:18:32,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:18:32,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:18:34,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:18:34,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-28 23:18:39,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:18:42,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:18:42,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:18:44,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:18:44,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-28 23:18:46,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:18:46,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:46,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=174120.0, ans=0.125 2023-09-28 23:18:49,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:18:49,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-28 23:18:50,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-28 23:18:53,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:18:58,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=174186.66666666666, ans=0.125 2023-09-28 23:19:01,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:19:02,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-28 23:19:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:19:02,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:19:04,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:19:06,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-28 23:19:06,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:06,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=174186.66666666666, ans=0.1 2023-09-28 23:19:07,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-28 23:19:08,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:10,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:12,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-28 23:19:16,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=174253.33333333334, ans=0.0 2023-09-28 23:19:18,960 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.747e+02 2.375e+02 2.676e+02 3.229e+02 5.316e+02, threshold=5.352e+02, percent-clipped=3.0 2023-09-28 23:19:22,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:19:28,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:19:28,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:30,287 INFO [train.py:1039] (3/4) Epoch 5, batch 4900, loss[loss=0.2252, simple_loss=0.3004, pruned_loss=0.075, over 24431.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3054, pruned_loss=0.09379, over 4700251.60 frames. ], batch size: 69, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:19:32,632 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=12.0 2023-09-28 23:19:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-28 23:19:33,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:19:39,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:19:40,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:42,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:19:42,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=174320.0, ans=0.125 2023-09-28 23:19:44,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-28 23:19:49,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-28 23:19:54,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-28 23:19:55,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-28 23:19:55,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:19:55,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:19:55,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:19:55,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:19:55,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:19:57,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-28 23:20:01,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-28 23:20:01,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:20:03,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:20:05,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:20:08,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:20:08,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:09,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:09,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-28 23:20:11,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:20:12,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:20:12,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-28 23:20:12,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-28 23:20:17,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-28 23:20:19,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:20:21,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:20:22,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:20:22,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:20:22,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:20:22,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:20:22,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-28 23:20:24,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:20:30,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:20:34,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-28 23:20:36,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:20:37,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:20:37,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-28 23:20:43,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=174586.66666666666, ans=0.0 2023-09-28 23:20:44,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:44,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:20:44,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-28 23:20:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:20:46,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:20:46,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:20:46,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=174586.66666666666, ans=0.0 2023-09-28 23:20:52,894 INFO [train.py:1039] (3/4) Epoch 5, batch 4950, loss[loss=0.2294, simple_loss=0.2538, pruned_loss=0.1024, over 18929.00 frames. ], tot_loss[loss=0.245, simple_loss=0.3035, pruned_loss=0.09321, over 4691719.63 frames. ], batch size: 388, lr: 1.92e-02, grad_scale: 32.0 2023-09-28 23:20:53,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:20:53,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:20:54,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:20:54,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-28 23:20:56,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:20:56,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=174653.33333333334, ans=0.0 2023-09-28 23:20:59,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:20:59,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-28 23:21:01,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-28 23:21:01,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-28 23:21:01,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:21:02,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-28 23:21:02,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:02,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:21:03,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:21:03,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:06,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:06,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:21:10,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:21:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:21:12,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:13,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:21:16,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:21:19,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=174720.0, ans=0.125 2023-09-28 23:21:21,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:25,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:21:26,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:27,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-09-28 23:21:28,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:28,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:21:31,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-28 23:21:31,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-28 23:21:33,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:34,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:21:34,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:21:37,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:21:37,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:21:37,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:21:40,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=174786.66666666666, ans=0.1 2023-09-28 23:21:41,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:21:41,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=174853.33333333334, ans=0.2 2023-09-28 23:21:43,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:21:45,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:21:46,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:21:48,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:48,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-28 23:21:49,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:21:51,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:21:55,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:21:56,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:21:56,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:21:56,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:21:57,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:21:57,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:21:58,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.15 vs. limit=15.0 2023-09-28 23:21:59,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:21:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:22:01,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:22:03,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-28 23:22:05,867 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.824e+02 2.215e+02 2.550e+02 3.115e+02 4.856e+02, threshold=5.099e+02, percent-clipped=0.0 2023-09-28 23:22:07,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:12,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-28 23:22:12,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:22:16,383 INFO [train.py:1039] (3/4) Epoch 5, batch 5000, loss[loss=0.2661, simple_loss=0.3366, pruned_loss=0.09776, over 24368.00 frames. ], tot_loss[loss=0.2439, simple_loss=0.3031, pruned_loss=0.09235, over 4703229.36 frames. ], batch size: 77, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:22:21,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:22:21,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:22,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-28 23:22:23,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-28 23:22:26,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:22:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-28 23:22:29,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:22:29,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:22:29,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-28 23:22:29,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=174986.66666666666, ans=0.0 2023-09-28 23:22:30,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.77 vs. limit=22.5 2023-09-28 23:22:31,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:31,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:22:33,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-28 23:22:33,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:33,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:22:35,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-28 23:22:35,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-28 23:22:36,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:22:36,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-28 23:22:36,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:22:38,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:38,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:22:38,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-28 23:22:38,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-28 23:22:39,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-28 23:22:39,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:22:41,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:42,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-28 23:22:42,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:22:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:22:45,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.38 vs. limit=15.0 2023-09-28 23:22:47,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:22:48,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:22:50,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-28 23:22:50,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:22:52,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:22:57,027 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-28 23:23:00,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:23:01,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:23:01,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:04,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-28 23:23:04,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:23:05,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:05,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:07,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-28 23:23:08,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:11,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:23:13,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:19,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-28 23:23:25,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:34,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:23:36,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:36,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:23:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:37,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:23:37,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:23:37,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:39,242 INFO [train.py:1039] (3/4) Epoch 5, batch 5050, loss[loss=0.2592, simple_loss=0.3148, pruned_loss=0.1018, over 23405.00 frames. ], tot_loss[loss=0.2441, simple_loss=0.304, pruned_loss=0.09206, over 4708742.95 frames. ], batch size: 93, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:23:43,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:23:43,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.19 vs. limit=15.0 2023-09-28 23:23:44,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-28 23:23:44,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:23:47,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.57 vs. limit=15.0 2023-09-28 23:23:47,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:23:49,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:23:49,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-28 23:23:50,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:23:50,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:23:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:23:55,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:23:55,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:24:04,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-28 23:24:06,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:24:08,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:08,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-28 23:24:09,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:11,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:11,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:24:11,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:24:11,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-28 23:24:12,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-28 23:24:14,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:16,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=175453.33333333334, ans=0.0 2023-09-28 23:24:17,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:20,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=175453.33333333334, ans=0.0 2023-09-28 23:24:21,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:24:21,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-28 23:24:22,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:25,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-28 23:24:27,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:24:29,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:24:29,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:31,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:24:32,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:24:33,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=175520.0, ans=0.1 2023-09-28 23:24:34,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:24:35,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:35,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:24:35,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:24:36,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-28 23:24:37,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:24:39,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:24:42,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:24:42,647 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-28 23:24:42,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:24:46,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:24:47,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:47,464 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-28 23:24:50,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:24:50,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-28 23:24:50,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:51,916 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.754e+02 2.101e+02 2.649e+02 3.047e+02 4.508e+02, threshold=5.297e+02, percent-clipped=0.0 2023-09-28 23:24:55,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:24:57,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:24:57,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-28 23:24:57,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-28 23:25:00,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:00,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:01,952 INFO [train.py:1039] (3/4) Epoch 5, batch 5100, loss[loss=0.2572, simple_loss=0.3079, pruned_loss=0.1033, over 23300.00 frames. ], tot_loss[loss=0.2443, simple_loss=0.3047, pruned_loss=0.09194, over 4719115.15 frames. ], batch size: 105, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:25:02,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:25:04,312 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-28 23:25:07,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:25:09,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-28 23:25:11,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-28 23:25:11,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:12,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:25:15,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:25:17,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-28 23:25:17,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-28 23:25:20,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:25:22,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:25:25,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:25:29,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-28 23:25:30,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:32,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:25:32,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-28 23:25:35,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:37,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-28 23:25:39,772 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-28 23:25:41,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=175786.66666666666, ans=0.1 2023-09-28 23:25:42,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:25:42,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-28 23:25:42,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-28 23:25:47,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:25:52,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.24 vs. limit=10.0 2023-09-28 23:25:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:25:59,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-28 23:26:00,013 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-28 23:26:00,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=175853.33333333334, ans=0.0 2023-09-28 23:26:01,407 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-28 23:26:03,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-28 23:26:03,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:26:04,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=175853.33333333334, ans=0.0 2023-09-28 23:26:05,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-28 23:26:09,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-28 23:26:11,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-28 23:26:13,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:26:15,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-28 23:26:18,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:26:18,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-28 23:26:24,730 INFO [train.py:1039] (3/4) Epoch 5, batch 5150, loss[loss=0.2498, simple_loss=0.306, pruned_loss=0.0968, over 23735.00 frames. ], tot_loss[loss=0.2456, simple_loss=0.3058, pruned_loss=0.09266, over 4718117.07 frames. ], batch size: 179, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:26:24,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:26:24,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:26:24,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:26:25,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:26:25,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:26:26,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:26:26,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-28 23:26:26,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-28 23:26:27,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=175986.66666666666, ans=0.07 2023-09-28 23:26:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-28 23:26:28,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:26:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-28 23:26:30,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:31,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:26:32,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=175986.66666666666, ans=0.0 2023-09-28 23:26:33,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:35,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:26:40,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:26:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-28 23:26:41,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:26:41,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:26:43,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=176053.33333333334, ans=0.125 2023-09-28 23:26:44,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:26:44,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:26:44,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:26:46,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:26:46,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:26:46,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-28 23:26:46,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:26:46,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=176053.33333333334, ans=0.125 2023-09-28 23:26:48,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:26:49,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.09 vs. limit=15.0 2023-09-28 23:26:50,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:26:52,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-28 23:26:53,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:26:54,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-09-28 23:26:55,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=176053.33333333334, ans=0.125 2023-09-28 23:26:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:27:00,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=176120.0, ans=0.2 2023-09-28 23:27:01,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-28 23:27:05,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:27:11,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:16,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:16,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:19,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-28 23:27:24,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:27:26,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:27:26,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:27:31,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:27:31,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:27:32,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-28 23:27:37,273 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.252e+02 2.481e+02 2.857e+02 3.938e+02, threshold=4.962e+02, percent-clipped=0.0 2023-09-28 23:27:37,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:27:40,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:27:42,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:27:42,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:27:43,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=176253.33333333334, ans=0.125 2023-09-28 23:27:44,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:27:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:27:44,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:27:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:27:47,092 INFO [train.py:1039] (3/4) Epoch 5, batch 5200, loss[loss=0.225, simple_loss=0.2882, pruned_loss=0.08092, over 24445.00 frames. ], tot_loss[loss=0.2459, simple_loss=0.3061, pruned_loss=0.09284, over 4726259.08 frames. ], batch size: 58, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:27:49,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:27:51,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:27:55,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:01,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-28 23:28:01,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:28:02,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:03,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:04,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:28:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:07,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-28 23:28:09,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:28:09,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:12,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-28 23:28:16,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:28:17,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:28:17,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-28 23:28:17,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-28 23:28:22,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-28 23:28:22,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:22,307 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-28 23:28:22,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:28:25,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:25,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:28:27,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-28 23:28:27,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:28:30,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:34,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-28 23:28:34,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-28 23:28:34,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-28 23:28:40,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-28 23:28:40,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:28:44,332 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:28:47,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:28:47,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:28:48,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-28 23:28:48,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:28:48,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:28:48,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:28:50,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:28:54,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:28:55,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:28:58,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:28:59,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=176586.66666666666, ans=0.0 2023-09-28 23:29:00,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:00,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:01,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=176586.66666666666, ans=0.125 2023-09-28 23:29:07,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:07,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-28 23:29:08,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:29:08,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:29:10,860 INFO [train.py:1039] (3/4) Epoch 5, batch 5250, loss[loss=0.2326, simple_loss=0.2693, pruned_loss=0.09793, over 19481.00 frames. ], tot_loss[loss=0.2444, simple_loss=0.3046, pruned_loss=0.09213, over 4734620.72 frames. ], batch size: 388, lr: 1.91e-02, grad_scale: 32.0 2023-09-28 23:29:10,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:11,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:29:11,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:29:13,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=176653.33333333334, ans=0.1 2023-09-28 23:29:14,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:29:17,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:17,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:29:18,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:29:23,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:29:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:29:28,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:29:29,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:29:29,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=176720.0, ans=0.125 2023-09-28 23:29:32,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-28 23:29:32,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:29:34,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:29:36,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.71 vs. limit=12.0 2023-09-28 23:29:44,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=176786.66666666666, ans=0.0 2023-09-28 23:29:59,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=176853.33333333334, ans=0.0 2023-09-28 23:30:16,426 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.843e+02 2.340e+02 2.624e+02 3.163e+02 5.259e+02, threshold=5.248e+02, percent-clipped=2.0 2023-09-28 23:30:24,701 INFO [train.py:1039] (3/4) Epoch 5, batch 5300, loss[loss=0.2127, simple_loss=0.2815, pruned_loss=0.07193, over 24629.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3033, pruned_loss=0.09191, over 4731890.11 frames. ], batch size: 60, lr: 1.90e-02, grad_scale: 32.0 2023-09-28 23:30:27,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=176986.66666666666, ans=0.025 2023-09-28 23:30:29,921 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.18 vs. limit=22.5 2023-09-28 23:30:40,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:30:40,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-28 23:30:40,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-28 23:30:40,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:40,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:40,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:40,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:40,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:40,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:30:40,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:30:41,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:30:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-28 23:30:41,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-28 23:30:41,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-28 23:30:42,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:30:42,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-28 23:30:42,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-28 23:30:42,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:43,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:43,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:43,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:43,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:30:44,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:44,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:30:44,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:44,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:30:44,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:30:44,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:30:44,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:44,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:30:45,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-28 23:30:45,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:30:46,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:30:46,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-28 23:30:46,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-28 23:30:46,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:30:46,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:30:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-28 23:30:46,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-28 23:30:46,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:47,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:30:48,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:30:48,284 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-28 23:30:48,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-28 23:30:48,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:30:48,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:30:48,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-28 23:30:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-28 23:30:48,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-28 23:30:49,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:30:57,113 INFO [train.py:1039] (3/4) Epoch 6, batch 0, loss[loss=0.268, simple_loss=0.3193, pruned_loss=0.1084, over 23339.00 frames. ], tot_loss[loss=0.268, simple_loss=0.3193, pruned_loss=0.1084, over 23339.00 frames. ], batch size: 119, lr: 1.78e-02, grad_scale: 32.0 2023-09-28 23:30:57,114 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-28 23:31:12,854 INFO [train.py:1071] (3/4) Epoch 6, validation: loss=0.2892, simple_loss=0.2993, pruned_loss=0.1395, over 1125622.00 frames. 2023-09-28 23:31:12,855 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-28 23:31:16,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-28 23:31:16,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:31:18,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:31:24,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:24,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:31:24,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:24,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-28 23:31:26,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-28 23:31:27,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:29,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:31:32,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:34,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:31:34,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:35,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-28 23:31:38,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:31:39,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=177133.33333333334, ans=0.125 2023-09-28 23:31:44,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:31:44,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:31:49,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-28 23:31:53,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:31:53,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:31:54,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=177200.0, ans=0.2 2023-09-28 23:31:55,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:00,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:32:03,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=177266.66666666666, ans=0.1 2023-09-28 23:32:04,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:32:10,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-28 23:32:12,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-28 23:32:13,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:13,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:15,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:32:17,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:32:17,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-28 23:32:17,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=177333.33333333334, ans=0.1 2023-09-28 23:32:19,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:23,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:32:27,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:32:32,049 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-28 23:32:33,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:32:34,992 INFO [train.py:1039] (3/4) Epoch 6, batch 50, loss[loss=0.2493, simple_loss=0.2989, pruned_loss=0.09987, over 23775.00 frames. ], tot_loss[loss=0.2449, simple_loss=0.3061, pruned_loss=0.09184, over 1064140.42 frames. ], batch size: 164, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:32:38,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:41,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:32:41,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-28 23:32:41,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:32:42,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:32:44,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:46,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:32:49,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:32:55,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-28 23:32:55,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:02,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:33:04,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-28 23:33:06,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-28 23:33:08,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:33:08,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:08,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=177533.33333333334, ans=0.0 2023-09-28 23:33:09,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:11,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:11,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:33:13,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:33:13,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:33:18,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=177533.33333333334, ans=0.0 2023-09-28 23:33:21,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:22,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:33:22,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:24,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:33:25,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-28 23:33:26,836 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.775e+02 2.186e+02 2.592e+02 3.142e+02 7.850e+02, threshold=5.184e+02, percent-clipped=2.0 2023-09-28 23:33:27,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:33:28,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:33:28,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-28 23:33:28,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=177600.0, ans=0.125 2023-09-28 23:33:29,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:30,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-28 23:33:39,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:33:39,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:33:40,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=177666.66666666666, ans=0.0 2023-09-28 23:33:42,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:43,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:33:43,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-28 23:33:47,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-28 23:33:47,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:33:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:33:50,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:33:50,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:33:51,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-28 23:33:51,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-28 23:33:54,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-28 23:33:54,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:54,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:33:56,345 INFO [train.py:1039] (3/4) Epoch 6, batch 100, loss[loss=0.2423, simple_loss=0.2936, pruned_loss=0.09554, over 23813.00 frames. ], tot_loss[loss=0.2477, simple_loss=0.3082, pruned_loss=0.0936, over 1869986.04 frames. ], batch size: 150, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:33:56,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-28 23:33:56,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-28 23:33:58,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:33:58,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:01,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:34:01,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:34:01,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=177733.33333333334, ans=0.1 2023-09-28 23:34:02,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:34:05,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:34:09,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:11,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-28 23:34:12,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:34:17,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:34:17,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:17,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:34:17,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:34:19,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:34:19,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-28 23:34:21,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:34:21,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:21,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:21,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:34:25,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-28 23:34:25,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:27,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:34:28,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:34:30,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:34:34,158 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-28 23:34:34,184 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-28 23:34:37,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:34:37,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:34:41,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:34:42,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:34:43,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:51,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:34:52,769 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-28 23:34:54,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:34:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:34:59,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:01,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:02,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=178000.0, ans=0.125 2023-09-28 23:35:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:07,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:08,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:35:10,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:11,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:12,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.56 vs. limit=12.0 2023-09-28 23:35:14,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:15,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:35:15,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:17,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-28 23:35:18,631 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-28 23:35:18,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:18,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:35:20,658 INFO [train.py:1039] (3/4) Epoch 6, batch 150, loss[loss=0.2534, simple_loss=0.3127, pruned_loss=0.09702, over 23311.00 frames. ], tot_loss[loss=0.2465, simple_loss=0.3075, pruned_loss=0.09277, over 2507079.15 frames. ], batch size: 93, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:35:20,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:20,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:20,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-28 23:35:21,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=178066.66666666666, ans=0.125 2023-09-28 23:35:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:35:22,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:35:22,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:22,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:24,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:24,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:35:24,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:35:27,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:35:31,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:35:31,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:35:31,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:34,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:35:35,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:37,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:35:39,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:42,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-28 23:35:42,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-28 23:35:42,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-28 23:35:47,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:35:47,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:35:47,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:35:48,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:35:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:35:50,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:50,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:35:54,364 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-28 23:35:55,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:01,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:04,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:36:06,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-28 23:36:09,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:36:09,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:36:10,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:12,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:36:13,834 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.160e+02 2.435e+02 3.119e+02 4.742e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-28 23:36:15,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:36:15,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:36:16,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:17,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-28 23:36:20,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=178266.66666666666, ans=0.125 2023-09-28 23:36:23,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:25,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:25,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:36:25,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:36:28,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:36:31,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-28 23:36:33,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:36:34,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:36:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:38,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:36:38,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-28 23:36:38,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:36:38,592 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-28 23:36:42,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:36:43,677 INFO [train.py:1039] (3/4) Epoch 6, batch 200, loss[loss=0.32, simple_loss=0.3537, pruned_loss=0.1431, over 19415.00 frames. ], tot_loss[loss=0.2466, simple_loss=0.3071, pruned_loss=0.09304, over 2998934.15 frames. ], batch size: 388, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:36:46,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:36:46,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:36:47,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=178400.0, ans=0.2 2023-09-28 23:36:47,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=178400.0, ans=0.125 2023-09-28 23:36:48,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-28 23:36:50,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:36:50,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:51,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-28 23:36:53,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-28 23:36:54,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:36:56,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:01,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:37:01,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:37:01,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:05,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=178466.66666666666, ans=0.2 2023-09-28 23:37:24,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:37:24,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:37:24,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:37:26,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:37:26,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-28 23:37:26,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:37:27,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=178533.33333333334, ans=0.0 2023-09-28 23:37:29,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:30,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:37:32,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:32,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:37:32,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-28 23:37:33,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-28 23:37:33,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:37:39,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:37:44,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:37:51,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:37:52,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=178666.66666666666, ans=0.125 2023-09-28 23:37:53,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:37:59,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:02,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-28 23:38:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:02,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:38:02,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:03,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=178666.66666666666, ans=0.2 2023-09-28 23:38:04,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:38:04,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=178733.33333333334, ans=0.125 2023-09-28 23:38:05,648 INFO [train.py:1039] (3/4) Epoch 6, batch 250, loss[loss=0.2538, simple_loss=0.3226, pruned_loss=0.09256, over 24029.00 frames. ], tot_loss[loss=0.246, simple_loss=0.3061, pruned_loss=0.09293, over 3375399.50 frames. ], batch size: 80, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:38:07,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-28 23:38:07,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:38:08,681 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-28 23:38:10,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:38:12,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:12,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:38:17,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:38:17,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:38:19,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:38:24,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:38:26,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=178800.0, ans=0.0 2023-09-28 23:38:35,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:37,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:38:39,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:38:46,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:38:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:38:47,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:38:47,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:49,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:38:49,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:38:51,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:38:52,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:38:56,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-28 23:38:56,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:38:57,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:38:59,151 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.163e+02 2.470e+02 2.985e+02 4.206e+02, threshold=4.941e+02, percent-clipped=0.0 2023-09-28 23:38:59,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:38:59,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:38:59,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=178933.33333333334, ans=0.1 2023-09-28 23:39:01,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:02,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:39:02,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:39:04,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:39:06,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:09,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:39:09,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=178933.33333333334, ans=0.125 2023-09-28 23:39:12,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:14,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:39:19,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=179000.0, ans=0.2 2023-09-28 23:39:19,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=179000.0, ans=0.2 2023-09-28 23:39:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:23,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:39:26,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-28 23:39:28,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:39:28,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-28 23:39:29,812 INFO [train.py:1039] (3/4) Epoch 6, batch 300, loss[loss=0.2367, simple_loss=0.3083, pruned_loss=0.0825, over 24052.00 frames. ], tot_loss[loss=0.2436, simple_loss=0.3035, pruned_loss=0.09186, over 3672355.38 frames. ], batch size: 80, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:39:30,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-28 23:39:30,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:39:31,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:39:31,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-28 23:39:36,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:39:38,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:39:40,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:39:41,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-28 23:39:41,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:39:44,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-28 23:39:44,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-28 23:39:44,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:39:49,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:39:54,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:39:54,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-28 23:39:58,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-28 23:39:58,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:39:59,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=179133.33333333334, ans=0.0 2023-09-28 23:40:01,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:03,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:03,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-28 23:40:03,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:40:06,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:40:10,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:40:10,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:14,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-28 23:40:14,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-28 23:40:16,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:40:17,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:19,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-28 23:40:21,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:24,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:40:28,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:40:28,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-28 23:40:31,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:31,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:40:35,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:37,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:40:38,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-28 23:40:38,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:40:39,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:40:40,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-28 23:40:43,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:40:43,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:45,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:40:45,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:46,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:40:46,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:46,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=179333.33333333334, ans=0.125 2023-09-28 23:40:52,677 INFO [train.py:1039] (3/4) Epoch 6, batch 350, loss[loss=0.235, simple_loss=0.3028, pruned_loss=0.08359, over 23391.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.3013, pruned_loss=0.09066, over 3899361.48 frames. ], batch size: 93, lr: 1.77e-02, grad_scale: 32.0 2023-09-28 23:40:52,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:40:52,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-28 23:40:55,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:40:56,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=179400.0, ans=0.125 2023-09-28 23:41:03,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:41:03,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=179400.0, ans=0.125 2023-09-28 23:41:05,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.10 vs. limit=15.0 2023-09-28 23:41:07,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:07,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:10,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-28 23:41:10,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:11,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-28 23:41:14,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:16,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-28 23:41:16,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:21,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-28 23:41:21,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:41:24,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:41:24,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:41:25,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:25,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:41:27,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:41:27,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:28,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:41:30,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:41:30,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:33,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=179533.33333333334, ans=0.1 2023-09-28 23:41:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:41:39,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:41:40,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:41:40,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:45,455 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.204e+02 2.490e+02 2.803e+02 5.345e+02, threshold=4.981e+02, percent-clipped=1.0 2023-09-28 23:41:47,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-28 23:41:47,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:41:54,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:41:54,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:41:54,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:41:54,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=179600.0, ans=0.125 2023-09-28 23:41:56,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-28 23:41:57,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:41:59,167 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-28 23:42:00,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-28 23:42:00,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:03,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:42:03,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-28 23:42:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:09,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:42:09,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:11,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=179666.66666666666, ans=0.0 2023-09-28 23:42:12,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:12,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:15,712 INFO [train.py:1039] (3/4) Epoch 6, batch 400, loss[loss=0.2395, simple_loss=0.314, pruned_loss=0.08248, over 24430.00 frames. ], tot_loss[loss=0.2405, simple_loss=0.301, pruned_loss=0.08998, over 4084508.31 frames. ], batch size: 69, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:42:15,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:42:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:42:21,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:42:21,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-28 23:42:21,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:23,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:25,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:42:26,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:29,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:29,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:29,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=179733.33333333334, ans=0.0 2023-09-28 23:42:32,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-28 23:42:33,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-28 23:42:33,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:34,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=179800.0, ans=0.125 2023-09-28 23:42:35,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-28 23:42:35,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:39,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:42:39,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:40,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-28 23:42:41,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:42:41,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:42:41,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:42:41,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:42:43,257 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-28 23:42:43,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-28 23:42:49,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=179866.66666666666, ans=0.1 2023-09-28 23:42:50,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:42:51,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:42:52,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-28 23:42:53,241 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.81 vs. limit=15.0 2023-09-28 23:42:53,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.45 vs. limit=15.0 2023-09-28 23:42:53,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-28 23:42:57,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:42:59,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:05,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-28 23:43:08,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:43:11,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-28 23:43:12,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:43:14,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:43:15,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-28 23:43:19,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:43:21,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-28 23:43:23,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:43:24,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:24,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-28 23:43:29,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-28 23:43:31,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-28 23:43:32,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:43:32,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:43:36,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-28 23:43:38,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:43:38,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:43:38,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:43:38,578 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:43:39,744 INFO [train.py:1039] (3/4) Epoch 6, batch 450, loss[loss=0.2093, simple_loss=0.2847, pruned_loss=0.06695, over 24319.00 frames. ], tot_loss[loss=0.2399, simple_loss=0.3012, pruned_loss=0.08931, over 4223730.29 frames. ], batch size: 61, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:43:41,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-28 23:43:41,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:43:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:43:42,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:43:42,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-28 23:43:43,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:43:44,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-28 23:43:46,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:43:47,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=180066.66666666666, ans=0.125 2023-09-28 23:43:51,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.75 vs. limit=12.0 2023-09-28 23:43:59,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:43:59,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:01,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-28 23:44:02,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-28 23:44:06,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:44:10,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:12,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:17,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:17,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:44:19,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-28 23:44:20,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-28 23:44:22,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-28 23:44:22,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:44:23,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:44:25,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:44:28,904 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-28 23:44:28,917 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-28 23:44:28,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:44:30,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:44:31,954 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.131e+02 2.453e+02 2.864e+02 4.653e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-28 23:44:32,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:44:34,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=180266.66666666666, ans=0.05 2023-09-28 23:44:37,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:44:37,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-28 23:44:38,154 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-09-28 23:44:38,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:44:39,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-28 23:44:42,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:44,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:44:44,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=180333.33333333334, ans=0.125 2023-09-28 23:44:45,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:44:45,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-28 23:44:50,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:44:50,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-28 23:44:52,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-28 23:44:53,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-28 23:44:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:45:00,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:00,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:45:00,237 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-28 23:45:02,025 INFO [train.py:1039] (3/4) Epoch 6, batch 500, loss[loss=0.2402, simple_loss=0.3053, pruned_loss=0.08752, over 24678.00 frames. ], tot_loss[loss=0.2395, simple_loss=0.3011, pruned_loss=0.08901, over 4345830.22 frames. ], batch size: 65, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:45:05,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:06,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:45:06,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:06,838 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-28 23:45:08,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=180400.0, ans=0.0 2023-09-28 23:45:09,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-28 23:45:09,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:45:12,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=180400.0, ans=0.125 2023-09-28 23:45:13,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:45:17,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-28 23:45:17,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-28 23:45:20,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:45:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:45:20,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:27,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=180466.66666666666, ans=0.125 2023-09-28 23:45:29,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=180466.66666666666, ans=0.1 2023-09-28 23:45:31,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:45:33,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-28 23:45:33,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:33,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-28 23:45:35,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-28 23:45:38,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:45:40,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:45:40,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:45:40,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:45:40,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-28 23:45:44,074 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-28 23:45:47,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:45:48,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:48,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:45:50,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:45:52,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-28 23:45:52,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=180600.0, ans=0.0 2023-09-28 23:45:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:45:57,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:01,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:06,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:46:14,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:15,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.94 vs. limit=15.0 2023-09-28 23:46:16,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-28 23:46:16,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:16,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:46:18,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-28 23:46:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:46:20,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:24,634 INFO [train.py:1039] (3/4) Epoch 6, batch 550, loss[loss=0.3521, simple_loss=0.3704, pruned_loss=0.1669, over 19678.00 frames. ], tot_loss[loss=0.242, simple_loss=0.3032, pruned_loss=0.09037, over 4430723.70 frames. ], batch size: 389, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:46:28,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-28 23:46:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-28 23:46:28,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:29,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-28 23:46:30,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:46:30,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:46:30,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:32,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:33,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:46:33,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:46:36,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:46:38,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-28 23:46:38,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:46:43,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:46:43,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:45,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:46:47,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:46:51,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-28 23:46:51,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-28 23:46:53,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:47:00,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:47:00,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:02,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:47:05,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:05,200 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-28 23:47:07,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:47:08,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:47:12,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.86 vs. limit=15.0 2023-09-28 23:47:13,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-28 23:47:13,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-28 23:47:13,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:47:13,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:15,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-28 23:47:17,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-28 23:47:18,453 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.759e+02 2.230e+02 2.579e+02 3.045e+02 5.000e+02, threshold=5.158e+02, percent-clipped=1.0 2023-09-28 23:47:18,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:18,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:47:20,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:47:20,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:47:23,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:47:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:47:26,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:47:27,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:29,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-28 23:47:29,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:47:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:34,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:47:34,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:47:36,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-28 23:47:36,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-28 23:47:45,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-28 23:47:47,848 INFO [train.py:1039] (3/4) Epoch 6, batch 600, loss[loss=0.2426, simple_loss=0.3208, pruned_loss=0.08214, over 24331.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3034, pruned_loss=0.09014, over 4507857.79 frames. ], batch size: 74, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:47:49,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-28 23:47:51,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:47:51,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:47:51,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:47:57,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:47:59,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-28 23:48:01,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-28 23:48:03,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-28 23:48:06,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:07,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:08,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=181133.33333333334, ans=0.0 2023-09-28 23:48:09,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-28 23:48:09,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:48:11,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=181133.33333333334, ans=0.125 2023-09-28 23:48:17,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-28 23:48:21,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:48:21,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:21,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:48:25,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=181200.0, ans=0.0 2023-09-28 23:48:28,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181200.0, ans=0.1 2023-09-28 23:48:29,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:48:29,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:48:29,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:38,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:48:41,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=181266.66666666666, ans=0.125 2023-09-28 23:48:42,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:48:42,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:48:42,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:48:43,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=181266.66666666666, ans=0.125 2023-09-28 23:48:51,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-28 23:48:53,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=181333.33333333334, ans=0.0 2023-09-28 23:48:56,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:48:56,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:03,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-28 23:49:03,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:49:06,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-28 23:49:06,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:49:06,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:49:08,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181333.33333333334, ans=0.1 2023-09-28 23:49:11,011 INFO [train.py:1039] (3/4) Epoch 6, batch 650, loss[loss=0.2399, simple_loss=0.2982, pruned_loss=0.09078, over 18823.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3021, pruned_loss=0.09077, over 4539040.38 frames. ], batch size: 40, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:49:13,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-28 23:49:14,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-28 23:49:16,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:49:16,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:49:19,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:23,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-28 23:49:23,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:49:29,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.39 vs. limit=15.0 2023-09-28 23:49:30,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:49:30,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:35,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:38,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-28 23:49:39,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:49:40,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:49:40,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=181466.66666666666, ans=0.0 2023-09-28 23:49:43,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:49:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-28 23:49:47,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:48,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:48,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:49:50,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:50,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-28 23:49:51,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=8.40 vs. limit=12.0 2023-09-28 23:49:52,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-28 23:49:54,031 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-28 23:49:54,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:49:54,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:49:54,849 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=19.53 vs. limit=15.0 2023-09-28 23:49:57,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:49:58,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:49:58,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:49:58,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-28 23:50:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-28 23:50:01,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:50:01,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-28 23:50:05,126 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.839e+02 2.251e+02 2.578e+02 2.975e+02 4.088e+02, threshold=5.156e+02, percent-clipped=0.0 2023-09-28 23:50:05,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:50:05,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:50:05,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-28 23:50:08,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-28 23:50:08,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-28 23:50:09,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:09,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:50:09,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:50:09,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:50:10,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:50:18,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:18,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:50:19,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=181666.66666666666, ans=0.0 2023-09-28 23:50:19,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=181666.66666666666, ans=0.0 2023-09-28 23:50:21,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:50:24,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:24,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-28 23:50:25,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:50:33,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:50:33,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:33,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:34,414 INFO [train.py:1039] (3/4) Epoch 6, batch 700, loss[loss=0.2393, simple_loss=0.2876, pruned_loss=0.09547, over 23776.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3005, pruned_loss=0.09091, over 4565053.03 frames. ], batch size: 212, lr: 1.76e-02, grad_scale: 32.0 2023-09-28 23:50:34,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:50:37,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-28 23:50:37,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-28 23:50:40,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=181733.33333333334, ans=0.0 2023-09-28 23:50:41,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-28 23:50:43,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:45,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:50:48,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-28 23:50:52,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:50:53,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=181800.0, ans=0.125 2023-09-28 23:50:55,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:50:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:50:58,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-28 23:50:58,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:51:02,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:51:05,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-28 23:51:05,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:51:08,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-28 23:51:12,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-28 23:51:15,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-28 23:51:15,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:51:17,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-28 23:51:22,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:51:22,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-28 23:51:29,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:29,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:51:29,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-28 23:51:29,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=181933.33333333334, ans=0.1 2023-09-28 23:51:34,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:51:34,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:51:37,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:51:44,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-28 23:51:44,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-28 23:51:47,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-28 23:51:47,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-28 23:51:51,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:53,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:51:53,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=182000.0, ans=0.0 2023-09-28 23:51:54,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:51:56,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:51:56,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-28 23:51:57,420 INFO [train.py:1039] (3/4) Epoch 6, batch 750, loss[loss=0.232, simple_loss=0.3025, pruned_loss=0.08076, over 24488.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3006, pruned_loss=0.08932, over 4616595.19 frames. ], batch size: 66, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:52:02,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-28 23:52:02,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-28 23:52:03,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-28 23:52:03,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-28 23:52:05,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-28 23:52:05,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:52:07,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-28 23:52:09,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:52:09,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:12,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:14,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:14,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=182133.33333333334, ans=0.125 2023-09-28 23:52:15,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-28 23:52:15,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:17,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:52:18,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:52:21,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:52:24,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:25,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:52:25,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-28 23:52:27,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-28 23:52:28,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:30,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:52:31,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=182200.0, ans=0.125 2023-09-28 23:52:32,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-28 23:52:32,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-28 23:52:32,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:52:35,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-28 23:52:35,519 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-28 23:52:36,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-28 23:52:37,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:52:37,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-28 23:52:39,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-28 23:52:44,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:52:44,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:52:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:52:47,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:52:49,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:52:51,234 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.788e+02 2.168e+02 2.495e+02 2.811e+02 4.815e+02, threshold=4.990e+02, percent-clipped=0.0 2023-09-28 23:52:51,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-28 23:52:51,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:52:52,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-28 23:52:53,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:52:56,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:52:56,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-28 23:52:57,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:00,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=182266.66666666666, ans=0.1 2023-09-28 23:53:05,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:05,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:53:05,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=182333.33333333334, ans=0.125 2023-09-28 23:53:07,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:10,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:53:14,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-28 23:53:14,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:14,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:16,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:18,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:21,143 INFO [train.py:1039] (3/4) Epoch 6, batch 800, loss[loss=0.2415, simple_loss=0.3111, pruned_loss=0.08596, over 24651.00 frames. ], tot_loss[loss=0.2403, simple_loss=0.3012, pruned_loss=0.0897, over 4636954.51 frames. ], batch size: 65, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:53:21,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:21,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-28 23:53:29,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:53:29,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:31,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:53:31,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:34,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:35,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:40,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:40,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:53:44,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-28 23:53:44,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:45,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:53:47,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:53:47,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:47,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-28 23:53:47,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:47,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-28 23:53:49,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:53:51,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:53:53,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=182533.33333333334, ans=0.0 2023-09-28 23:53:54,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:53:54,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:53:55,658 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.04 vs. limit=5.0 2023-09-28 23:53:56,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:53:58,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:02,222 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.86 vs. limit=12.0 2023-09-28 23:54:02,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:54:04,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:54:04,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-28 23:54:07,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-28 23:54:09,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-28 23:54:09,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-28 23:54:09,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:12,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:12,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:18,754 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-28 23:54:18,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-28 23:54:21,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-28 23:54:23,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-28 23:54:27,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-28 23:54:30,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:54:31,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-28 23:54:33,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-28 23:54:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-28 23:54:41,384 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=13.33 vs. limit=22.5 2023-09-28 23:54:42,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:54:43,411 INFO [train.py:1039] (3/4) Epoch 6, batch 850, loss[loss=0.3304, simple_loss=0.3626, pruned_loss=0.1491, over 19546.00 frames. ], tot_loss[loss=0.2404, simple_loss=0.3015, pruned_loss=0.08966, over 4652026.87 frames. ], batch size: 388, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:54:45,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:54:45,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-28 23:54:45,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:54:48,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:54:48,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-28 23:54:48,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:54:49,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:54:52,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:54:53,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:54:55,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:54:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-28 23:54:56,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-28 23:54:58,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-28 23:54:59,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-28 23:55:00,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:02,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:03,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:03,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-28 23:55:08,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:08,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:10,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-28 23:55:11,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-28 23:55:14,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:55:16,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-28 23:55:21,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-28 23:55:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-28 23:55:24,279 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-28 23:55:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:25,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:55:25,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-28 23:55:27,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:29,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-28 23:55:33,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-28 23:55:33,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:35,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:55:36,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-28 23:55:37,984 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.793e+02 2.156e+02 2.378e+02 2.757e+02 3.805e+02, threshold=4.755e+02, percent-clipped=0.0 2023-09-28 23:55:38,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-28 23:55:39,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-28 23:55:39,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-28 23:55:42,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-28 23:55:42,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:55:45,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-28 23:55:45,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:55:46,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:55:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:55:48,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-09-28 23:55:50,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-28 23:55:52,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-28 23:55:53,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:55:53,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-28 23:56:00,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.21 vs. limit=22.5 2023-09-28 23:56:02,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-28 23:56:03,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:56:05,036 INFO [train.py:1039] (3/4) Epoch 6, batch 900, loss[loss=0.2497, simple_loss=0.3017, pruned_loss=0.0988, over 23246.00 frames. ], tot_loss[loss=0.2417, simple_loss=0.3028, pruned_loss=0.09025, over 4676416.50 frames. ], batch size: 119, lr: 1.75e-02, grad_scale: 32.0 2023-09-28 23:56:05,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-28 23:56:05,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:05,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:56:07,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-28 23:56:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:56:18,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:20,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-28 23:56:21,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:56:23,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-28 23:56:23,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-28 23:56:24,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-28 23:56:24,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:25,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-28 23:56:25,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-28 23:56:38,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:56:38,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:56:38,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-28 23:56:42,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:56:47,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-28 23:56:47,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:56:51,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=183200.0, ans=0.1 2023-09-28 23:56:52,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-28 23:56:52,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-28 23:56:52,977 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-28 23:56:54,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-28 23:57:00,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-28 23:57:00,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-28 23:57:02,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-28 23:57:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:09,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:11,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-28 23:57:11,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-28 23:57:14,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-28 23:57:15,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-28 23:57:15,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:17,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:57:19,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:23,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-28 23:57:23,233 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-28 23:57:24,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-28 23:57:24,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-28 23:57:29,116 INFO [train.py:1039] (3/4) Epoch 6, batch 950, loss[loss=0.2619, simple_loss=0.3066, pruned_loss=0.1086, over 23797.00 frames. ], tot_loss[loss=0.243, simple_loss=0.3036, pruned_loss=0.09117, over 4674596.62 frames. ], batch size: 212, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:57:29,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:57:32,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-28 23:57:38,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:39,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:40,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:41,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-28 23:57:43,915 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-28 23:57:46,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:57:47,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=183466.66666666666, ans=0.0 2023-09-28 23:57:48,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:57:50,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:57:50,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-28 23:57:50,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-28 23:57:50,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-28 23:57:53,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:55,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-28 23:57:55,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:57:59,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-28 23:57:59,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:58:00,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-28 23:58:02,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-28 23:58:05,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:58:06,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-28 23:58:11,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:11,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:58:15,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-28 23:58:18,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-28 23:58:18,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-28 23:58:18,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:20,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:20,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-28 23:58:25,030 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.262e+02 2.553e+02 3.079e+02 4.621e+02, threshold=5.106e+02, percent-clipped=0.0 2023-09-28 23:58:25,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-28 23:58:25,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=183600.0, ans=0.09899494936611666 2023-09-28 23:58:26,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-28 23:58:29,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:58:29,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:29,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-28 23:58:32,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-28 23:58:32,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-28 23:58:36,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-28 23:58:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-28 23:58:43,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:58:44,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-28 23:58:44,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-28 23:58:49,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-28 23:58:51,487 INFO [train.py:1039] (3/4) Epoch 6, batch 1000, loss[loss=0.2192, simple_loss=0.2622, pruned_loss=0.08812, over 22644.00 frames. ], tot_loss[loss=0.2418, simple_loss=0.3023, pruned_loss=0.09062, over 4685173.36 frames. ], batch size: 322, lr: 1.75e-02, grad_scale: 16.0 2023-09-28 23:58:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-28 23:58:54,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:58:59,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.72 vs. limit=12.0 2023-09-28 23:59:00,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-28 23:59:01,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-28 23:59:01,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-28 23:59:03,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=183733.33333333334, ans=0.125 2023-09-28 23:59:05,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=183733.33333333334, ans=0.07 2023-09-28 23:59:08,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-28 23:59:08,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:13,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-28 23:59:16,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-28 23:59:16,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=183800.0, ans=0.125 2023-09-28 23:59:19,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-28 23:59:19,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:19,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=183800.0, ans=0.2 2023-09-28 23:59:21,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-28 23:59:22,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-28 23:59:22,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-28 23:59:24,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:26,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:35,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:35,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-28 23:59:37,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:38,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-28 23:59:38,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-28 23:59:38,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-28 23:59:39,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.80 vs. limit=10.0 2023-09-28 23:59:40,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-28 23:59:40,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-28 23:59:41,708 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-28 23:59:44,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-28 23:59:45,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-28 23:59:47,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=183933.33333333334, ans=0.125 2023-09-28 23:59:48,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-28 23:59:50,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-28 23:59:53,820 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-28 23:59:55,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:56,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-28 23:59:56,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-28 23:59:58,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:00:00,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 00:00:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:00:03,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 00:00:05,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 00:00:07,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:07,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:00:09,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:00:12,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:00:12,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=184000.0, ans=0.125 2023-09-29 00:00:14,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=184066.66666666666, ans=0.04949747468305833 2023-09-29 00:00:15,320 INFO [train.py:1039] (3/4) Epoch 6, batch 1050, loss[loss=0.2307, simple_loss=0.2842, pruned_loss=0.08854, over 23665.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3006, pruned_loss=0.0897, over 4693282.63 frames. ], batch size: 232, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:00:15,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:00:19,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:00:20,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:00:22,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:00:24,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:25,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:27,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:00:28,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:00:30,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:00:32,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:00:32,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:00:33,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:00:33,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 00:00:36,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:36,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 00:00:40,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:00:40,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 00:00:40,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:00:47,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:00:48,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:00:48,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:00:48,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=184200.0, ans=0.125 2023-09-29 00:00:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 00:00:52,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 00:00:52,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:00:54,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 00:00:57,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 00:00:57,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=184200.0, ans=0.07 2023-09-29 00:00:58,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:00:59,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=184200.0, ans=0.125 2023-09-29 00:01:00,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:01:01,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:01:02,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:02,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:01:08,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:01:12,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 00:01:14,308 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.070e+02 2.251e+02 2.597e+02 4.023e+02, threshold=4.502e+02, percent-clipped=0.0 2023-09-29 00:01:14,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 00:01:14,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 00:01:16,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:16,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:01:17,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 00:01:22,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:01:24,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:01:24,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:01:24,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:24,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:25,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.93 vs. limit=10.0 2023-09-29 00:01:30,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:01:30,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 00:01:32,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:01:32,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 00:01:32,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 00:01:33,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:01:38,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:01:39,985 INFO [train.py:1039] (3/4) Epoch 6, batch 1100, loss[loss=0.2337, simple_loss=0.31, pruned_loss=0.07875, over 24480.00 frames. ], tot_loss[loss=0.239, simple_loss=0.3, pruned_loss=0.08901, over 4708490.24 frames. ], batch size: 66, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:01:43,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:01:49,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:01:51,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:01:51,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:01:51,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 00:01:53,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:01:56,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:01:57,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.96 vs. limit=22.5 2023-09-29 00:01:58,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:02:01,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:02:01,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 00:02:03,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:02:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:04,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:02:07,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:02:09,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:02:14,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:02:17,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 00:02:18,034 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 00:02:20,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:23,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:02:25,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:02:26,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 00:02:28,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:02:28,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:02:28,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:02:28,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:28,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 00:02:35,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:02:35,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 00:02:36,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:02:41,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:02:44,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 00:02:44,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:02:46,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:02:48,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:02:49,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:51,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 00:02:53,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:02:53,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:02:54,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 00:02:55,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:02:57,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 00:02:58,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:02:58,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:03:00,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:03:03,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=184733.33333333334, ans=0.125 2023-09-29 00:03:04,733 INFO [train.py:1039] (3/4) Epoch 6, batch 1150, loss[loss=0.2494, simple_loss=0.3097, pruned_loss=0.09454, over 23310.00 frames. ], tot_loss[loss=0.2392, simple_loss=0.3004, pruned_loss=0.08894, over 4717214.98 frames. ], batch size: 93, lr: 1.74e-02, grad_scale: 16.0 2023-09-29 00:03:06,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:10,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:03:11,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:11,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:03:11,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 00:03:13,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:14,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 00:03:16,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:16,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:03:18,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=184733.33333333334, ans=0.125 2023-09-29 00:03:21,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 00:03:24,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:30,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:03:31,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:32,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 00:03:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:03:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:03:35,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 00:03:37,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:03:38,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:03:43,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=184866.66666666666, ans=0.125 2023-09-29 00:03:43,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=184866.66666666666, ans=0.125 2023-09-29 00:03:48,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:03:56,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 00:03:56,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:03:56,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:00,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.083e+02 2.291e+02 2.736e+02 4.000e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-29 00:04:03,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 00:04:05,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:10,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=185000.0, ans=0.125 2023-09-29 00:04:13,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=185000.0, ans=0.0 2023-09-29 00:04:15,237 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 00:04:19,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:21,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:04:21,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:04:23,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:04:26,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:27,746 INFO [train.py:1039] (3/4) Epoch 6, batch 1200, loss[loss=0.2376, simple_loss=0.2947, pruned_loss=0.09027, over 23810.00 frames. ], tot_loss[loss=0.24, simple_loss=0.3014, pruned_loss=0.08933, over 4716171.12 frames. ], batch size: 164, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:04:32,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:04:32,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:04:33,079 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.27 vs. limit=12.0 2023-09-29 00:04:33,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:33,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:04:33,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:04:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:04:36,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:04:40,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:04:40,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:04:43,773 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 00:04:47,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 00:04:51,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:04:51,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=185133.33333333334, ans=0.125 2023-09-29 00:04:54,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:04:56,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=185133.33333333334, ans=0.125 2023-09-29 00:04:58,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:04:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:04:59,587 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 00:04:59,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=185200.0, ans=0.0 2023-09-29 00:05:01,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:07,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:05:07,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:05:07,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 00:05:07,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:05:11,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 00:05:16,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 00:05:16,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:05:18,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:05:20,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:20,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:05:22,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:05:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:05:24,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:05:24,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 00:05:24,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:05:25,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:25,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:05:27,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=185266.66666666666, ans=0.2 2023-09-29 00:05:29,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:29,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:05:30,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=185333.33333333334, ans=0.0 2023-09-29 00:05:33,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:05:35,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:05:38,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=185333.33333333334, ans=0.05 2023-09-29 00:05:39,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 00:05:42,985 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 00:05:44,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:05:45,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:05:47,251 INFO [train.py:1039] (3/4) Epoch 6, batch 1250, loss[loss=0.2615, simple_loss=0.3063, pruned_loss=0.1083, over 23849.00 frames. ], tot_loss[loss=0.2406, simple_loss=0.3024, pruned_loss=0.0894, over 4726821.34 frames. ], batch size: 195, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:05:48,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:05:51,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:05:53,373 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:05:54,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 00:05:57,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:05:58,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:05:59,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 00:06:00,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:06:02,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:06:05,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:06:08,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:08,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:06:08,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:11,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:06:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:06:15,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:06:15,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:06:17,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:06:17,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:22,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:06:29,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 00:06:30,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:06:35,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:35,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 00:06:35,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:06:36,615 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 00:06:36,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:36,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:38,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,094 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.172e+02 2.410e+02 2.804e+02 3.996e+02, threshold=4.819e+02, percent-clipped=0.0 2023-09-29 00:06:42,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:06:42,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:06:43,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 00:06:43,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 00:06:43,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 00:06:48,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:06:49,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 00:06:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:06:53,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:06:53,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:06:55,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 00:06:56,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:06:56,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:06:56,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:06:57,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:06:57,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 00:07:03,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:03,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:07:03,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=185666.66666666666, ans=0.125 2023-09-29 00:07:04,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:07:06,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:07:07,755 INFO [train.py:1039] (3/4) Epoch 6, batch 1300, loss[loss=0.3053, simple_loss=0.3455, pruned_loss=0.1325, over 19726.00 frames. ], tot_loss[loss=0.2419, simple_loss=0.3032, pruned_loss=0.09024, over 4719412.60 frames. ], batch size: 388, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:07:11,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:07:11,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 00:07:14,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:15,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:07:17,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:18,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:07:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:07:20,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=185733.33333333334, ans=0.1 2023-09-29 00:07:21,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 00:07:25,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:07:27,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:07:29,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 00:07:33,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:07:37,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:37,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:07:38,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:07:39,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:07:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:07:40,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:07:42,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 00:07:48,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:07:48,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:07:49,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 00:07:49,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:07:52,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:07:55,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:07:56,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=185933.33333333334, ans=0.125 2023-09-29 00:07:57,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 00:07:58,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:07:58,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 00:07:59,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:02,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:02,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:08:04,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 00:08:05,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 00:08:07,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 00:08:11,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:08:14,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 00:08:16,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=186000.0, ans=0.1 2023-09-29 00:08:17,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:21,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=186000.0, ans=0.1 2023-09-29 00:08:24,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 00:08:27,626 INFO [train.py:1039] (3/4) Epoch 6, batch 1350, loss[loss=0.2008, simple_loss=0.2632, pruned_loss=0.06919, over 24269.00 frames. ], tot_loss[loss=0.2412, simple_loss=0.3028, pruned_loss=0.08984, over 4719301.63 frames. ], batch size: 56, lr: 1.74e-02, grad_scale: 32.0 2023-09-29 00:08:27,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:29,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:32,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:08:33,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:08:36,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:08:36,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:38,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=186066.66666666666, ans=0.0 2023-09-29 00:08:39,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:08:41,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 00:08:41,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=186066.66666666666, ans=0.09899494936611666 2023-09-29 00:08:43,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:08:44,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:08:46,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=186133.33333333334, ans=0.5 2023-09-29 00:08:47,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 00:08:49,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:08:50,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:08:50,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 00:08:53,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 00:08:55,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 00:08:58,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:08:58,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 00:08:58,572 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:09:11,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:15,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=186266.66666666666, ans=0.0 2023-09-29 00:09:21,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:09:22,608 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.119e+02 2.458e+02 2.800e+02 4.358e+02, threshold=4.916e+02, percent-clipped=0.0 2023-09-29 00:09:22,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:22,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 00:09:25,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:27,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 00:09:27,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:09:28,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:09:31,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:09:33,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 00:09:34,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:09:36,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=186333.33333333334, ans=0.125 2023-09-29 00:09:38,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=186333.33333333334, ans=0.1 2023-09-29 00:09:39,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 00:09:43,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 00:09:48,755 INFO [train.py:1039] (3/4) Epoch 6, batch 1400, loss[loss=0.2427, simple_loss=0.2895, pruned_loss=0.09801, over 23668.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3012, pruned_loss=0.08897, over 4725655.37 frames. ], batch size: 232, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:09:48,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 00:09:50,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:09:53,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:09:55,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:09:58,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 00:10:00,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 00:10:01,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=186400.0, ans=0.0 2023-09-29 00:10:04,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=186466.66666666666, ans=0.0 2023-09-29 00:10:10,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:10:12,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:13,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.76 vs. limit=6.0 2023-09-29 00:10:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:10:17,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:10:21,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:10:22,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 00:10:26,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=186533.33333333334, ans=0.125 2023-09-29 00:10:30,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:30,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:35,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 00:10:36,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:10:36,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:10:39,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.27 vs. limit=15.0 2023-09-29 00:10:39,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:10:39,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:10:41,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:10:42,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:10:42,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:10:44,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 00:10:45,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:10:49,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:10:56,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:11:03,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 00:11:05,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:11:06,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:11:09,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 00:11:10,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:11,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:11:12,497 INFO [train.py:1039] (3/4) Epoch 6, batch 1450, loss[loss=0.2486, simple_loss=0.3163, pruned_loss=0.09049, over 23776.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.3004, pruned_loss=0.08846, over 4728061.63 frames. ], batch size: 85, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:11:14,519 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=186733.33333333334, ans=0.0 2023-09-29 00:11:15,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:11:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:11:17,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:17,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 00:11:22,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:23,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:11:25,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:11:25,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 00:11:27,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:11:27,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 00:11:29,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:30,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:30,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 00:11:32,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:11:32,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:11:34,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 00:11:34,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:34,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=186800.0, ans=0.95 2023-09-29 00:11:36,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:11:37,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:39,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:43,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:11:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:11:45,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:11:45,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=186866.66666666666, ans=0.07 2023-09-29 00:11:46,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:48,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:11:48,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:11:48,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:11:49,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:11:51,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=186866.66666666666, ans=0.2 2023-09-29 00:11:53,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 00:11:56,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:12:00,658 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 00:12:02,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:02,743 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.25 vs. limit=15.0 2023-09-29 00:12:04,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:12:06,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:08,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 00:12:09,774 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.836e+02 2.135e+02 2.497e+02 3.099e+02 5.077e+02, threshold=4.994e+02, percent-clipped=1.0 2023-09-29 00:12:11,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=186933.33333333334, ans=0.07 2023-09-29 00:12:12,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:14,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 00:12:14,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 00:12:15,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:18,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:12:20,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 00:12:21,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.26 vs. limit=22.5 2023-09-29 00:12:23,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 00:12:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 00:12:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:26,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:12:32,990 INFO [train.py:1039] (3/4) Epoch 6, batch 1500, loss[loss=0.2354, simple_loss=0.3033, pruned_loss=0.08373, over 24659.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.3, pruned_loss=0.08811, over 4729093.65 frames. ], batch size: 65, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:12:38,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 00:12:38,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:12:38,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:12:40,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:12:41,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:42,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:12:43,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 00:12:45,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:12:45,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:12:45,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:12:46,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:12:48,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:12:49,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:51,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=187133.33333333334, ans=0.2 2023-09-29 00:12:52,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:12:54,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 00:12:54,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:12:55,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:12:55,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:12:57,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=187133.33333333334, ans=0.2 2023-09-29 00:12:58,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 00:13:03,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 00:13:06,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:13:06,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 00:13:11,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:13:13,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:13,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=187200.0, ans=0.125 2023-09-29 00:13:14,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:13:14,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:13:16,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 00:13:16,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:13:16,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:17,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 00:13:17,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:13:18,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.45 vs. limit=15.0 2023-09-29 00:13:19,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=187200.0, ans=0.07 2023-09-29 00:13:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:13:25,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 00:13:28,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:13:31,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:13:35,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 00:13:35,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 00:13:37,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:13:39,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:13:39,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=187333.33333333334, ans=0.125 2023-09-29 00:13:40,793 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 00:13:42,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:13:46,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 00:13:47,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:52,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:52,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:13:53,961 INFO [train.py:1039] (3/4) Epoch 6, batch 1550, loss[loss=0.2802, simple_loss=0.3436, pruned_loss=0.1083, over 24044.00 frames. ], tot_loss[loss=0.2384, simple_loss=0.3005, pruned_loss=0.08821, over 4742481.78 frames. ], batch size: 80, lr: 1.73e-02, grad_scale: 16.0 2023-09-29 00:13:54,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:13:54,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:13:55,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 00:13:55,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 00:13:55,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:13:57,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 00:13:57,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 00:14:00,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:01,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:03,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:03,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:14:03,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:04,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:14:07,744 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 00:14:07,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:07,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:14:09,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:14:10,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:14:10,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 00:14:12,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:14:13,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 00:14:14,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 00:14:14,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 00:14:14,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=187466.66666666666, ans=0.2 2023-09-29 00:14:16,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:17,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:21,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:14:24,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 00:14:24,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 00:14:33,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:36,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:14:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:14:36,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:14:36,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 00:14:37,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.93 vs. limit=12.0 2023-09-29 00:14:41,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:14:42,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:45,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:14:48,949 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.26 vs. limit=10.0 2023-09-29 00:14:49,281 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.701e+02 2.090e+02 2.378e+02 2.713e+02 3.704e+02, threshold=4.756e+02, percent-clipped=0.0 2023-09-29 00:14:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:14:49,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:14:49,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 00:14:49,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:14:51,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:14:51,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:14:53,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:14:53,747 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 00:14:56,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:01,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 00:15:07,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:08,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:09,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 00:15:10,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:15:12,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:15:12,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:15:12,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:15:13,492 INFO [train.py:1039] (3/4) Epoch 6, batch 1600, loss[loss=0.2119, simple_loss=0.2808, pruned_loss=0.07146, over 24577.00 frames. ], tot_loss[loss=0.2392, simple_loss=0.3013, pruned_loss=0.08861, over 4728429.60 frames. ], batch size: 60, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:15:13,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:15:16,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:16,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=187733.33333333334, ans=0.2 2023-09-29 00:15:17,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 00:15:18,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 00:15:21,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 00:15:25,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:15:26,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 00:15:28,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:15:30,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:15:36,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:15:40,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 00:15:41,436 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.65 vs. limit=22.5 2023-09-29 00:15:42,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:15:43,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 00:15:44,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=187866.66666666666, ans=0.125 2023-09-29 00:15:45,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:15:45,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 00:15:50,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 00:15:53,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=187866.66666666666, ans=0.0 2023-09-29 00:15:53,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=187866.66666666666, ans=0.125 2023-09-29 00:15:55,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=187866.66666666666, ans=0.04949747468305833 2023-09-29 00:15:58,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:15:59,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 00:15:59,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:16:00,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=187933.33333333334, ans=0.1 2023-09-29 00:16:01,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:01,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:16:05,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:16:09,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:16:10,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:16:11,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:12,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:16:14,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:16:17,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:16:18,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:16:23,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:25,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:16:28,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 00:16:28,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:16:28,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 00:16:33,094 INFO [train.py:1039] (3/4) Epoch 6, batch 1650, loss[loss=0.246, simple_loss=0.3036, pruned_loss=0.09424, over 23374.00 frames. ], tot_loss[loss=0.2394, simple_loss=0.3011, pruned_loss=0.08884, over 4695688.97 frames. ], batch size: 106, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:16:35,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.35 vs. limit=15.0 2023-09-29 00:16:36,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:37,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:16:37,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:16:39,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 00:16:39,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 00:16:39,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 00:16:39,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 00:16:42,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:16:43,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:44,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:16:44,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:16:47,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:16:48,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 00:16:51,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:16:51,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:16:51,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:16:51,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:16:53,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 00:16:53,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 00:16:56,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=188133.33333333334, ans=0.05 2023-09-29 00:16:58,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:16:58,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=188133.33333333334, ans=0.09899494936611666 2023-09-29 00:16:59,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:16:59,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=188133.33333333334, ans=0.0 2023-09-29 00:17:08,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 00:17:10,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:12,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 00:17:16,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:19,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:17:19,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:17:20,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:21,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:17:21,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:25,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:25,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:27,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:27,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:28,446 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 2.178e+02 2.493e+02 2.802e+02 6.343e+02, threshold=4.987e+02, percent-clipped=2.0 2023-09-29 00:17:28,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:28,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:17:31,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:17:33,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 00:17:34,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:17:34,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 00:17:35,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=188333.33333333334, ans=0.0 2023-09-29 00:17:37,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 00:17:37,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 00:17:39,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:17:39,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:17:40,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:40,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:17:40,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 00:17:45,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:17:46,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:17:46,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:50,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 00:17:50,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=188333.33333333334, ans=0.125 2023-09-29 00:17:53,514 INFO [train.py:1039] (3/4) Epoch 6, batch 1700, loss[loss=0.2121, simple_loss=0.2833, pruned_loss=0.07049, over 24319.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.301, pruned_loss=0.08906, over 4699684.28 frames. ], batch size: 61, lr: 1.73e-02, grad_scale: 32.0 2023-09-29 00:17:55,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:17:55,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:17:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 00:17:55,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:17:56,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:17:56,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:17:59,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:17:59,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:17:59,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 00:18:02,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:18:08,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.89 vs. limit=15.0 2023-09-29 00:18:10,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:18:13,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:18:19,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:18:21,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:21,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:18:21,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:24,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 00:18:24,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:18:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:27,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:18:27,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:18:29,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 00:18:30,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 00:18:32,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:33,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 00:18:35,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:18:44,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:18:46,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:47,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:18:49,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:18:49,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 00:18:49,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:18:51,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:51,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 00:18:53,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:18:53,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:53,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:18:53,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:18:56,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:18:56,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:18:56,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=188600.0, ans=0.125 2023-09-29 00:18:57,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:18:57,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:18:57,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:02,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:05,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 00:19:05,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:19:07,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:19:08,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 00:19:16,075 INFO [train.py:1039] (3/4) Epoch 6, batch 1750, loss[loss=0.2279, simple_loss=0.289, pruned_loss=0.08343, over 23420.00 frames. ], tot_loss[loss=0.2387, simple_loss=0.2994, pruned_loss=0.08902, over 4680662.13 frames. ], batch size: 119, lr: 1.72e-02, grad_scale: 32.0 2023-09-29 00:19:17,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:21,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:21,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:19:22,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 00:19:22,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:19:23,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-09-29 00:19:26,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:19:26,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:19:29,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 00:19:32,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:19:34,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=188800.0, ans=0.1 2023-09-29 00:19:35,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 00:19:35,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:19:37,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:19:38,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:19:40,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 00:19:43,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:19:43,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 00:19:53,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:19:57,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:19:57,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:00,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:00,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:20:01,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=188866.66666666666, ans=0.1 2023-09-29 00:20:03,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:05,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:07,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:07,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:20:08,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 00:20:10,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:20:10,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=188933.33333333334, ans=0.125 2023-09-29 00:20:12,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 00:20:13,411 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.714e+02 2.157e+02 2.511e+02 2.908e+02 4.872e+02, threshold=5.023e+02, percent-clipped=0.0 2023-09-29 00:20:13,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:16,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:18,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:20:21,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:20:23,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 00:20:24,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:25,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:20:30,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:20:34,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:20:35,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:20:37,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 00:20:37,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:38,617 INFO [train.py:1039] (3/4) Epoch 6, batch 1800, loss[loss=0.232, simple_loss=0.3086, pruned_loss=0.07765, over 24566.00 frames. ], tot_loss[loss=0.2376, simple_loss=0.2988, pruned_loss=0.08817, over 4689033.48 frames. ], batch size: 71, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:20:38,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:20:38,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:38,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:20:38,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:20:40,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:20:42,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:20:43,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:20:45,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:20:48,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:20:51,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:20:52,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:20:52,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=189133.33333333334, ans=0.5 2023-09-29 00:20:55,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:20:57,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:20:59,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:21:00,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:21:03,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:21:03,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 00:21:04,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:08,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:12,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 00:21:15,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 00:21:15,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 00:21:15,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:17,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:21:17,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:21:19,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:21:23,900 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 00:21:25,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:21:28,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 00:21:31,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 00:21:32,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:21:33,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:21:35,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:21:39,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 00:21:40,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=189266.66666666666, ans=0.2 2023-09-29 00:21:47,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.30 vs. limit=6.0 2023-09-29 00:21:48,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:21:48,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 00:21:48,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:21:48,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:49,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:21:50,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 00:21:53,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:21:53,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:21:56,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 00:21:56,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:21:59,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:21:59,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:21:59,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:21:59,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:22:00,798 INFO [train.py:1039] (3/4) Epoch 6, batch 1850, loss[loss=0.2264, simple_loss=0.29, pruned_loss=0.08137, over 23272.00 frames. ], tot_loss[loss=0.238, simple_loss=0.2992, pruned_loss=0.0884, over 4699495.41 frames. ], batch size: 105, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:22:00,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:22:02,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:22:02,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:06,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:22:06,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:14,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:22:16,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 00:22:18,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 00:22:20,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.44 vs. limit=10.0 2023-09-29 00:22:22,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 00:22:25,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:22:27,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 00:22:27,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 00:22:31,239 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=15.0 2023-09-29 00:22:36,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:22:40,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 00:22:41,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:22:43,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:22:48,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 00:22:49,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:49,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:22:49,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:22:53,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:22:56,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:22:58,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=189600.0, ans=0.2 2023-09-29 00:22:59,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.153e+02 2.382e+02 2.790e+02 3.964e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 00:22:59,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:22:59,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:22:59,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:23:01,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:02,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:02,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:23:03,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=189600.0, ans=0.125 2023-09-29 00:23:06,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 00:23:07,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:23:10,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:23:10,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:23:10,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 00:23:10,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 00:23:14,249 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 00:23:14,399 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 00:23:17,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:23:17,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:23:17,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:17,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:18,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=189666.66666666666, ans=0.0 2023-09-29 00:23:19,341 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 00:23:19,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:23:19,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:19,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:23:21,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:23:22,538 INFO [train.py:1039] (3/4) Epoch 6, batch 1900, loss[loss=0.2209, simple_loss=0.2789, pruned_loss=0.08152, over 24472.00 frames. ], tot_loss[loss=0.2382, simple_loss=0.2997, pruned_loss=0.08835, over 4716296.22 frames. ], batch size: 58, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:23:22,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:23:22,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 00:23:26,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:23:26,237 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 00:23:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:23:28,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:32,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:23:35,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:23:36,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.35 vs. limit=22.5 2023-09-29 00:23:37,344 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 00:23:37,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 00:23:39,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:23:40,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:23:40,611 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 00:23:40,654 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 00:23:45,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 00:23:45,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=189800.0, ans=0.125 2023-09-29 00:23:47,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:23:50,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 00:23:53,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 00:24:04,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 00:24:07,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 00:24:07,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:07,233 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 00:24:07,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 00:24:07,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 00:24:08,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 00:24:08,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:24:09,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=189866.66666666666, ans=0.0 2023-09-29 00:24:13,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 00:24:14,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.28 vs. limit=15.0 2023-09-29 00:24:17,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:24:20,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:20,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 00:24:24,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:24:26,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=190000.0, ans=0.125 2023-09-29 00:24:27,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 00:24:27,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:33,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:24:33,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:24:33,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:24:34,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:24:36,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:24:37,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:24:37,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:24:39,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:39,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:24:42,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:24:42,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:24:42,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:24:45,070 INFO [train.py:1039] (3/4) Epoch 6, batch 1950, loss[loss=0.2306, simple_loss=0.2891, pruned_loss=0.0861, over 23515.00 frames. ], tot_loss[loss=0.2383, simple_loss=0.2997, pruned_loss=0.08843, over 4722477.31 frames. ], batch size: 134, lr: 1.72e-02, grad_scale: 8.0 2023-09-29 00:24:45,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:24:49,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:24:51,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:24:51,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:51,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:24:52,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 00:24:53,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=190066.66666666666, ans=0.0 2023-09-29 00:24:54,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:24:54,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:56,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:24:58,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:24:59,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:24:59,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:01,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:25:08,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:25:08,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:25:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:11,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:14,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.39 vs. limit=22.5 2023-09-29 00:25:14,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:25:14,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:14,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:25:14,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 00:25:14,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:25:14,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:25:16,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:20,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:25:22,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:25:26,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:25:30,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:25:30,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:25:32,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 00:25:32,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:25:33,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.max_positive, batch_count=190266.66666666666, ans=0.95 2023-09-29 00:25:37,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:25:40,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:25:40,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:25:46,169 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.758e+02 2.214e+02 2.605e+02 2.904e+02 4.592e+02, threshold=5.209e+02, percent-clipped=0.0 2023-09-29 00:25:49,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:51,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=190333.33333333334, ans=0.0 2023-09-29 00:25:52,040 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.44 vs. limit=6.0 2023-09-29 00:25:53,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:25:56,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:25:57,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=190333.33333333334, ans=0.0 2023-09-29 00:25:59,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:26:00,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:26:00,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=190333.33333333334, ans=0.035 2023-09-29 00:26:00,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=190333.33333333334, ans=0.125 2023-09-29 00:26:00,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=190333.33333333334, ans=0.125 2023-09-29 00:26:00,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=190333.33333333334, ans=0.0 2023-09-29 00:26:01,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 00:26:01,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:26:01,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:03,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 00:26:03,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=190333.33333333334, ans=0.2 2023-09-29 00:26:06,581 INFO [train.py:1039] (3/4) Epoch 6, batch 2000, loss[loss=0.2459, simple_loss=0.2966, pruned_loss=0.09756, over 23720.00 frames. ], tot_loss[loss=0.2379, simple_loss=0.2994, pruned_loss=0.08818, over 4724882.99 frames. ], batch size: 232, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:26:06,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:08,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=190400.0, ans=0.125 2023-09-29 00:26:09,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:26:09,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:26:11,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:13,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:26:15,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:17,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.99 vs. limit=6.0 2023-09-29 00:26:18,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 00:26:18,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:26:23,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:26:25,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 00:26:26,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:26:26,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:26:29,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:26:30,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 00:26:32,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:35,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 00:26:36,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:26:38,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 00:26:38,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:41,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:26:41,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:26:41,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:26:42,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:26:44,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:26:46,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 00:26:49,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 00:26:49,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:26:50,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:26:55,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:26:56,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:26:57,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:26:58,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:26:59,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:00,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:01,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:27:01,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:03,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:06,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:27:08,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 00:27:08,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=190600.0, ans=0.125 2023-09-29 00:27:15,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:27:15,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:18,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:27:22,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:23,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:23,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:25,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:27:25,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:27:27,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=190733.33333333334, ans=0.0 2023-09-29 00:27:29,356 INFO [train.py:1039] (3/4) Epoch 6, batch 2050, loss[loss=0.2154, simple_loss=0.2599, pruned_loss=0.08538, over 23541.00 frames. ], tot_loss[loss=0.2374, simple_loss=0.2987, pruned_loss=0.08799, over 4724000.71 frames. ], batch size: 285, lr: 1.72e-02, grad_scale: 16.0 2023-09-29 00:27:29,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:30,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:32,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:27:32,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:38,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:27:38,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=190733.33333333334, ans=0.2 2023-09-29 00:27:40,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:27:40,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:27:41,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:27:45,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 00:27:45,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:27:47,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:27:47,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:27:50,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=190800.0, ans=0.5 2023-09-29 00:27:56,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:27:56,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:27:59,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=190800.0, ans=0.125 2023-09-29 00:28:00,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 00:28:03,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:28:05,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 00:28:05,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:28:07,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:10,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:11,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.43 vs. limit=22.5 2023-09-29 00:28:11,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:28:12,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:28:13,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=190866.66666666666, ans=0.125 2023-09-29 00:28:14,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:28:16,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:28:16,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:28:19,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:21,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:28:24,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:28:25,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:28:29,256 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.164e+02 2.484e+02 2.839e+02 4.579e+02, threshold=4.968e+02, percent-clipped=0.0 2023-09-29 00:28:29,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:35,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:28:35,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 00:28:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:42,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:28:44,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:28:47,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 00:28:50,062 INFO [train.py:1039] (3/4) Epoch 6, batch 2100, loss[loss=0.2422, simple_loss=0.3117, pruned_loss=0.08633, over 24582.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2978, pruned_loss=0.08772, over 4731743.57 frames. ], batch size: 71, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:28:50,289 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 00:28:50,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:28:50,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:28:51,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:28:53,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:28:53,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 00:28:53,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 00:28:54,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=191066.66666666666, ans=0.125 2023-09-29 00:28:56,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:28:59,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:28:59,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:29:03,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:05,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:29:05,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 00:29:05,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:29:07,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 00:29:07,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 00:29:08,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:09,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:09,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 00:29:10,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 00:29:15,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 00:29:15,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:29:19,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:29:21,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:29:21,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=191200.0, ans=0.0 2023-09-29 00:29:22,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:29:24,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 00:29:24,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:24,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 00:29:27,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 00:29:29,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:29,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 00:29:29,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 00:29:29,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 00:29:32,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:29:34,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:29:37,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:37,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:29:40,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:42,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:42,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 00:29:42,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:29:42,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:29:43,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:29:43,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 00:29:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 00:29:45,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 00:29:48,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:29:52,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:29:52,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 00:29:53,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=191266.66666666666, ans=0.07 2023-09-29 00:29:59,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:02,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:30:02,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:02,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:02,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 00:30:04,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:06,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:06,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:30:07,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:30:07,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:09,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 00:30:10,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 00:30:10,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:11,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=191400.0, ans=0.0 2023-09-29 00:30:12,258 INFO [train.py:1039] (3/4) Epoch 6, batch 2150, loss[loss=0.2368, simple_loss=0.2989, pruned_loss=0.08736, over 23150.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.298, pruned_loss=0.08806, over 4723006.43 frames. ], batch size: 93, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:30:14,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:30:14,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:30:14,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:30:14,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:30:21,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 00:30:24,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:24,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:25,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:30:25,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:25,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:30:27,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=191466.66666666666, ans=0.0 2023-09-29 00:30:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:30,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:30:30,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:30:32,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=191466.66666666666, ans=0.1 2023-09-29 00:30:34,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:34,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 00:30:39,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:40,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:30:41,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:30:42,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:30:42,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:30:42,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=191466.66666666666, ans=0.125 2023-09-29 00:30:44,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:30:44,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:30:45,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:30:47,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 00:30:47,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:30:49,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:49,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:51,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:30:52,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:30:53,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_abs, batch_count=191533.33333333334, ans=0.5 2023-09-29 00:30:53,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=23.33 vs. limit=22.5 2023-09-29 00:30:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:30:55,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:30:57,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:30:57,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 00:30:57,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 00:31:00,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:00,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:02,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:31:02,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:31:02,959 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.42 vs. limit=15.0 2023-09-29 00:31:03,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:05,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:05,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 00:31:07,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 00:31:07,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:31:07,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=191600.0, ans=0.1 2023-09-29 00:31:08,636 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 00:31:09,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:10,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:31:12,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 00:31:12,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:31:12,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 00:31:12,169 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 00:31:12,170 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 00:31:13,510 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 2.175e+02 2.371e+02 2.778e+02 4.132e+02, threshold=4.742e+02, percent-clipped=0.0 2023-09-29 00:31:13,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 00:31:13,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=191600.0, ans=0.125 2023-09-29 00:31:15,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:15,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:31:16,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:31:16,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:18,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:31:19,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:19,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:30,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:31:31,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 00:31:34,638 INFO [train.py:1039] (3/4) Epoch 6, batch 2200, loss[loss=0.2106, simple_loss=0.2853, pruned_loss=0.06788, over 24457.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.2979, pruned_loss=0.08829, over 4719314.94 frames. ], batch size: 66, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:31:34,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:31:37,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.52 vs. limit=10.0 2023-09-29 00:31:39,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:31:40,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:31:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:31:44,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:31:44,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=191733.33333333334, ans=0.1 2023-09-29 00:31:46,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:31:47,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:31:47,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 00:31:50,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=191800.0, ans=0.09899494936611666 2023-09-29 00:31:54,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 00:31:55,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:31:56,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.27 vs. limit=22.5 2023-09-29 00:32:01,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 00:32:05,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:07,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:32:11,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.73 vs. limit=15.0 2023-09-29 00:32:11,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:32:11,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 00:32:15,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=191866.66666666666, ans=0.0 2023-09-29 00:32:16,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:32:16,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:18,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 00:32:21,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:32:23,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:25,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:32:26,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:28,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 00:32:29,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 00:32:32,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:32,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:32:32,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:32:32,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=191933.33333333334, ans=0.1 2023-09-29 00:32:32,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.54 vs. limit=15.0 2023-09-29 00:32:36,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:32:37,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:32:37,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:37,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:32:39,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:32:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:32:42,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:32:45,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 00:32:45,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:32:47,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:32:48,694 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 00:32:49,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.95 vs. limit=22.5 2023-09-29 00:32:50,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:32:51,724 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 00:32:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:32:51,951 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 00:32:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:55,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:32:56,882 INFO [train.py:1039] (3/4) Epoch 6, batch 2250, loss[loss=0.2453, simple_loss=0.3056, pruned_loss=0.09248, over 23229.00 frames. ], tot_loss[loss=0.2368, simple_loss=0.298, pruned_loss=0.08778, over 4737297.88 frames. ], batch size: 93, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:32:59,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:32:59,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 00:33:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:33:04,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:11,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:33:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:33:14,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:15,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:17,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:33:17,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=192133.33333333334, ans=0.0 2023-09-29 00:33:20,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 00:33:20,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:20,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:33:23,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 00:33:23,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:33:24,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:26,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:33:31,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:33,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:33:33,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:33:35,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 00:33:36,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:33:40,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:33:44,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:45,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:33:47,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:33:47,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:33:48,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:33:49,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=192266.66666666666, ans=0.2 2023-09-29 00:33:50,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:33:54,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:33:56,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 00:33:57,692 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 2.089e+02 2.370e+02 2.766e+02 4.098e+02, threshold=4.740e+02, percent-clipped=0.0 2023-09-29 00:33:58,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=192266.66666666666, ans=0.125 2023-09-29 00:34:02,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:34:04,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:34:04,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:34:09,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:34:13,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:34:13,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 00:34:14,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:14,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:34:18,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 00:34:19,576 INFO [train.py:1039] (3/4) Epoch 6, batch 2300, loss[loss=0.1983, simple_loss=0.2683, pruned_loss=0.06416, over 24264.00 frames. ], tot_loss[loss=0.2396, simple_loss=0.3001, pruned_loss=0.08954, over 4721598.79 frames. ], batch size: 56, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:34:19,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:34:19,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:34:25,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:34:27,439 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 00:34:27,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=192400.0, ans=0.2 2023-09-29 00:34:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:38,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=15.49 vs. limit=15.0 2023-09-29 00:34:38,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:34:38,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:34:38,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:34:40,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:34:40,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 00:34:41,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:34:46,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:34:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:34:50,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:34:54,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:34:57,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:01,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:35:03,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:35:06,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:35:07,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:35:11,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:35:11,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:35:13,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:35:13,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 00:35:16,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:35:16,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:16,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:35:16,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:35:18,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:19,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:35:19,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:35:19,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 00:35:22,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:35:22,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:35:22,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 00:35:30,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:35:31,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:35:37,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:35:37,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:35:39,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:35:40,483 INFO [train.py:1039] (3/4) Epoch 6, batch 2350, loss[loss=0.2513, simple_loss=0.323, pruned_loss=0.08976, over 24359.00 frames. ], tot_loss[loss=0.2415, simple_loss=0.3018, pruned_loss=0.09058, over 4713347.23 frames. ], batch size: 77, lr: 1.71e-02, grad_scale: 16.0 2023-09-29 00:35:40,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:35:40,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:35:42,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:35:44,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 00:35:48,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.88 vs. limit=22.5 2023-09-29 00:35:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:35:50,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 00:35:57,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 00:35:59,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:36:01,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:01,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:03,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:03,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 00:36:07,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:36:12,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 00:36:14,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:36:16,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:36:16,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:36:19,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:36:22,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 00:36:22,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:36:23,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:36:23,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:23,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=192866.66666666666, ans=0.015 2023-09-29 00:36:25,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:36:27,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=192866.66666666666, ans=0.0 2023-09-29 00:36:30,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:36:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 00:36:33,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:36:36,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:36:37,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:36:39,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 00:36:39,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:36:41,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 00:36:41,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:36:42,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 2.125e+02 2.359e+02 2.805e+02 3.859e+02, threshold=4.718e+02, percent-clipped=0.0 2023-09-29 00:36:42,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=192933.33333333334, ans=0.125 2023-09-29 00:36:46,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 00:36:49,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 00:36:50,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:36:50,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 00:36:50,840 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 00:36:52,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 00:36:52,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 00:36:56,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:37:01,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:37:03,293 INFO [train.py:1039] (3/4) Epoch 6, batch 2400, loss[loss=0.1935, simple_loss=0.2703, pruned_loss=0.0583, over 24353.00 frames. ], tot_loss[loss=0.2408, simple_loss=0.3009, pruned_loss=0.09036, over 4692609.23 frames. ], batch size: 61, lr: 1.71e-02, grad_scale: 32.0 2023-09-29 00:37:06,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:37:08,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:37:10,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 00:37:10,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 00:37:17,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:37:17,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:37:20,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 00:37:22,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:37:23,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:23,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 00:37:28,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.27 vs. limit=22.5 2023-09-29 00:37:30,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:32,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 00:37:37,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:37:40,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 00:37:43,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:37:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:37:49,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:37:50,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 00:37:50,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:37:59,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:02,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:05,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:05,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:38:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 00:38:05,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:38:05,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:06,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:07,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 00:38:08,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.44 vs. limit=15.0 2023-09-29 00:38:12,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:12,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:38:12,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 00:38:14,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 00:38:18,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:38:18,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:38:18,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 00:38:19,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 00:38:19,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 00:38:19,792 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 00:38:21,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 00:38:21,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:38:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:24,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:25,942 INFO [train.py:1039] (3/4) Epoch 6, batch 2450, loss[loss=0.2179, simple_loss=0.2937, pruned_loss=0.07102, over 24644.00 frames. ], tot_loss[loss=0.2391, simple_loss=0.2992, pruned_loss=0.08949, over 4694397.42 frames. ], batch size: 68, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:38:26,662 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 00:38:26,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:38:28,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:38:29,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=193400.0, ans=0.125 2023-09-29 00:38:32,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:38:32,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:38:35,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:37,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:38:37,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 00:38:39,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.40 vs. limit=15.0 2023-09-29 00:38:43,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:38:43,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:48,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:38:48,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:38:48,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:38:49,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 00:38:49,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=193466.66666666666, ans=0.125 2023-09-29 00:38:53,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:38:55,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:38:56,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:38:59,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:38:59,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:01,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:03,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:39:04,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 00:39:04,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:39:14,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:15,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:39:16,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:16,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:39:16,859 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:39:18,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:18,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:39:18,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 00:39:23,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:39:23,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:39:25,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=193600.0, ans=0.125 2023-09-29 00:39:26,706 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 2.234e+02 2.563e+02 3.066e+02 5.570e+02, threshold=5.125e+02, percent-clipped=5.0 2023-09-29 00:39:26,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:39:26,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:39:31,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:39:32,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 00:39:34,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:39:34,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:39:34,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 00:39:34,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:39:36,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:39:39,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:39:42,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:39:42,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:39:43,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=193666.66666666666, ans=0.0 2023-09-29 00:39:45,689 INFO [train.py:1039] (3/4) Epoch 6, batch 2500, loss[loss=0.2283, simple_loss=0.3039, pruned_loss=0.07633, over 24482.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.299, pruned_loss=0.08864, over 4716128.01 frames. ], batch size: 69, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:39:46,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 00:39:46,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=193733.33333333334, ans=0.2 2023-09-29 00:39:47,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 00:39:52,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.43 vs. limit=15.0 2023-09-29 00:39:54,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:04,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:40:04,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=193800.0, ans=0.1 2023-09-29 00:40:05,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:40:07,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:40:07,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 00:40:08,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=193800.0, ans=15.0 2023-09-29 00:40:12,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:40:14,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:15,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:40:15,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 00:40:15,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 00:40:17,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:18,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:18,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 00:40:20,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:20,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 00:40:20,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=193866.66666666666, ans=0.125 2023-09-29 00:40:21,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:25,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:40:26,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:40:30,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:40:30,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 00:40:32,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:40:33,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:37,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:40:45,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:40:46,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=193933.33333333334, ans=0.125 2023-09-29 00:40:49,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:40:52,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 00:40:52,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:40:52,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:40:55,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:40:55,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:40:56,693 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 00:40:56,694 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 00:40:56,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 00:40:59,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:40:59,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=194000.0, ans=0.2 2023-09-29 00:41:03,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 00:41:03,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 00:41:04,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:41:04,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 00:41:05,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=194000.0, ans=0.5 2023-09-29 00:41:06,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=194066.66666666666, ans=0.0 2023-09-29 00:41:07,750 INFO [train.py:1039] (3/4) Epoch 6, batch 2550, loss[loss=0.2103, simple_loss=0.2728, pruned_loss=0.07393, over 24303.00 frames. ], tot_loss[loss=0.2373, simple_loss=0.2985, pruned_loss=0.08805, over 4718189.39 frames. ], batch size: 56, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:41:09,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 00:41:11,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:13,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:41:13,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=194066.66666666666, ans=0.2 2023-09-29 00:41:14,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:41:16,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:41:17,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 00:41:18,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:41:21,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 00:41:22,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:41:25,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:27,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:41:27,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 00:41:27,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:41:27,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:29,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:41:29,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=194133.33333333334, ans=0.1 2023-09-29 00:41:30,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:41:30,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 00:41:32,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 00:41:32,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:32,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 00:41:44,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:41:51,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:41:51,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:41:51,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:41:53,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 00:42:00,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:42:02,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 00:42:02,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:42:02,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:42:04,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:42:04,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:42:09,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:09,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:10,624 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.774e+02 2.103e+02 2.352e+02 2.955e+02 4.902e+02, threshold=4.704e+02, percent-clipped=0.0 2023-09-29 00:42:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:42:14,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 00:42:14,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:42:14,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=194333.33333333334, ans=0.015 2023-09-29 00:42:15,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:42:16,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=194333.33333333334, ans=0.125 2023-09-29 00:42:17,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 00:42:18,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:42:18,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:26,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:42:27,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:42:30,611 INFO [train.py:1039] (3/4) Epoch 6, batch 2600, loss[loss=0.2061, simple_loss=0.2714, pruned_loss=0.07038, over 24304.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2986, pruned_loss=0.08776, over 4729251.44 frames. ], batch size: 56, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:42:31,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=194400.0, ans=0.2 2023-09-29 00:42:32,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 00:42:35,220 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 00:42:35,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:42:35,322 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 00:42:37,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 00:42:37,425 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 00:42:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:42:39,272 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 00:42:41,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 00:42:41,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=194400.0, ans=0.2 2023-09-29 00:42:42,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 00:42:45,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:42:47,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 00:42:47,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 00:42:48,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:42:48,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 00:42:52,039 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 00:42:53,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 00:42:54,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.66 vs. limit=15.0 2023-09-29 00:43:03,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:03,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:05,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:05,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 00:43:05,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=194533.33333333334, ans=0.025 2023-09-29 00:43:06,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:43:12,016 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 00:43:18,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:20,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:20,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 00:43:20,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:20,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:43:21,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 00:43:25,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:43:25,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:43:27,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:31,071 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 00:43:32,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:43:32,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:43:40,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:43:41,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:43:41,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 00:43:41,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:43:44,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:43:44,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:43:52,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 00:43:52,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:53,558 INFO [train.py:1039] (3/4) Epoch 6, batch 2650, loss[loss=0.1998, simple_loss=0.2754, pruned_loss=0.06212, over 24490.00 frames. ], tot_loss[loss=0.2366, simple_loss=0.2983, pruned_loss=0.08746, over 4733737.53 frames. ], batch size: 63, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:43:53,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:43:58,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 00:43:58,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:43:59,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:44:01,203 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 00:44:01,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:04,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:08,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:44:10,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:44:12,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:44:14,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 00:44:14,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:44:15,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:44:15,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=194800.0, ans=0.1 2023-09-29 00:44:17,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 00:44:18,647 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 00:44:19,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=194800.0, ans=0.0 2023-09-29 00:44:21,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:21,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 00:44:23,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:23,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 00:44:27,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:44:29,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:29,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:34,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 00:44:34,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 00:44:36,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:44:39,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 00:44:39,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:44:41,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:44:41,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:44:43,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:43,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:44:46,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:44:49,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:49,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:44:49,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:44:51,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:44:52,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:54,156 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.176e+02 2.610e+02 3.276e+02 6.463e+02, threshold=5.220e+02, percent-clipped=8.0 2023-09-29 00:44:54,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:44:54,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:44:56,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:44:56,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:44:56,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=195000.0, ans=0.125 2023-09-29 00:45:03,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:03,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:45:03,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:03,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 00:45:05,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=195000.0, ans=0.0 2023-09-29 00:45:06,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:08,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:09,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:11,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:12,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 00:45:12,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:14,227 INFO [train.py:1039] (3/4) Epoch 6, batch 2700, loss[loss=0.2055, simple_loss=0.2683, pruned_loss=0.07139, over 24310.00 frames. ], tot_loss[loss=0.2381, simple_loss=0.2995, pruned_loss=0.08838, over 4732996.69 frames. ], batch size: 56, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:45:15,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:15,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 00:45:19,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:45:21,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 00:45:21,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=195066.66666666666, ans=0.1 2023-09-29 00:45:22,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:45:22,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:22,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:45:24,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:45:24,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:45:24,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:45:24,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 00:45:24,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 00:45:24,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:45:26,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:45:28,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:45:28,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=195066.66666666666, ans=0.125 2023-09-29 00:45:29,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:45:33,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:45:34,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 00:45:36,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:45:36,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.11 vs. limit=15.0 2023-09-29 00:45:42,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:45:42,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:45:48,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:45:48,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:45:48,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:45:48,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:45:53,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:45:57,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:45:57,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:45:57,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:45:59,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=195200.0, ans=0.1 2023-09-29 00:46:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:01,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:46:10,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:46:12,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:12,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=195266.66666666666, ans=0.125 2023-09-29 00:46:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:46:15,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:18,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:18,653 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:46:19,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:19,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:46:21,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:22,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:46:23,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:26,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:46:26,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:26,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:46:29,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 00:46:29,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:33,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:46:33,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 00:46:36,437 INFO [train.py:1039] (3/4) Epoch 6, batch 2750, loss[loss=0.1979, simple_loss=0.2623, pruned_loss=0.06676, over 24263.00 frames. ], tot_loss[loss=0.2371, simple_loss=0.2989, pruned_loss=0.08762, over 4728418.57 frames. ], batch size: 56, lr: 1.70e-02, grad_scale: 16.0 2023-09-29 00:46:36,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 00:46:36,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:40,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:46:40,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:41,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:41,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 00:46:42,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:45,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:46:45,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 00:46:46,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:46:46,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:46,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 00:46:46,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:46:46,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:46:54,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 00:46:55,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:46:55,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:46:55,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:46:57,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:46:57,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=195466.66666666666, ans=0.0 2023-09-29 00:46:59,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:46:59,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:47:01,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:01,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:05,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 00:47:05,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:47:07,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 00:47:07,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:10,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:47:14,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=195533.33333333334, ans=0.125 2023-09-29 00:47:18,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:47:20,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 00:47:20,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:26,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:47:26,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:47:26,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:47:33,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:47:33,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:47:33,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 00:47:38,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 2.212e+02 2.511e+02 3.083e+02 4.520e+02, threshold=5.022e+02, percent-clipped=0.0 2023-09-29 00:47:39,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:47:41,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 00:47:48,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 00:47:50,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:47:50,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 00:47:52,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:47:53,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:47:53,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 00:47:53,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:47:56,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:47:58,243 INFO [train.py:1039] (3/4) Epoch 6, batch 2800, loss[loss=0.2265, simple_loss=0.2862, pruned_loss=0.08335, over 24616.00 frames. ], tot_loss[loss=0.2354, simple_loss=0.2971, pruned_loss=0.08686, over 4730134.90 frames. ], batch size: 60, lr: 1.70e-02, grad_scale: 32.0 2023-09-29 00:47:58,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:47:58,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:47:59,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 00:47:59,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:47:59,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:03,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:03,145 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 00:48:03,146 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 00:48:07,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:09,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:48:09,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:48:14,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:48:15,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 00:48:19,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:48:20,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 00:48:21,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:21,640 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.28 vs. limit=6.0 2023-09-29 00:48:22,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:48:22,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:24,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:26,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:48:26,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:48:27,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:48:35,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:48:37,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:48:38,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:40,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:48:42,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:48:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:48:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 00:48:49,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:49,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:48:49,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:48:56,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:48:56,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=195933.33333333334, ans=0.125 2023-09-29 00:48:57,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:48:59,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:49:01,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:49:02,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:02,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 00:49:03,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 00:49:03,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=196000.0, ans=0.125 2023-09-29 00:49:04,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 00:49:06,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:49:06,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 00:49:06,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:06,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=196000.0, ans=0.0 2023-09-29 00:49:07,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:49:07,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:09,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 00:49:10,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:10,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:49:11,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=196000.0, ans=0.125 2023-09-29 00:49:12,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:49:13,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 00:49:19,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:49:19,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:49:19,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:49:21,350 INFO [train.py:1039] (3/4) Epoch 6, batch 2850, loss[loss=0.2418, simple_loss=0.28, pruned_loss=0.1018, over 23382.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2959, pruned_loss=0.08602, over 4727460.93 frames. ], batch size: 285, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:49:23,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:26,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:49:26,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:49:26,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:49:29,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:29,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:49:32,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:49:33,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 00:49:39,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 00:49:39,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:49:41,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 00:49:41,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:44,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 00:49:44,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 00:49:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:49:47,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=196133.33333333334, ans=0.125 2023-09-29 00:49:59,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:49:59,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:01,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 00:50:01,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 00:50:01,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 00:50:01,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 00:50:02,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:50:02,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 00:50:04,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=196200.0, ans=0.125 2023-09-29 00:50:07,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 00:50:07,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:07,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:07,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=196200.0, ans=0.04949747468305833 2023-09-29 00:50:09,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:12,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:12,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:15,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:50:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:50:17,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:18,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:19,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=196266.66666666666, ans=0.07 2023-09-29 00:50:21,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:50:24,238 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.740e+02 2.055e+02 2.329e+02 2.690e+02 4.548e+02, threshold=4.658e+02, percent-clipped=0.0 2023-09-29 00:50:27,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:50:31,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 00:50:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 00:50:32,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 00:50:32,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:32,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 00:50:34,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:50:34,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:50:34,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:34,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 00:50:34,541 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 00:50:34,635 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 00:50:34,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:36,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:50:42,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:50:42,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:50:42,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:50:43,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=196400.0, ans=0.125 2023-09-29 00:50:44,192 INFO [train.py:1039] (3/4) Epoch 6, batch 2900, loss[loss=0.2234, simple_loss=0.3029, pruned_loss=0.07197, over 24553.00 frames. ], tot_loss[loss=0.2342, simple_loss=0.2957, pruned_loss=0.08629, over 4724527.14 frames. ], batch size: 71, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:50:44,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 00:50:47,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:50:47,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 00:50:47,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 00:50:50,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 00:50:50,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:50:51,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=196400.0, ans=0.0 2023-09-29 00:50:51,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=196400.0, ans=0.0 2023-09-29 00:50:52,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:50:55,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:50:59,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 00:50:59,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:51:02,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 00:51:02,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 00:51:04,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:51:06,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:09,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 00:51:10,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 00:51:14,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:51:14,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 00:51:14,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:51:17,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:51:17,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 00:51:20,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:51:20,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:22,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=196533.33333333334, ans=10.0 2023-09-29 00:51:23,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:51:24,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=196533.33333333334, ans=0.125 2023-09-29 00:51:25,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:27,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 00:51:27,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 00:51:27,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:51:28,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=196533.33333333334, ans=0.02 2023-09-29 00:51:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:51:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 00:51:36,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 00:51:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:51:45,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=196600.0, ans=0.125 2023-09-29 00:51:52,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 00:51:52,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 00:51:54,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 00:51:57,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:51:57,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 00:51:58,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:00,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:52:05,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:52:07,292 INFO [train.py:1039] (3/4) Epoch 6, batch 2950, loss[loss=0.2046, simple_loss=0.2726, pruned_loss=0.06825, over 16015.00 frames. ], tot_loss[loss=0.234, simple_loss=0.296, pruned_loss=0.08604, over 4712748.97 frames. ], batch size: 34, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:52:07,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 00:52:07,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:07,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:11,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:12,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:52:14,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 00:52:14,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 00:52:14,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 00:52:14,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:52:21,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:23,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:24,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:52:24,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:28,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:52:29,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:52:30,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:52:32,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:52:34,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 00:52:38,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=196800.0, ans=0.0 2023-09-29 00:52:40,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 00:52:42,919 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 00:52:43,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:52:45,978 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 00:52:46,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 00:52:46,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:52:47,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 00:52:47,568 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 00:52:47,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 00:52:49,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 00:52:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:52:51,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 00:52:54,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:56,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 00:52:56,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:52:58,055 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 00:52:58,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:52:59,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 00:53:04,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=196933.33333333334, ans=0.125 2023-09-29 00:53:05,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:07,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 00:53:07,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:53:10,069 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.711e+02 2.213e+02 2.464e+02 2.740e+02 4.622e+02, threshold=4.928e+02, percent-clipped=0.0 2023-09-29 00:53:10,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 00:53:11,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:15,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:53:15,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:53:15,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:53:15,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 00:53:17,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:53:19,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:19,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:53:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 00:53:20,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:53:20,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:53:22,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:22,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 00:53:22,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=197000.0, ans=0.0 2023-09-29 00:53:24,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:53:26,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:53:27,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 00:53:29,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197066.66666666666, ans=0.1 2023-09-29 00:53:30,687 INFO [train.py:1039] (3/4) Epoch 6, batch 3000, loss[loss=0.2593, simple_loss=0.3094, pruned_loss=0.1046, over 22872.00 frames. ], tot_loss[loss=0.2358, simple_loss=0.2976, pruned_loss=0.08704, over 4700177.06 frames. ], batch size: 322, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:53:30,688 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 00:53:44,105 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([2.8337, 1.6505, 2.9082, 2.4277], device='cuda:3') 2023-09-29 00:53:45,524 INFO [train.py:1071] (3/4) Epoch 6, validation: loss=0.3825, simple_loss=0.3275, pruned_loss=0.2187, over 1125622.00 frames. 2023-09-29 00:53:45,525 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 00:53:47,209 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 00:53:47,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 00:53:50,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:53:50,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:53:51,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 00:53:51,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:53:57,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=197066.66666666666, ans=0.2 2023-09-29 00:54:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 00:54:08,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:54:10,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=197133.33333333334, ans=0.125 2023-09-29 00:54:14,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 00:54:17,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:54:19,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:54:19,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:54:21,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:54:21,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=197200.0, ans=0.125 2023-09-29 00:54:23,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:23,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 00:54:25,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.76 vs. limit=15.0 2023-09-29 00:54:26,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 00:54:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:54:28,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 00:54:30,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:54:30,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:32,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:54:37,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 00:54:37,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:54:37,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 00:54:39,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:54:42,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 00:54:42,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 00:54:42,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:54:44,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 00:54:48,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:48,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:54:50,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 00:54:50,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 00:54:50,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:54:50,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 00:54:50,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 00:54:52,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=197333.33333333334, ans=0.0 2023-09-29 00:54:53,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 00:54:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:54:57,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 00:54:57,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 00:54:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 00:54:59,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 00:55:00,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:55:02,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:55:02,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 00:55:02,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:03,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.02 vs. limit=22.5 2023-09-29 00:55:03,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:55:07,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 00:55:09,414 INFO [train.py:1039] (3/4) Epoch 6, batch 3050, loss[loss=0.2423, simple_loss=0.3138, pruned_loss=0.08537, over 24493.00 frames. ], tot_loss[loss=0.2362, simple_loss=0.2979, pruned_loss=0.08728, over 4695892.23 frames. ], batch size: 66, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:55:09,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:12,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:12,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 00:55:12,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=197400.0, ans=0.1 2023-09-29 00:55:17,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:20,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 00:55:26,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 00:55:28,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 00:55:28,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:31,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 00:55:34,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:34,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:36,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:40,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:55:41,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 00:55:41,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:41,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:55:41,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:55:43,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:47,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:55:49,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:55:49,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 00:55:50,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:55:50,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 00:55:53,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:55:55,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 00:55:56,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:55:56,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:02,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:56:04,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:10,217 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 2.122e+02 2.325e+02 2.738e+02 3.532e+02, threshold=4.649e+02, percent-clipped=0.0 2023-09-29 00:56:10,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:10,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:56:10,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:56:12,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:12,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 00:56:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 00:56:14,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 00:56:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 00:56:17,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:19,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 00:56:23,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:29,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:56:31,043 INFO [train.py:1039] (3/4) Epoch 6, batch 3100, loss[loss=0.2172, simple_loss=0.2849, pruned_loss=0.07474, over 24675.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2966, pruned_loss=0.08618, over 4695318.66 frames. ], batch size: 65, lr: 1.69e-02, grad_scale: 32.0 2023-09-29 00:56:31,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 00:56:34,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 00:56:35,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 00:56:37,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 00:56:40,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 00:56:43,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 00:56:44,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:56:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:48,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 00:56:52,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:56:54,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=197800.0, ans=0.1 2023-09-29 00:57:00,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 00:57:04,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:57:04,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:07,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:07,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 00:57:10,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:57:10,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 00:57:10,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 00:57:11,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:13,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 00:57:13,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:57:17,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 00:57:17,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 00:57:19,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=197933.33333333334, ans=0.0 2023-09-29 00:57:20,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 00:57:20,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:21,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:57:23,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.65 vs. limit=12.0 2023-09-29 00:57:25,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:25,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:25,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:57:26,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:57:26,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 00:57:29,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:57:29,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:57:29,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:29,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 00:57:34,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:57:36,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 00:57:39,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 00:57:39,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 00:57:39,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:57:41,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:41,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 00:57:49,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 00:57:51,982 INFO [train.py:1039] (3/4) Epoch 6, batch 3150, loss[loss=0.253, simple_loss=0.3257, pruned_loss=0.09015, over 24665.00 frames. ], tot_loss[loss=0.2344, simple_loss=0.2956, pruned_loss=0.08657, over 4688676.76 frames. ], batch size: 73, lr: 1.69e-02, grad_scale: 16.0 2023-09-29 00:57:53,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:57:54,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:57:55,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:57:55,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 00:57:56,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 00:57:59,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:00,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 00:58:00,972 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 00:58:02,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 00:58:02,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:04,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.43 vs. limit=15.0 2023-09-29 00:58:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 00:58:09,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 00:58:09,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:58:10,999 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 00:58:11,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 00:58:12,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 00:58:14,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 00:58:14,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 00:58:14,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:14,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:15,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 00:58:17,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 00:58:20,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:58:20,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:20,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=198133.33333333334, ans=0.1 2023-09-29 00:58:23,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 00:58:26,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 00:58:28,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 00:58:31,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 00:58:31,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 00:58:33,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 00:58:35,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 00:58:36,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 00:58:36,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 00:58:36,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 00:58:36,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:58:36,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 00:58:39,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 00:58:39,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 00:58:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 00:58:41,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=198266.66666666666, ans=0.125 2023-09-29 00:58:41,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.27 vs. limit=22.5 2023-09-29 00:58:42,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 00:58:42,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:43,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 00:58:43,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 00:58:45,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 00:58:46,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:47,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=198266.66666666666, ans=0.125 2023-09-29 00:58:48,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 00:58:48,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:58:48,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 00:58:49,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 00:58:52,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 00:58:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:58:54,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 00:58:55,614 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 2.072e+02 2.389e+02 2.889e+02 3.902e+02, threshold=4.779e+02, percent-clipped=0.0 2023-09-29 00:58:55,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 00:58:55,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 00:59:00,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 00:59:02,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:02,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 00:59:07,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 00:59:07,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:12,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 00:59:14,227 INFO [train.py:1039] (3/4) Epoch 6, batch 3200, loss[loss=0.2311, simple_loss=0.2912, pruned_loss=0.08555, over 23374.00 frames. ], tot_loss[loss=0.2333, simple_loss=0.2945, pruned_loss=0.08605, over 4688871.66 frames. ], batch size: 119, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 00:59:18,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 00:59:18,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 00:59:21,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:23,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 00:59:23,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 00:59:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 00:59:29,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 00:59:29,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=198466.66666666666, ans=0.09899494936611666 2023-09-29 00:59:35,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 00:59:44,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 00:59:46,787 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.68 vs. limit=10.0 2023-09-29 00:59:55,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 00:59:56,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:00:00,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 01:00:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:00:01,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=198600.0, ans=0.04949747468305833 2023-09-29 01:00:04,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:00:04,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:00:05,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:00:09,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 01:00:11,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 01:00:11,634 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:00:12,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 01:00:16,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=198600.0, ans=0.125 2023-09-29 01:00:16,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=198600.0, ans=0.125 2023-09-29 01:00:17,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 01:00:19,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:00:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:00:26,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:00:26,773 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 01:00:26,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:00:31,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:00:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 01:00:34,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 01:00:36,063 INFO [train.py:1039] (3/4) Epoch 6, batch 3250, loss[loss=0.2162, simple_loss=0.2985, pruned_loss=0.06694, over 24450.00 frames. ], tot_loss[loss=0.2328, simple_loss=0.2942, pruned_loss=0.08572, over 4698381.75 frames. ], batch size: 69, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:00:36,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 01:00:37,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 01:00:39,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:00:41,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=198733.33333333334, ans=0.1 2023-09-29 01:00:42,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:00:42,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 01:00:42,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:00:42,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:00:42,714 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 01:00:48,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:00:48,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=198733.33333333334, ans=0.125 2023-09-29 01:00:51,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:00,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:00,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 01:01:02,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:02,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:02,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:05,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:05,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:01:08,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:08,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:01:08,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:09,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:09,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:10,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:12,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=198866.66666666666, ans=0.2 2023-09-29 01:01:13,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:16,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:01:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:17,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:01:19,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:01:19,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:01:20,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:23,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 01:01:24,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:01:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:01:26,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:27,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:01:33,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:01:33,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=198933.33333333334, ans=0.125 2023-09-29 01:01:38,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=198933.33333333334, ans=0.125 2023-09-29 01:01:41,414 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.784e+02 2.144e+02 2.444e+02 2.943e+02 3.918e+02, threshold=4.889e+02, percent-clipped=0.0 2023-09-29 01:01:41,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:01:42,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-09-29 01:01:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:43,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 01:01:43,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:01:43,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:01:44,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:01:47,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 01:01:47,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 01:01:47,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:01:49,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:01:50,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:52,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:01:52,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:01:54,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=199000.0, ans=0.125 2023-09-29 01:01:55,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:01:56,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:01:57,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 01:01:57,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:01:59,091 INFO [train.py:1039] (3/4) Epoch 6, batch 3300, loss[loss=0.3051, simple_loss=0.3345, pruned_loss=0.1379, over 18947.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2952, pruned_loss=0.08627, over 4686490.72 frames. ], batch size: 388, lr: 1.68e-02, grad_scale: 32.0 2023-09-29 01:02:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:02:00,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 01:02:02,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:02:04,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 01:02:07,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 01:02:07,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 01:02:07,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:11,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=199066.66666666666, ans=0.2 2023-09-29 01:02:12,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:02:14,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:02:14,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:17,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:02:17,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:02:19,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:20,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:02:25,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 01:02:25,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:25,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:26,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=199133.33333333334, ans=0.125 2023-09-29 01:02:27,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:28,749 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 01:02:30,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:02:31,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:02:31,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:02:31,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:02:31,950 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 01:02:36,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:36,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:02:38,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:40,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 01:02:40,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=199200.0, ans=0.1 2023-09-29 01:02:41,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:02:41,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:02:41,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:02:43,774 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 01:02:47,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 01:02:47,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:02:50,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 01:02:51,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:02:55,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:02:56,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:02:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:02:58,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:02:58,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:02:58,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:03:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:03:02,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:03,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:03:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 01:03:06,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 01:03:06,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=199333.33333333334, ans=0.125 2023-09-29 01:03:09,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:03:09,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:09,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:12,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=199333.33333333334, ans=0.125 2023-09-29 01:03:12,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.79 vs. limit=15.0 2023-09-29 01:03:13,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:03:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:14,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:03:14,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:14,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:03:16,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:03:17,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:03:21,877 INFO [train.py:1039] (3/4) Epoch 6, batch 3350, loss[loss=0.2313, simple_loss=0.3062, pruned_loss=0.07824, over 24359.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.2965, pruned_loss=0.08636, over 4697796.25 frames. ], batch size: 77, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:03:21,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 01:03:22,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:23,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:25,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:03:25,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:03:28,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:29,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:03:29,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:32,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:03:34,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:34,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:03:37,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:40,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.66 vs. limit=15.0 2023-09-29 01:03:40,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:03:40,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:42,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:03:43,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 01:03:45,995 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 01:03:46,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:03:49,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 01:03:49,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 01:03:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:03:50,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:03:52,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:03:52,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 01:03:52,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:52,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:03:55,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:03:57,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:03:57,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:00,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:04:03,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:05,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:06,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:09,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:04:09,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=199600.0, ans=0.125 2023-09-29 01:04:11,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:04:12,079 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:04:13,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:13,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:16,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:18,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 01:04:18,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:04:18,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 01:04:19,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:04:19,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 01:04:21,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:23,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:04:25,668 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.082e+02 2.289e+02 2.624e+02 4.671e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 01:04:30,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.70 vs. limit=22.5 2023-09-29 01:04:31,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:32,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 01:04:33,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:04:34,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:04:34,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:04:38,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=199666.66666666666, ans=0.2 2023-09-29 01:04:39,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:04:42,774 INFO [train.py:1039] (3/4) Epoch 6, batch 3400, loss[loss=0.3205, simple_loss=0.3494, pruned_loss=0.1459, over 19301.00 frames. ], tot_loss[loss=0.2365, simple_loss=0.2983, pruned_loss=0.08737, over 4701806.68 frames. ], batch size: 388, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:04:42,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 01:04:42,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:04:43,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:04:44,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=199733.33333333334, ans=0.2 2023-09-29 01:04:45,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:04:46,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 01:04:47,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:04:47,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 01:04:48,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:48,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:04:49,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:04:51,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:04:51,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 01:04:52,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=199733.33333333334, ans=0.125 2023-09-29 01:04:55,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 01:04:55,744 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 01:04:55,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:00,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:05:00,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:05:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:02,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:05:07,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:09,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 01:05:13,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:05:16,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:17,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:17,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:05:20,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=199866.66666666666, ans=0.125 2023-09-29 01:05:24,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:05:29,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 01:05:35,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:05:37,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 01:05:37,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:05:39,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:05:40,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:05:41,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:05:44,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:05:48,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:05:48,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:05:50,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=200000.0, ans=0.125 2023-09-29 01:05:56,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:05:58,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 01:06:02,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:06:02,735 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.64 vs. limit=12.0 2023-09-29 01:06:06,476 INFO [train.py:1039] (3/4) Epoch 6, batch 3450, loss[loss=0.1877, simple_loss=0.2622, pruned_loss=0.05655, over 24452.00 frames. ], tot_loss[loss=0.2375, simple_loss=0.2986, pruned_loss=0.08816, over 4690186.44 frames. ], batch size: 63, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:06:06,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 01:06:09,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 01:06:11,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:13,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:06:13,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 01:06:14,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:06:17,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:06:23,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:06:23,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.52 vs. limit=22.5 2023-09-29 01:06:24,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:26,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:06:26,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:28,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:06:33,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.34 vs. limit=15.0 2023-09-29 01:06:34,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 01:06:40,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 01:06:40,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:06:40,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:06:42,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:06:48,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 01:06:50,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:06:54,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:06:54,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:06:56,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:06:58,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:06:59,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 01:06:59,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:03,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:07:04,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:06,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 01:07:11,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:07:12,880 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.101e+02 2.630e+02 3.255e+02 5.395e+02, threshold=5.260e+02, percent-clipped=4.0 2023-09-29 01:07:13,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=200333.33333333334, ans=0.125 2023-09-29 01:07:14,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:07:16,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:19,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:23,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:23,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:07:25,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:07:25,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:07:30,110 INFO [train.py:1039] (3/4) Epoch 6, batch 3500, loss[loss=0.2506, simple_loss=0.3011, pruned_loss=0.1001, over 23845.00 frames. ], tot_loss[loss=0.2359, simple_loss=0.2966, pruned_loss=0.08755, over 4686101.92 frames. ], batch size: 164, lr: 1.68e-02, grad_scale: 16.0 2023-09-29 01:07:30,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:32,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=200400.0, ans=0.125 2023-09-29 01:07:35,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:07:35,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 01:07:35,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=200400.0, ans=0.125 2023-09-29 01:07:36,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:07:38,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=200400.0, ans=0.125 2023-09-29 01:07:39,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:07:41,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:07:41,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 01:07:44,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:07:47,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:07:49,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:07:49,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:07:49,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:07:50,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:50,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:07:50,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 01:07:51,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=200466.66666666666, ans=0.1 2023-09-29 01:07:55,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:07:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:07:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:01,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:03,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 01:08:03,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:08:06,747 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.63 vs. limit=15.0 2023-09-29 01:08:07,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:08:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:08:08,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:10,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:08:10,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:12,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 01:08:12,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 01:08:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 01:08:13,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:08:15,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:16,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:16,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:08:19,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:08:21,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:08:21,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=200600.0, ans=0.2 2023-09-29 01:08:23,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=200600.0, ans=0.125 2023-09-29 01:08:27,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:08:27,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=200600.0, ans=0.2 2023-09-29 01:08:28,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 01:08:28,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 01:08:28,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:08:31,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:33,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:34,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=200600.0, ans=0.125 2023-09-29 01:08:35,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:39,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 01:08:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:08:40,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=200666.66666666666, ans=0.125 2023-09-29 01:08:42,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:08:43,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 01:08:45,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 01:08:46,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=15.0 2023-09-29 01:08:47,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:08:48,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:08:48,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:08:48,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:08:53,074 INFO [train.py:1039] (3/4) Epoch 6, batch 3550, loss[loss=0.2311, simple_loss=0.29, pruned_loss=0.08609, over 23202.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2943, pruned_loss=0.0867, over 4676945.89 frames. ], batch size: 105, lr: 1.68e-02, grad_scale: 8.0 2023-09-29 01:08:53,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:08:53,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=200733.33333333334, ans=0.125 2023-09-29 01:09:02,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:05,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 01:09:07,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:09,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:09:13,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:14,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:09:14,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:09:16,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:16,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:09:17,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:17,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:09:19,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:09:24,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:09:24,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:09:26,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:27,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:09:28,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:09:28,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 01:09:28,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:30,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:09:31,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 01:09:36,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:37,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:09:39,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:09:40,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 01:09:42,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:09:42,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 01:09:44,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:09:45,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:09:46,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:09:50,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 01:09:52,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:09:59,417 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.123e+02 2.427e+02 3.037e+02 5.186e+02, threshold=4.854e+02, percent-clipped=0.0 2023-09-29 01:09:59,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:01,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 01:10:01,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:04,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:10:04,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 01:10:10,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=201000.0, ans=0.125 2023-09-29 01:10:11,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 01:10:12,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:10:12,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:10:14,579 INFO [train.py:1039] (3/4) Epoch 6, batch 3600, loss[loss=0.2191, simple_loss=0.2986, pruned_loss=0.06977, over 24621.00 frames. ], tot_loss[loss=0.234, simple_loss=0.2952, pruned_loss=0.08642, over 4694083.90 frames. ], batch size: 68, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:10:16,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:16,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:10:18,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:10:23,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.25 vs. limit=22.5 2023-09-29 01:10:23,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:25,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:25,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:10:25,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:10:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:26,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 01:10:31,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:10:32,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:37,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:37,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:39,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:10:40,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:10:40,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 01:10:42,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:10:45,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:10:47,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:10:48,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:10:51,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:10:51,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:10:53,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 01:11:02,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:02,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=201200.0, ans=0.0 2023-09-29 01:11:03,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:11:03,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 01:11:08,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:11:14,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:16,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:16,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=201266.66666666666, ans=0.125 2023-09-29 01:11:24,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:11:25,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:11:25,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 01:11:27,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 01:11:27,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 01:11:30,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:11:32,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:11:34,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 01:11:34,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:11:35,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:11:35,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:11:35,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 01:11:35,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 01:11:38,736 INFO [train.py:1039] (3/4) Epoch 6, batch 3650, loss[loss=0.2525, simple_loss=0.3041, pruned_loss=0.1005, over 23857.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2956, pruned_loss=0.08612, over 4686776.62 frames. ], batch size: 195, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:11:40,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:11:41,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 01:11:46,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 01:11:46,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:11:51,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 01:11:51,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 01:11:53,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=201466.66666666666, ans=0.125 2023-09-29 01:11:56,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:11:56,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:11:57,028 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=29.33 vs. limit=15.0 2023-09-29 01:11:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:11:59,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.85 vs. limit=22.5 2023-09-29 01:12:01,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 01:12:01,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:12:03,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 01:12:03,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:12:03,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:05,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 01:12:05,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:12:06,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.29 vs. limit=15.0 2023-09-29 01:12:07,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:07,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:09,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:12:12,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 01:12:12,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 01:12:12,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=201533.33333333334, ans=0.125 2023-09-29 01:12:14,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:12:15,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 01:12:17,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:17,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:12:20,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=201533.33333333334, ans=0.125 2023-09-29 01:12:21,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:12:25,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:25,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:12:26,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:12:27,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:12:27,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=201600.0, ans=0.125 2023-09-29 01:12:30,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:12:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:12:33,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:33,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:12:35,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:12:35,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=201600.0, ans=0.125 2023-09-29 01:12:36,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:12:37,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:39,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=201600.0, ans=0.035 2023-09-29 01:12:41,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=201600.0, ans=0.1 2023-09-29 01:12:44,478 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 01:12:44,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=201666.66666666666, ans=0.0 2023-09-29 01:12:47,281 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.133e+02 2.417e+02 2.802e+02 4.868e+02, threshold=4.835e+02, percent-clipped=1.0 2023-09-29 01:12:50,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:12:50,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:12:51,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:12:51,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:12:53,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:12:53,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=201666.66666666666, ans=0.125 2023-09-29 01:12:55,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 01:12:55,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:12:56,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=201666.66666666666, ans=0.125 2023-09-29 01:12:58,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:13:00,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:13:01,772 INFO [train.py:1039] (3/4) Epoch 6, batch 3700, loss[loss=0.2313, simple_loss=0.2891, pruned_loss=0.08675, over 23713.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.2971, pruned_loss=0.08655, over 4693847.68 frames. ], batch size: 212, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:13:01,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:13:03,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:03,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 01:13:03,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:13:05,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:13:06,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:13:10,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:13:13,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:13,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:15,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:13:17,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:13:17,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:13:17,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:18,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=201800.0, ans=0.2 2023-09-29 01:13:19,262 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 01:13:28,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:13:28,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:13:29,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:13:29,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 01:13:29,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:35,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:36,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 01:13:38,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:39,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:13:40,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.18 vs. limit=15.0 2023-09-29 01:13:42,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:13:42,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:13:44,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:13:45,386 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.35 vs. limit=22.5 2023-09-29 01:13:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:13:48,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 01:13:48,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:13:48,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 01:13:52,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=201933.33333333334, ans=0.2 2023-09-29 01:13:53,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:13:54,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:13:58,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=201933.33333333334, ans=0.0 2023-09-29 01:13:59,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:13:59,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 01:14:02,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:14:02,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:14:02,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:02,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:07,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:14:07,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 01:14:09,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 01:14:09,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:14:11,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:12,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:14:14,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:14:15,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:14:17,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:14:18,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:14:20,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 01:14:23,836 INFO [train.py:1039] (3/4) Epoch 6, batch 3750, loss[loss=0.2419, simple_loss=0.2901, pruned_loss=0.09684, over 23767.00 frames. ], tot_loss[loss=0.2364, simple_loss=0.2981, pruned_loss=0.08735, over 4699325.03 frames. ], batch size: 164, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:14:24,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:14:25,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:14:27,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 01:14:27,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:14:29,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:30,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:14:31,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:14:35,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:38,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:14:40,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:14:44,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:14:48,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:14:50,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 01:14:50,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:14:51,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:14:53,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:14:55,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 01:15:00,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 01:15:01,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:15:01,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:15:03,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:08,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:10,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:15:13,974 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.53 vs. limit=15.0 2023-09-29 01:15:14,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.93 vs. limit=15.0 2023-09-29 01:15:16,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 01:15:18,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:22,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:15:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:15:26,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:15:30,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 01:15:31,937 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.739e+02 2.301e+02 2.601e+02 3.130e+02 4.781e+02, threshold=5.202e+02, percent-clipped=0.0 2023-09-29 01:15:32,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:15:35,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:15:36,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:15:39,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:15:41,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=202333.33333333334, ans=0.125 2023-09-29 01:15:46,256 INFO [train.py:1039] (3/4) Epoch 6, batch 3800, loss[loss=0.2368, simple_loss=0.2804, pruned_loss=0.09662, over 23519.00 frames. ], tot_loss[loss=0.2355, simple_loss=0.2968, pruned_loss=0.08711, over 4704606.62 frames. ], batch size: 285, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:15:48,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:15:53,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:15:55,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 01:15:55,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 01:15:58,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:15:58,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:00,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:16:01,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:16:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:03,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:16:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:16:05,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:16:05,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:07,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 01:16:10,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 01:16:10,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:16:13,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:15,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:16:17,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:16:17,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=202466.66666666666, ans=0.125 2023-09-29 01:16:18,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.13 vs. limit=15.0 2023-09-29 01:16:19,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:16:19,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:19,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=202533.33333333334, ans=0.125 2023-09-29 01:16:20,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:22,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:16:24,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=202533.33333333334, ans=0.0 2023-09-29 01:16:28,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:16:28,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 01:16:30,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:38,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:16:42,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=12.0 2023-09-29 01:16:43,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:16:46,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 01:16:48,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 01:16:48,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:16:51,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:16:51,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:16:55,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 01:16:58,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 01:17:00,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 01:17:00,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:02,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:17:07,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:17:08,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:17:10,313 INFO [train.py:1039] (3/4) Epoch 6, batch 3850, loss[loss=0.2185, simple_loss=0.2644, pruned_loss=0.08636, over 23441.00 frames. ], tot_loss[loss=0.2346, simple_loss=0.2962, pruned_loss=0.08654, over 4707701.53 frames. ], batch size: 285, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:17:14,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:17:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 01:17:15,816 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:17:18,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:17:18,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:18,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=202733.33333333334, ans=0.0 2023-09-29 01:17:23,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:17:24,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:27,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:17:27,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 01:17:28,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=202800.0, ans=0.0 2023-09-29 01:17:35,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:37,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:17:40,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:40,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:17:43,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:43,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:17:44,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:17:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:17:46,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:49,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:17:51,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:51,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:17:51,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 01:17:51,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 01:17:51,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:17:51,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:55,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:17:55,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:17:57,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 01:17:58,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 01:18:00,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 01:18:05,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 01:18:06,573 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.27 vs. limit=10.0 2023-09-29 01:18:09,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:11,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:18:17,506 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.828e+02 2.249e+02 2.602e+02 3.151e+02 5.214e+02, threshold=5.203e+02, percent-clipped=1.0 2023-09-29 01:18:17,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:17,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 01:18:19,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 01:18:23,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:23,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:26,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:18:26,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:18:27,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:27,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:18:29,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 01:18:29,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:18:30,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 01:18:30,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:30,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:32,223 INFO [train.py:1039] (3/4) Epoch 6, batch 3900, loss[loss=0.2242, simple_loss=0.286, pruned_loss=0.08122, over 23629.00 frames. ], tot_loss[loss=0.2335, simple_loss=0.2956, pruned_loss=0.08571, over 4714478.83 frames. ], batch size: 149, lr: 1.67e-02, grad_scale: 16.0 2023-09-29 01:18:33,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.67 vs. limit=15.0 2023-09-29 01:18:33,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:18:33,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:36,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:18:36,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:18:36,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:18:39,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:18:39,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 01:18:39,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:44,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:44,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:44,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:18:46,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:18:49,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:18:49,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:51,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:18:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 01:18:52,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:18:54,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 01:18:54,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:18:55,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 01:18:56,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=203133.33333333334, ans=0.2 2023-09-29 01:18:58,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 01:18:58,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=203133.33333333334, ans=0.125 2023-09-29 01:19:02,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:02,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:19:04,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:19:04,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:08,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:19:12,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:19:13,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:19:13,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:13,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:19:21,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:19:21,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:19:30,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:19:32,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=203266.66666666666, ans=0.125 2023-09-29 01:19:33,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:19:35,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=203266.66666666666, ans=0.5 2023-09-29 01:19:41,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:19:44,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:44,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 01:19:46,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 01:19:46,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:19:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 01:19:50,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:19:50,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 01:19:55,242 INFO [train.py:1039] (3/4) Epoch 6, batch 3950, loss[loss=0.1966, simple_loss=0.2631, pruned_loss=0.06508, over 24302.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2951, pruned_loss=0.08587, over 4711936.61 frames. ], batch size: 56, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:19:58,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:20:01,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 01:20:01,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:20:05,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:20:07,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:20:07,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=203400.0, ans=0.09899494936611666 2023-09-29 01:20:13,294 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 01:20:13,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:14,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 01:20:14,856 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 01:20:14,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:20:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:17,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:20:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:20:21,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 01:20:21,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=203466.66666666666, ans=0.09899494936611666 2023-09-29 01:20:24,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:20:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:20:24,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:20:24,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:20:24,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=203466.66666666666, ans=0.125 2023-09-29 01:20:24,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=203466.66666666666, ans=0.125 2023-09-29 01:20:26,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:20:36,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:20:38,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:20:42,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 01:20:43,405 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:20:46,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=203600.0, ans=0.125 2023-09-29 01:20:47,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 01:20:47,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 01:20:47,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:20:49,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:20:59,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:20:59,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:21:00,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:00,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:21:00,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 01:21:02,045 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.135e+02 2.350e+02 2.654e+02 4.554e+02, threshold=4.701e+02, percent-clipped=0.0 2023-09-29 01:21:06,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:21:08,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:21:12,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 01:21:17,428 INFO [train.py:1039] (3/4) Epoch 6, batch 4000, loss[loss=0.2544, simple_loss=0.3167, pruned_loss=0.09602, over 23384.00 frames. ], tot_loss[loss=0.2338, simple_loss=0.2959, pruned_loss=0.08581, over 4709547.65 frames. ], batch size: 93, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:21:21,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=203733.33333333334, ans=0.0 2023-09-29 01:21:22,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:32,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:37,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:21:37,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:21:37,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 01:21:38,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:21:40,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 01:21:40,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:21:40,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 01:21:41,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:21:46,362 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.28 vs. limit=15.0 2023-09-29 01:21:47,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:21:47,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:21:47,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:21:47,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:21:47,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:21:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:21:52,331 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 01:21:53,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:21:53,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:21:56,959 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 01:21:57,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:21:57,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:04,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 01:22:06,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:22:07,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:22:09,357 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 01:22:10,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:22:12,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 01:22:12,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:22:13,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:13,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:22:15,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:22:15,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:22:15,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=203933.33333333334, ans=0.125 2023-09-29 01:22:16,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:22:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 01:22:19,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:22:22,327 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 01:22:27,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:22:30,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 01:22:32,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=204000.0, ans=0.1 2023-09-29 01:22:33,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.14 vs. limit=15.0 2023-09-29 01:22:33,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=204000.0, ans=0.95 2023-09-29 01:22:35,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:22:35,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:35,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:22:36,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:22:39,618 INFO [train.py:1039] (3/4) Epoch 6, batch 4050, loss[loss=0.2375, simple_loss=0.3094, pruned_loss=0.08277, over 24516.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2965, pruned_loss=0.08565, over 4713840.23 frames. ], batch size: 66, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:22:45,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:22:46,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:22:47,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 01:22:49,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:22:50,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:22:51,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:22:52,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:22:54,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:22:58,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:23:00,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:00,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 01:23:00,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=204133.33333333334, ans=0.125 2023-09-29 01:23:00,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=204133.33333333334, ans=0.125 2023-09-29 01:23:03,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:23:03,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:23:03,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=204133.33333333334, ans=0.1 2023-09-29 01:23:08,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:09,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:23:12,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 01:23:14,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 01:23:14,543 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 01:23:16,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:23:23,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 01:23:24,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:29,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:33,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:23:33,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:23:33,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:23:34,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:23:38,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 01:23:38,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:23:41,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:43,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 01:23:45,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-09-29 01:23:47,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.69 vs. limit=22.5 2023-09-29 01:23:48,266 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.126e+02 2.466e+02 2.913e+02 5.658e+02, threshold=4.933e+02, percent-clipped=1.0 2023-09-29 01:23:48,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:23:56,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 01:23:56,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:23:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:23:58,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=204333.33333333334, ans=0.035 2023-09-29 01:23:59,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 01:23:59,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 01:23:59,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:02,571 INFO [train.py:1039] (3/4) Epoch 6, batch 4100, loss[loss=0.2294, simple_loss=0.2869, pruned_loss=0.08595, over 24338.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.2972, pruned_loss=0.08633, over 4713430.60 frames. ], batch size: 56, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:24:02,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:04,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:04,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:24:12,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 01:24:14,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 01:24:16,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 01:24:17,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 01:24:17,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:17,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:17,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:18,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=204466.66666666666, ans=0.125 2023-09-29 01:24:19,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:24:19,539 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 01:24:23,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:24:24,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:24:24,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:24:29,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:24:31,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:24:31,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:24:31,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 01:24:31,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:32,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:24:32,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:32,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:24:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 01:24:34,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:24:35,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=204533.33333333334, ans=0.125 2023-09-29 01:24:36,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 01:24:38,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:24:41,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:24:41,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 01:24:44,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:24:44,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:24:44,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:24:46,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 01:24:47,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:24:48,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:24:49,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.37 vs. limit=15.0 2023-09-29 01:24:50,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 01:24:51,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:24:51,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:24:53,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:00,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:05,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:06,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:25:10,719 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:25:15,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:15,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:25:21,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:25:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:25:25,933 INFO [train.py:1039] (3/4) Epoch 6, batch 4150, loss[loss=0.2465, simple_loss=0.2898, pruned_loss=0.1015, over 23876.00 frames. ], tot_loss[loss=0.2352, simple_loss=0.2974, pruned_loss=0.08653, over 4717048.25 frames. ], batch size: 195, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:25:27,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:25:29,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:25:30,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:25:30,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:34,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 01:25:34,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:35,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 01:25:37,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 01:25:37,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 01:25:39,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:25:44,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:25:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:25:48,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:25:49,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:25:50,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:25:52,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:25:52,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:25:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:25:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:02,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:02,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 01:26:07,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 01:26:07,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:26:07,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 01:26:07,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:26:07,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:09,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:11,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:17,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 01:26:19,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=204933.33333333334, ans=0.125 2023-09-29 01:26:21,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:23,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:26:24,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 01:26:24,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:26:26,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 01:26:29,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:26:29,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:26:31,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:33,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 01:26:33,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:26:33,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 01:26:33,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:26:34,523 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.777e+02 2.193e+02 2.474e+02 2.867e+02 4.434e+02, threshold=4.949e+02, percent-clipped=0.0 2023-09-29 01:26:36,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 01:26:36,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:36,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:26:36,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:26:37,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 01:26:37,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:26:38,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 01:26:39,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:26:42,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:26:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 01:26:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 01:26:49,017 INFO [train.py:1039] (3/4) Epoch 6, batch 4200, loss[loss=0.2224, simple_loss=0.2968, pruned_loss=0.07396, over 24497.00 frames. ], tot_loss[loss=0.2353, simple_loss=0.297, pruned_loss=0.08677, over 4696742.66 frames. ], batch size: 66, lr: 1.66e-02, grad_scale: 32.0 2023-09-29 01:26:49,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:26:50,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 01:26:54,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:26:56,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:26:57,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:26:57,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:57,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:26:59,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 01:27:04,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 01:27:04,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:04,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=205133.33333333334, ans=0.125 2023-09-29 01:27:07,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:09,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:27:12,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:27:14,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:14,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:15,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 01:27:15,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:27:17,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:17,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:27:17,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:27:19,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:27:21,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 01:27:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:27:24,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=205200.0, ans=0.125 2023-09-29 01:27:24,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=205200.0, ans=0.125 2023-09-29 01:27:27,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:27:29,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:27:32,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:27:33,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:27:36,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:27:36,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 01:27:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:27:36,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:27:42,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:27:45,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:27:46,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.14 vs. limit=15.0 2023-09-29 01:27:52,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:27:55,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 01:27:56,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=205333.33333333334, ans=0.0 2023-09-29 01:27:57,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:02,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:28:04,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:05,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 01:28:11,991 INFO [train.py:1039] (3/4) Epoch 6, batch 4250, loss[loss=0.2187, simple_loss=0.2917, pruned_loss=0.07281, over 24650.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2944, pruned_loss=0.08621, over 4683210.50 frames. ], batch size: 68, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:28:12,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:28:16,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:28:16,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 01:28:18,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:20,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=205400.0, ans=0.035 2023-09-29 01:28:24,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:28:24,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 01:28:24,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:28:27,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:27,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=205466.66666666666, ans=0.0 2023-09-29 01:28:30,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:35,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:37,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:38,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:28:38,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:28:40,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:42,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:42,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:43,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:28:45,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:28:47,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 01:28:51,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 01:28:51,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:53,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:28:53,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:28:54,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:28:54,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:28:55,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:28:58,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:28:58,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:29:03,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:05,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:07,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 01:29:07,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:29:08,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 01:29:10,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:29:12,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:29:13,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:13,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:29:16,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 01:29:17,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:29:18,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:29:21,786 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.147e+02 2.416e+02 2.924e+02 5.280e+02, threshold=4.831e+02, percent-clipped=2.0 2023-09-29 01:29:22,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:29:25,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:26,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:29:27,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:29:29,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:30,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:29:33,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:29:33,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 01:29:35,412 INFO [train.py:1039] (3/4) Epoch 6, batch 4300, loss[loss=0.2035, simple_loss=0.2661, pruned_loss=0.07042, over 21109.00 frames. ], tot_loss[loss=0.2325, simple_loss=0.2936, pruned_loss=0.0857, over 4683564.57 frames. ], batch size: 46, lr: 1.66e-02, grad_scale: 16.0 2023-09-29 01:29:35,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:40,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=205733.33333333334, ans=0.125 2023-09-29 01:29:41,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:29:41,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:29:42,255 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-09-29 01:29:43,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=205733.33333333334, ans=0.2 2023-09-29 01:29:45,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:29:56,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:29:56,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 01:29:56,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:29:59,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:29:59,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:29:59,407 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 01:30:02,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:30:05,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.34 vs. limit=6.0 2023-09-29 01:30:06,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:09,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 01:30:09,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:30:09,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 01:30:12,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:30:12,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=205866.66666666666, ans=0.2 2023-09-29 01:30:14,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:30:17,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:30:17,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:30:17,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:30:19,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:19,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:30:21,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 01:30:21,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 01:30:24,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:30:28,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:30:28,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:28,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:30:28,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 01:30:28,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 01:30:28,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 01:30:30,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:30:31,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 01:30:31,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 01:30:35,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:35,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 01:30:37,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:30:39,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:39,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:30:41,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 01:30:43,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:30:43,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:44,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:30:44,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:44,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:30:46,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:30:46,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=206000.0, ans=0.0 2023-09-29 01:30:46,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=206000.0, ans=0.0 2023-09-29 01:30:49,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:30:50,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:30:52,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:30:56,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=206066.66666666666, ans=0.125 2023-09-29 01:30:57,152 INFO [train.py:1039] (3/4) Epoch 6, batch 4350, loss[loss=0.1996, simple_loss=0.2642, pruned_loss=0.0675, over 24324.00 frames. ], tot_loss[loss=0.2327, simple_loss=0.2945, pruned_loss=0.08542, over 4696423.30 frames. ], batch size: 56, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:30:58,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 01:30:58,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:31:04,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:06,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:09,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.06 vs. limit=12.0 2023-09-29 01:31:11,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:31:11,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:31:13,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=206133.33333333334, ans=0.09899494936611666 2023-09-29 01:31:16,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-29 01:31:17,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:31:20,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:31:23,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:31:23,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:31:27,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:31:30,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:31:32,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:31:37,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 01:31:37,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:31:39,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:40,069 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.05 vs. limit=15.0 2023-09-29 01:31:43,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:31:44,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 01:31:50,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:31:50,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:31:52,770 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.53 vs. limit=5.0 2023-09-29 01:31:54,681 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 01:31:56,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:31:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:31:57,777 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 01:31:59,193 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 01:31:59,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:00,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:02,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:32:03,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:04,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:32:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:06,017 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.725e+02 2.197e+02 2.443e+02 2.898e+02 4.711e+02, threshold=4.887e+02, percent-clipped=0.0 2023-09-29 01:32:07,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 01:32:07,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:07,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:07,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 01:32:09,440 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 01:32:09,447 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 01:32:09,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 01:32:12,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:32:12,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:32:13,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:15,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:32:15,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 01:32:18,820 INFO [train.py:1039] (3/4) Epoch 6, batch 4400, loss[loss=0.2165, simple_loss=0.2979, pruned_loss=0.06756, over 24292.00 frames. ], tot_loss[loss=0.2331, simple_loss=0.2957, pruned_loss=0.08521, over 4714767.04 frames. ], batch size: 74, lr: 1.65e-02, grad_scale: 32.0 2023-09-29 01:32:19,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 01:32:19,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:23,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:25,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:32:27,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 01:32:28,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 01:32:28,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 01:32:28,894 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 01:32:30,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:32:30,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:32:32,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 01:32:33,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:35,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:35,385 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 01:32:38,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:38,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 01:32:40,328 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 01:32:43,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 01:32:43,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 01:32:43,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 01:32:43,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:45,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:45,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:32:46,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:32:49,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 01:32:49,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 01:32:51,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:53,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:32:53,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:32:54,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:32:54,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:32:54,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 01:32:55,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=206533.33333333334, ans=0.04949747468305833 2023-09-29 01:32:57,925 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 01:33:00,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:02,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=206533.33333333334, ans=0.2 2023-09-29 01:33:06,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:33:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 01:33:14,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.62 vs. limit=8.0 2023-09-29 01:33:14,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:33:19,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:20,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:33:21,572 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.62 vs. limit=15.0 2023-09-29 01:33:22,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 01:33:22,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:33:22,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:33:22,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:33:22,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:33:29,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 01:33:33,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 01:33:34,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 01:33:34,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:33:34,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 01:33:36,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:33:40,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:33:41,385 INFO [train.py:1039] (3/4) Epoch 6, batch 4450, loss[loss=0.2285, simple_loss=0.3089, pruned_loss=0.07406, over 24552.00 frames. ], tot_loss[loss=0.2344, simple_loss=0.2968, pruned_loss=0.08599, over 4705713.12 frames. ], batch size: 71, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:33:41,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 01:33:44,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:33:48,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:33:49,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:33:54,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:33:54,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:33:55,233 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.04 vs. limit=15.0 2023-09-29 01:33:59,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:00,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:34:04,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:34:05,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:05,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 01:34:05,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:07,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:07,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:07,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:34:11,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:34:16,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:17,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:19,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:34:20,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:34:20,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:34:26,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 01:34:26,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 01:34:26,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 01:34:26,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:34:29,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:29,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 01:34:31,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=206933.33333333334, ans=0.125 2023-09-29 01:34:32,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:34:36,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:37,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 01:34:37,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:37,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:37,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:34:37,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:34:40,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:34:45,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 01:34:46,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 01:34:48,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:34:49,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:34:50,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:34:52,743 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 2.158e+02 2.416e+02 2.828e+02 3.801e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 01:34:52,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:34:52,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:34:54,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:34:58,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 01:35:01,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:35:04,021 INFO [train.py:1039] (3/4) Epoch 6, batch 4500, loss[loss=0.2339, simple_loss=0.2811, pruned_loss=0.09331, over 23396.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2969, pruned_loss=0.08609, over 4712255.78 frames. ], batch size: 285, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:35:05,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:07,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 01:35:07,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 01:35:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:16,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:35:16,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:35:16,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:35:18,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:35:18,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:18,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:35:32,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:35:34,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:35:36,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:35:36,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:35:37,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:35:43,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:35:44,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=207200.0, ans=0.125 2023-09-29 01:35:47,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:35:51,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:35:54,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:35:54,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 01:35:57,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:35:57,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:35:59,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:36:01,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:36:01,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 01:36:01,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:36:01,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:36:06,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:36:09,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:10,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=207333.33333333334, ans=0.125 2023-09-29 01:36:11,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:36:11,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:36:14,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 01:36:15,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 01:36:15,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 01:36:21,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 01:36:25,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 01:36:26,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:27,958 INFO [train.py:1039] (3/4) Epoch 6, batch 4550, loss[loss=0.2288, simple_loss=0.2846, pruned_loss=0.08648, over 23295.00 frames. ], tot_loss[loss=0.2329, simple_loss=0.2956, pruned_loss=0.08516, over 4710990.54 frames. ], batch size: 119, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:36:29,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.09 vs. limit=22.5 2023-09-29 01:36:29,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:29,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:36:34,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:40,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:36:42,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:36:45,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:36:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:36:45,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:36:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:36:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:36:51,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:36:55,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 01:36:55,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 01:36:57,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:36:57,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=207466.66666666666, ans=0.125 2023-09-29 01:36:59,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 01:37:02,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 01:37:02,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:05,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 01:37:08,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:37:10,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:12,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:37:14,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 01:37:17,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:19,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=207600.0, ans=0.125 2023-09-29 01:37:20,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:20,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:37:20,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=207600.0, ans=0.125 2023-09-29 01:37:21,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:22,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=207600.0, ans=0.0 2023-09-29 01:37:23,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 01:37:24,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 01:37:25,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:37:25,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 01:37:26,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 01:37:26,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:37:28,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:28,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:29,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=207600.0, ans=0.0 2023-09-29 01:37:30,983 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=15.0 2023-09-29 01:37:31,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:37:33,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:37:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 01:37:37,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:37:37,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:37:37,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 01:37:37,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:37:37,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 01:37:39,086 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 2.083e+02 2.307e+02 2.767e+02 3.692e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 01:37:39,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=207666.66666666666, ans=0.125 2023-09-29 01:37:42,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:37:42,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:37:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:37:44,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=207666.66666666666, ans=0.125 2023-09-29 01:37:46,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:37:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:37:47,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:37:50,415 INFO [train.py:1039] (3/4) Epoch 6, batch 4600, loss[loss=0.2188, simple_loss=0.2958, pruned_loss=0.0709, over 24470.00 frames. ], tot_loss[loss=0.2324, simple_loss=0.2957, pruned_loss=0.08461, over 4725647.73 frames. ], batch size: 69, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:37:50,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:37:52,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:37:53,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:37:55,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=207733.33333333334, ans=0.125 2023-09-29 01:37:56,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:37:56,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:37:58,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:37:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 01:38:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:38:05,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:38:05,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:05,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=207800.0, ans=0.0 2023-09-29 01:38:08,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:10,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.57 vs. limit=15.0 2023-09-29 01:38:16,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=207800.0, ans=0.125 2023-09-29 01:38:17,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 01:38:18,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:20,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:22,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:38:22,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:38:26,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=207866.66666666666, ans=0.125 2023-09-29 01:38:26,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=13.88 vs. limit=15.0 2023-09-29 01:38:28,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 01:38:28,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:38:28,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:38:34,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:34,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:38:36,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:38:43,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 01:38:44,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:38:48,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:50,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:38:53,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:53,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 01:38:53,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:38:54,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=207933.33333333334, ans=0.0 2023-09-29 01:38:55,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 01:38:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:55,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:38:58,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:38:58,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:00,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:00,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 01:39:00,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 01:39:00,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 01:39:00,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:02,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:03,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:05,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:39:11,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=208066.66666666666, ans=0.07 2023-09-29 01:39:13,058 INFO [train.py:1039] (3/4) Epoch 6, batch 4650, loss[loss=0.2306, simple_loss=0.294, pruned_loss=0.08361, over 22085.00 frames. ], tot_loss[loss=0.2323, simple_loss=0.2953, pruned_loss=0.08464, over 4722481.40 frames. ], batch size: 48, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:39:16,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:39:18,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:20,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:20,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:39:21,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:39:21,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:39:21,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:39:26,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 01:39:30,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=208133.33333333334, ans=0.0 2023-09-29 01:39:31,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:39:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 01:39:32,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:39:32,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 01:39:34,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:39:34,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 01:39:34,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 01:39:34,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:36,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:39:39,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:39:40,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:40,715 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 01:39:42,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:39:44,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 01:39:47,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:39:47,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:39:47,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 01:39:50,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:39:55,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:39:58,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:00,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=208200.0, ans=0.125 2023-09-29 01:40:03,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=208266.66666666666, ans=0.0 2023-09-29 01:40:04,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:06,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:08,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:40:08,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:40:11,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 01:40:12,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 01:40:12,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 01:40:12,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 01:40:15,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:23,226 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 2.129e+02 2.354e+02 2.649e+02 3.887e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 01:40:23,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:40:23,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:23,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 01:40:25,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:28,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:28,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:40:29,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:40:32,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:40:32,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:40:34,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:40:36,321 INFO [train.py:1039] (3/4) Epoch 6, batch 4700, loss[loss=0.2891, simple_loss=0.3241, pruned_loss=0.1271, over 19571.00 frames. ], tot_loss[loss=0.2332, simple_loss=0.2962, pruned_loss=0.0851, over 4718969.03 frames. ], batch size: 388, lr: 1.65e-02, grad_scale: 16.0 2023-09-29 01:40:36,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:37,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:40:37,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:40:38,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 01:40:39,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 01:40:41,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 01:40:49,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:40:49,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:40:50,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:40:51,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=208466.66666666666, ans=0.125 2023-09-29 01:40:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:40:53,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 01:40:57,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 01:40:58,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 01:40:59,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=208466.66666666666, ans=0.0 2023-09-29 01:41:02,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:02,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:41:02,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:41:07,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:14,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:41:15,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 01:41:18,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:41:26,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 01:41:26,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:41:27,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:28,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=208600.0, ans=0.0 2023-09-29 01:41:31,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 01:41:31,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=208600.0, ans=0.125 2023-09-29 01:41:34,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:41:40,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:41:40,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 01:41:43,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:43,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:46,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:41:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:41:47,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 01:41:48,662 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 01:41:50,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:41:53,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:53,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 01:41:54,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:41:57,787 INFO [train.py:1039] (3/4) Epoch 6, batch 4750, loss[loss=0.2085, simple_loss=0.2704, pruned_loss=0.07328, over 20702.00 frames. ], tot_loss[loss=0.234, simple_loss=0.2972, pruned_loss=0.08545, over 4718855.57 frames. ], batch size: 45, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:41:58,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 01:41:59,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:42:01,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:04,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:06,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:42:07,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 01:42:07,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:11,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 01:42:13,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:42:14,048 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.51 vs. limit=22.5 2023-09-29 01:42:15,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:42:15,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:20,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 01:42:24,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:42:26,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 01:42:26,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:42:31,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:42:31,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:32,729 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 01:42:32,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 01:42:36,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=208866.66666666666, ans=0.125 2023-09-29 01:42:38,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 01:42:40,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:41,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=208866.66666666666, ans=0.125 2023-09-29 01:42:42,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:42:46,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:42:46,635 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 01:42:46,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:42:50,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:42:54,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:42:55,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 01:42:55,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 01:42:57,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:42:57,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:42:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:42:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:42:58,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 01:43:00,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 01:43:03,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:06,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:43:06,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 01:43:06,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:09,481 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.730e+02 2.147e+02 2.364e+02 2.785e+02 5.281e+02, threshold=4.728e+02, percent-clipped=1.0 2023-09-29 01:43:09,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:09,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:43:10,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=209000.0, ans=22.5 2023-09-29 01:43:11,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:11,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 01:43:14,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=209000.0, ans=0.125 2023-09-29 01:43:15,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:15,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 01:43:16,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 01:43:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 01:43:21,134 INFO [train.py:1039] (3/4) Epoch 6, batch 4800, loss[loss=0.2323, simple_loss=0.2908, pruned_loss=0.08693, over 23500.00 frames. ], tot_loss[loss=0.2351, simple_loss=0.2983, pruned_loss=0.0859, over 4725978.42 frames. ], batch size: 134, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:43:21,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:43:23,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:43:23,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 01:43:23,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=209066.66666666666, ans=0.0 2023-09-29 01:43:27,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=209066.66666666666, ans=0.1 2023-09-29 01:43:30,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:30,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:30,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=209066.66666666666, ans=0.0 2023-09-29 01:43:36,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:43:37,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:37,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:43:39,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 01:43:40,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:43:40,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:43:42,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:43:44,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=209133.33333333334, ans=0.125 2023-09-29 01:43:48,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:43:48,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=209133.33333333334, ans=0.125 2023-09-29 01:43:49,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:50,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:43:52,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:52,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 01:43:52,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:43:53,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:43:56,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:43:58,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=209200.0, ans=0.0 2023-09-29 01:43:59,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:44:03,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:44:03,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=209200.0, ans=0.2 2023-09-29 01:44:04,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 01:44:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:07,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 01:44:07,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 01:44:07,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:07,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:44:09,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:44:09,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:09,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:44:11,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:44:11,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:44:14,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:14,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=209266.66666666666, ans=0.0 2023-09-29 01:44:15,294 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=6.35 vs. limit=12.0 2023-09-29 01:44:16,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=209266.66666666666, ans=0.1 2023-09-29 01:44:17,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:17,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=209266.66666666666, ans=15.0 2023-09-29 01:44:18,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:23,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 01:44:25,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:25,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:25,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:44:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:30,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:44:32,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:44:32,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:32,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:44:34,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:44:34,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:44:38,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:39,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:39,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:44:40,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 01:44:42,324 INFO [train.py:1039] (3/4) Epoch 6, batch 4850, loss[loss=0.2285, simple_loss=0.3016, pruned_loss=0.07763, over 24650.00 frames. ], tot_loss[loss=0.2358, simple_loss=0.2991, pruned_loss=0.08627, over 4728414.97 frames. ], batch size: 68, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:44:42,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 01:44:42,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:44:42,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:44:42,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:44:44,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=209400.0, ans=0.2 2023-09-29 01:44:45,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:44:50,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=209400.0, ans=0.2 2023-09-29 01:44:52,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 01:44:53,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:44:56,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:44:59,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 01:44:59,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:03,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:45:05,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:45:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:45:06,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 01:45:12,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:45:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:45:14,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 01:45:15,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:45:15,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 01:45:19,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:45:19,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:23,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:24,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 01:45:24,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 01:45:27,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:45:31,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=209600.0, ans=0.125 2023-09-29 01:45:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:45:34,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 01:45:36,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:45:36,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:45:38,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:45:42,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 01:45:42,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:42,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 01:45:44,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:45:44,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:45:45,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 01:45:52,740 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=26.20 vs. limit=22.5 2023-09-29 01:45:53,387 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.753e+02 2.040e+02 2.308e+02 2.752e+02 3.700e+02, threshold=4.617e+02, percent-clipped=0.0 2023-09-29 01:45:55,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:45:57,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=209666.66666666666, ans=0.1 2023-09-29 01:45:57,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.08 vs. limit=15.0 2023-09-29 01:46:00,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:46:00,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:04,439 INFO [train.py:1039] (3/4) Epoch 6, batch 4900, loss[loss=0.2388, simple_loss=0.313, pruned_loss=0.08226, over 24578.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2973, pruned_loss=0.08547, over 4730423.41 frames. ], batch size: 71, lr: 1.64e-02, grad_scale: 32.0 2023-09-29 01:46:06,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 01:46:06,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:46:11,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:14,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:14,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:46:17,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 01:46:23,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 01:46:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 01:46:28,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 01:46:28,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:29,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:46:29,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:46:29,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:29,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:46:29,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 01:46:33,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 01:46:33,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:46:34,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:46:36,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:46:39,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:46:39,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:41,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=209866.66666666666, ans=0.0 2023-09-29 01:46:42,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:46:42,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 01:46:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:46:44,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:46:44,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 01:46:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 01:46:50,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=209866.66666666666, ans=0.125 2023-09-29 01:46:51,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 01:46:53,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:46:54,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:46:54,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:46:56,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:46:56,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 01:46:56,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:46:56,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 01:46:59,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:01,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:47:02,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:47:06,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 01:47:06,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=209933.33333333334, ans=0.125 2023-09-29 01:47:07,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:47:07,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 01:47:07,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 01:47:11,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=210000.0, ans=0.0 2023-09-29 01:47:14,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:15,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:47:18,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 01:47:18,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:18,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:47:19,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:26,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:26,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:47:26,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:47:26,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 01:47:28,393 INFO [train.py:1039] (3/4) Epoch 6, batch 4950, loss[loss=0.2337, simple_loss=0.294, pruned_loss=0.08673, over 19713.00 frames. ], tot_loss[loss=0.233, simple_loss=0.2964, pruned_loss=0.08485, over 4737826.57 frames. ], batch size: 43, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:47:28,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:47:31,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:31,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 01:47:32,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=210066.66666666666, ans=0.125 2023-09-29 01:47:35,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=210066.66666666666, ans=0.125 2023-09-29 01:47:36,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 01:47:36,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 01:47:37,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:47:37,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 01:47:37,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:39,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:47:39,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 01:47:39,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:47:41,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:47:42,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:47:44,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:47:44,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:47:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:47:46,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:47:51,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 01:47:52,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=210133.33333333334, ans=0.0 2023-09-29 01:47:57,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:00,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:48:02,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:03,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:03,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:48:05,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 01:48:05,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 01:48:07,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=210200.0, ans=0.0 2023-09-29 01:48:08,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:10,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:48:10,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:48:11,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:48:11,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:48:13,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 01:48:14,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:17,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:48:20,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:48:24,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:48:24,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:25,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.36 vs. limit=15.0 2023-09-29 01:48:25,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 01:48:25,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:48:27,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:48:31,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:48:32,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:48:32,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:48:35,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:35,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:48:36,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:48:38,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:48:38,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:48:39,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:48:41,014 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.165e+02 2.494e+02 2.935e+02 4.100e+02, threshold=4.988e+02, percent-clipped=0.0 2023-09-29 01:48:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 01:48:44,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:48:49,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 01:48:49,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 01:48:50,395 INFO [train.py:1039] (3/4) Epoch 6, batch 5000, loss[loss=0.2317, simple_loss=0.2619, pruned_loss=0.1008, over 19234.00 frames. ], tot_loss[loss=0.233, simple_loss=0.2963, pruned_loss=0.08488, over 4738200.05 frames. ], batch size: 389, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:48:57,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:48:57,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:48:58,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 01:49:00,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 01:49:01,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:04,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 01:49:06,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 01:49:06,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:49:08,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 01:49:08,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:09,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 01:49:09,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:09,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:11,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 01:49:12,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 01:49:14,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:49:14,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 01:49:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:49:14,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:15,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:49:15,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 01:49:15,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 01:49:18,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 01:49:18,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:18,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:20,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 01:49:20,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:49:21,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:23,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:49:23,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 01:49:24,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 01:49:26,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:49:28,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:49:31,889 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 01:49:35,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:49:36,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:49:36,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:49:41,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 01:49:41,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:49:41,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:49:42,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:49:45,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 01:49:45,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:47,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=210600.0, ans=0.125 2023-09-29 01:49:48,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 01:49:50,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:49:56,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 01:50:01,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:01,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=210666.66666666666, ans=0.1 2023-09-29 01:50:09,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:50:11,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:11,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:50:11,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:13,269 INFO [train.py:1039] (3/4) Epoch 6, batch 5050, loss[loss=0.2477, simple_loss=0.298, pruned_loss=0.09867, over 23767.00 frames. ], tot_loss[loss=0.2337, simple_loss=0.297, pruned_loss=0.0852, over 4741519.19 frames. ], batch size: 212, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:50:13,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:50:13,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 01:50:13,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=210733.33333333334, ans=15.0 2023-09-29 01:50:18,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:50:18,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 01:50:20,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:50:21,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:50:21,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:50:23,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 01:50:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:23,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:50:26,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 01:50:27,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 01:50:29,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:50:34,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten.whitening_limit, batch_count=210800.0, ans=15.0 2023-09-29 01:50:36,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.52 vs. limit=15.0 2023-09-29 01:50:38,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=210800.0, ans=0.0 2023-09-29 01:50:40,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 01:50:42,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 01:50:42,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:50:42,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 01:50:42,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:50:43,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:45,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:50:46,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:50:46,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 01:50:46,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 01:50:49,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:52,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:50:54,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:50:54,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 01:50:55,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:50:59,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 01:51:00,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:51:00,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:51:02,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:03,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:51:05,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:05,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=210933.33333333334, ans=0.0 2023-09-29 01:51:08,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:51:08,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:09,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:51:09,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:51:09,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 01:51:11,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 01:51:13,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 01:51:15,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.15 vs. limit=15.0 2023-09-29 01:51:17,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:51:17,897 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 01:51:17,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:51:19,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:19,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:21,324 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 01:51:24,335 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.235e+02 2.528e+02 3.059e+02 5.158e+02, threshold=5.056e+02, percent-clipped=2.0 2023-09-29 01:51:24,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:24,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 01:51:24,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:28,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:29,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:51:29,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 01:51:31,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 01:51:33,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:33,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:51:33,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 01:51:34,397 INFO [train.py:1039] (3/4) Epoch 6, batch 5100, loss[loss=0.2467, simple_loss=0.3196, pruned_loss=0.08692, over 24315.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2976, pruned_loss=0.08529, over 4744366.81 frames. ], batch size: 74, lr: 1.64e-02, grad_scale: 16.0 2023-09-29 01:51:36,147 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 01:51:39,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:51:39,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.71 vs. limit=10.0 2023-09-29 01:51:43,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 01:51:43,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 01:51:44,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:51:46,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:51:50,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:51:50,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 01:51:50,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 01:51:54,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:51:56,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:52:00,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:52:04,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 01:52:05,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:06,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:52:06,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 01:52:09,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:09,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 01:52:11,406 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 01:52:12,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:12,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 01:52:14,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 01:52:18,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=211200.0, ans=15.0 2023-09-29 01:52:18,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:52:19,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=211200.0, ans=0.125 2023-09-29 01:52:30,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:52:31,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 01:52:31,883 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 01:52:34,634 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 01:52:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 01:52:36,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:52:39,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 01:52:43,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 01:52:44,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 01:52:46,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=211333.33333333334, ans=0.125 2023-09-29 01:52:47,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 01:52:49,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 01:52:49,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:52:50,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 01:52:51,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=211333.33333333334, ans=0.125 2023-09-29 01:52:51,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=211333.33333333334, ans=0.1 2023-09-29 01:52:55,305 INFO [train.py:1039] (3/4) Epoch 6, batch 5150, loss[loss=0.2429, simple_loss=0.3171, pruned_loss=0.08434, over 23645.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2976, pruned_loss=0.08587, over 4740881.45 frames. ], batch size: 85, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:52:55,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:52:55,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:52:55,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:52:57,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:52:59,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 01:52:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:53:00,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 01:53:00,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 01:53:00,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 01:53:00,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 01:53:00,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 01:53:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:03,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 01:53:03,827 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=211400.0, ans=0.125 2023-09-29 01:53:05,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:07,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:12,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 01:53:12,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 01:53:14,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:15,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 01:53:16,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=211466.66666666666, ans=0.1 2023-09-29 01:53:17,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 01:53:17,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:17,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:17,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:53:17,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:53:18,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 01:53:20,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:53:20,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:53:23,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 01:53:25,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 01:53:27,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:53:33,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 01:53:35,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 01:53:39,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:53:46,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:53:49,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:53:52,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:53:52,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:53:55,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 01:53:59,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:53:59,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 01:53:59,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 01:54:02,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=211666.66666666666, ans=0.09899494936611666 2023-09-29 01:54:04,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:04,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=211666.66666666666, ans=0.0 2023-09-29 01:54:05,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:54:05,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 01:54:08,360 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 2.054e+02 2.300e+02 2.668e+02 5.365e+02, threshold=4.600e+02, percent-clipped=1.0 2023-09-29 01:54:11,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:54:14,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:54:14,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:54:14,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 01:54:16,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 01:54:16,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 01:54:16,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:54:18,920 INFO [train.py:1039] (3/4) Epoch 6, batch 5200, loss[loss=0.2185, simple_loss=0.29, pruned_loss=0.07354, over 24477.00 frames. ], tot_loss[loss=0.2347, simple_loss=0.2975, pruned_loss=0.08597, over 4744886.39 frames. ], batch size: 63, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:54:21,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=211733.33333333334, ans=0.0 2023-09-29 01:54:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:54:24,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:54:27,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:27,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=211733.33333333334, ans=0.04949747468305833 2023-09-29 01:54:28,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 01:54:30,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:54:30,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:32,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:54:34,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 01:54:34,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:37,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 01:54:40,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 01:54:41,335 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.31 vs. limit=15.0 2023-09-29 01:54:41,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:43,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 01:54:46,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 01:54:47,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 01:54:48,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 01:54:48,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 01:54:48,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=211800.0, ans=0.125 2023-09-29 01:54:51,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 01:54:52,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:54:52,586 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 01:54:52,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:54:55,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:54:56,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:54:57,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 01:54:57,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:01,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:03,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 01:55:05,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 01:55:05,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 01:55:10,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 01:55:11,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 01:55:16,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:55:16,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:17,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 01:55:19,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:55:19,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 01:55:19,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:19,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:55:24,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:24,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 01:55:26,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=212000.0, ans=0.125 2023-09-29 01:55:30,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:55:32,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:32,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:36,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:37,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=212000.0, ans=0.09899494936611666 2023-09-29 01:55:38,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 01:55:39,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 01:55:39,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:55:41,380 INFO [train.py:1039] (3/4) Epoch 6, batch 5250, loss[loss=0.2392, simple_loss=0.302, pruned_loss=0.0882, over 23398.00 frames. ], tot_loss[loss=0.2345, simple_loss=0.2966, pruned_loss=0.0862, over 4744304.17 frames. ], batch size: 93, lr: 1.63e-02, grad_scale: 32.0 2023-09-29 01:55:41,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:55:41,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 01:55:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 01:55:45,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:55:48,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:55:48,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:55:49,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:55:56,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:55:57,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 01:56:00,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:56:01,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 01:56:05,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 01:56:05,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:56:06,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:56:07,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=212133.33333333334, ans=0.125 2023-09-29 01:56:11,075 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 01:56:18,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.54 vs. limit=22.5 2023-09-29 01:56:34,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=212266.66666666666, ans=0.125 2023-09-29 01:56:46,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.177e+02 2.529e+02 3.121e+02 4.794e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 01:56:52,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.25 vs. limit=12.0 2023-09-29 01:56:55,340 INFO [train.py:1039] (3/4) Epoch 6, batch 5300, loss[loss=0.2109, simple_loss=0.2856, pruned_loss=0.0681, over 24292.00 frames. ], tot_loss[loss=0.234, simple_loss=0.2958, pruned_loss=0.08606, over 4730923.09 frames. ], batch size: 61, lr: 1.63e-02, grad_scale: 16.0 2023-09-29 01:57:07,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=25.36 vs. limit=22.5 2023-09-29 01:57:11,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 01:57:11,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 01:57:11,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 01:57:11,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:11,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:11,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:11,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:12,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:12,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:12,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 01:57:12,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:57:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 01:57:12,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 01:57:12,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 01:57:13,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 01:57:13,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 01:57:13,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 01:57:13,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:14,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:14,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:14,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:14,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:57:15,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:15,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:57:15,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:15,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:57:15,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:57:15,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 01:57:15,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:15,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 01:57:16,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 01:57:16,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 01:57:17,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 01:57:17,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 01:57:17,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 01:57:17,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 01:57:17,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:17,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 01:57:18,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 01:57:18,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 01:57:19,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 01:57:19,470 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 01:57:19,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 01:57:19,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 01:57:19,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:57:19,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 01:57:19,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 01:57:20,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 01:57:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 01:57:30,914 INFO [train.py:1039] (3/4) Epoch 7, batch 0, loss[loss=0.2413, simple_loss=0.2946, pruned_loss=0.09399, over 23382.00 frames. ], tot_loss[loss=0.2413, simple_loss=0.2946, pruned_loss=0.09399, over 23382.00 frames. ], batch size: 285, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:57:30,915 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 01:57:45,842 INFO [train.py:1071] (3/4) Epoch 7, validation: loss=0.2938, simple_loss=0.3001, pruned_loss=0.1437, over 1125622.00 frames. 2023-09-29 01:57:45,843 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 01:57:47,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 01:57:48,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:57:51,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 01:57:56,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:57:56,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 01:57:58,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:57:58,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 01:58:00,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 01:58:03,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:03,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:58:07,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:09,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 01:58:09,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:10,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 01:58:12,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:58:22,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 01:58:22,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:24,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 01:58:27,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 01:58:29,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:58:30,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:31,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=212613.33333333334, ans=0.0 2023-09-29 01:58:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 01:58:40,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:58:46,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 01:58:51,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 01:58:51,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:58:51,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:51,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 01:58:53,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:58:54,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 01:58:56,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:58:57,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 01:59:01,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:03,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=212746.66666666666, ans=0.125 2023-09-29 01:59:04,547 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 01:59:05,961 INFO [train.py:1039] (3/4) Epoch 7, batch 50, loss[loss=0.2489, simple_loss=0.3012, pruned_loss=0.09833, over 23573.00 frames. ], tot_loss[loss=0.2349, simple_loss=0.298, pruned_loss=0.08587, over 1067860.06 frames. ], batch size: 256, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 01:59:06,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 01:59:09,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:10,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=212813.33333333334, ans=0.125 2023-09-29 01:59:11,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:11,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 01:59:11,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 01:59:12,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 01:59:15,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:16,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 01:59:20,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 01:59:22,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 01:59:22,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:31,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 01:59:31,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 01:59:32,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 01:59:36,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 01:59:36,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 01:59:38,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:38,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 01:59:39,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 01:59:39,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 01:59:39,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 01:59:47,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.34 vs. limit=22.5 2023-09-29 01:59:48,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 01:59:49,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 01:59:49,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 01:59:51,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 01:59:54,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 01:59:54,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 01:59:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 01:59:56,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 01:59:58,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 02:00:01,120 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.205e+02 2.565e+02 2.922e+02 4.560e+02, threshold=5.129e+02, percent-clipped=0.0 2023-09-29 02:00:06,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:06,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:00:06,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:08,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:08,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:11,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 02:00:12,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 02:00:13,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:13,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:00:15,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:00:16,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:00:16,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 02:00:16,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 02:00:18,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 02:00:20,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:20,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:00:21,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 02:00:21,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 02:00:21,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:23,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:24,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:00:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:00:28,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:00:29,757 INFO [train.py:1039] (3/4) Epoch 7, batch 100, loss[loss=0.3211, simple_loss=0.3559, pruned_loss=0.1432, over 19033.00 frames. ], tot_loss[loss=0.2357, simple_loss=0.2981, pruned_loss=0.08668, over 1871819.95 frames. ], batch size: 388, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:00:34,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:00:36,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:38,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 02:00:38,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:00:41,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:00:41,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:41,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:00:41,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:00:41,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:00:42,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 02:00:46,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:00:46,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:46,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:46,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:00:51,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 02:00:51,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.92 vs. limit=22.5 2023-09-29 02:00:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:00:52,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:00:54,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:00:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:01:01,110 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 02:01:01,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 02:01:04,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:04,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:01:10,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:01:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:01:13,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:21,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:22,080 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 02:01:23,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=213346.66666666666, ans=0.1 2023-09-29 02:01:25,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:01:28,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:01:30,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:01:33,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:34,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:38,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:40,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:01:40,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=213413.33333333334, ans=0.1 2023-09-29 02:01:43,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:45,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:47,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:47,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:01:47,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:01:47,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 02:01:47,575 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 02:01:47,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:01:49,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:01:49,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:49,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:50,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 02:01:50,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:01:50,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:01:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:01:50,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:01:52,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:53,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:01:54,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:01:54,851 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.68 vs. limit=15.0 2023-09-29 02:01:57,174 INFO [train.py:1039] (3/4) Epoch 7, batch 150, loss[loss=0.2263, simple_loss=0.2997, pruned_loss=0.07649, over 24561.00 frames. ], tot_loss[loss=0.2334, simple_loss=0.2962, pruned_loss=0.0853, over 2506682.54 frames. ], batch size: 71, lr: 1.53e-02, grad_scale: 32.0 2023-09-29 02:01:57,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:01:58,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:01:58,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:00,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:02,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:04,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:07,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:02:08,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 02:02:13,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 02:02:13,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 02:02:15,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:02:15,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:02:17,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:02:18,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=213546.66666666666, ans=0.0 2023-09-29 02:02:19,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:02:19,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:19,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:21,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:02:22,584 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 02:02:24,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:02:30,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:33,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:02:37,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 02:02:40,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:02:41,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:02:41,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:02:43,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:02:45,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:02:45,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:02:46,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:48,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 02:02:50,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=213680.0, ans=0.0 2023-09-29 02:02:51,337 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 2.019e+02 2.375e+02 2.708e+02 4.033e+02, threshold=4.751e+02, percent-clipped=0.0 2023-09-29 02:02:55,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:55,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:02:55,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:02:55,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:02:58,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:02:59,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 02:03:00,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.67 vs. limit=15.0 2023-09-29 02:03:01,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:03:02,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:03:04,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:06,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:03:06,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 02:03:06,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:03:06,162 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 02:03:12,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:12,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=213746.66666666666, ans=0.125 2023-09-29 02:03:15,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:03:17,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:03:19,252 INFO [train.py:1039] (3/4) Epoch 7, batch 200, loss[loss=0.2425, simple_loss=0.2918, pruned_loss=0.09662, over 23639.00 frames. ], tot_loss[loss=0.2339, simple_loss=0.2974, pruned_loss=0.0852, over 3000885.38 frames. ], batch size: 256, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:03:20,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 02:03:22,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:03:23,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:26,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 02:03:28,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:03:30,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:03:32,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=213813.33333333334, ans=0.125 2023-09-29 02:03:34,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:03:34,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:03:34,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:03:42,798 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.85 vs. limit=22.5 2023-09-29 02:03:43,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=213880.0, ans=0.2 2023-09-29 02:03:53,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:03:53,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:03:55,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:03:55,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:03:57,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:03:57,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:03:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:00,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:04:02,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:02,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:03,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 02:04:03,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:04:05,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:07,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:04:12,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=214013.33333333334, ans=0.5 2023-09-29 02:04:15,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:04:22,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:23,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:04:23,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=214080.0, ans=0.1 2023-09-29 02:04:30,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:32,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=214080.0, ans=0.125 2023-09-29 02:04:33,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 02:04:34,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:34,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:04:35,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:04:37,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:04:37,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 02:04:37,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:04:38,731 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 02:04:40,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:42,339 INFO [train.py:1039] (3/4) Epoch 7, batch 250, loss[loss=0.2395, simple_loss=0.3024, pruned_loss=0.08827, over 23579.00 frames. ], tot_loss[loss=0.2341, simple_loss=0.2968, pruned_loss=0.08566, over 3379580.97 frames. ], batch size: 134, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:04:42,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:04:43,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:44,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:04:45,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:04:45,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:04:48,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:04:51,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:02,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=214213.33333333334, ans=0.125 2023-09-29 02:05:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:06,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:06,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:05:07,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=214213.33333333334, ans=0.035 2023-09-29 02:05:15,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:05:16,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:05:18,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:05:18,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:19,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:05:19,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:05:20,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:05:21,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=214280.0, ans=0.125 2023-09-29 02:05:23,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:05:24,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 02:05:24,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:05:27,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:05:27,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:05:27,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:05:29,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:05:29,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:05:29,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:05:32,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:34,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:05:34,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:35,832 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.070e+02 2.355e+02 2.714e+02 5.110e+02, threshold=4.709e+02, percent-clipped=2.0 2023-09-29 02:05:39,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:05:42,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=214346.66666666666, ans=0.0 2023-09-29 02:05:46,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:05:49,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:05:53,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=214413.33333333334, ans=0.1 2023-09-29 02:05:54,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:05:56,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:05:59,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 02:05:59,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:05:59,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:06:01,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=214413.33333333334, ans=0.125 2023-09-29 02:06:02,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 02:06:02,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:06:04,126 INFO [train.py:1039] (3/4) Epoch 7, batch 300, loss[loss=0.2242, simple_loss=0.3005, pruned_loss=0.07396, over 24473.00 frames. ], tot_loss[loss=0.2319, simple_loss=0.2951, pruned_loss=0.08437, over 3677235.62 frames. ], batch size: 69, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:06:04,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:06:04,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 02:06:08,569 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.22 vs. limit=12.0 2023-09-29 02:06:09,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:10,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:11,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=214480.0, ans=0.1 2023-09-29 02:06:12,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:06:14,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 02:06:16,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:06:18,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:06:18,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 02:06:18,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:23,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:06:27,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:06:27,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 02:06:27,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=214546.66666666666, ans=0.125 2023-09-29 02:06:30,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 02:06:30,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:33,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:06:33,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=214546.66666666666, ans=0.125 2023-09-29 02:06:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 02:06:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:06:39,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:06:41,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:06:41,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:06:46,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:06:46,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 02:06:48,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:06:50,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:06:52,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 02:06:53,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:06:57,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:06:58,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:06:58,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 02:07:04,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:04,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:07:07,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:10,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:07:10,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 02:07:10,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:07:11,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:11,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 02:07:12,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.19 vs. limit=10.0 2023-09-29 02:07:15,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:07:16,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:18,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:18,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:18,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:25,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.56 vs. limit=12.0 2023-09-29 02:07:25,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:25,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 02:07:27,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:29,101 INFO [train.py:1039] (3/4) Epoch 7, batch 350, loss[loss=0.1924, simple_loss=0.262, pruned_loss=0.06137, over 24301.00 frames. ], tot_loss[loss=0.2296, simple_loss=0.2926, pruned_loss=0.08328, over 3896581.80 frames. ], batch size: 56, lr: 1.52e-02, grad_scale: 16.0 2023-09-29 02:07:34,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:07:39,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:39,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:40,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 02:07:43,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:07:43,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 02:07:45,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:07:46,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 02:07:46,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:50,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 02:07:52,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:07:54,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:07:57,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:07:57,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=214880.0, ans=0.0 2023-09-29 02:07:58,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:07:58,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:07:58,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:07:59,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:08:02,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:02,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:10,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:10,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:08:12,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:08:12,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:18,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 02:08:18,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:08:23,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.18 vs. limit=12.0 2023-09-29 02:08:24,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.070e+02 2.347e+02 2.700e+02 4.079e+02, threshold=4.694e+02, percent-clipped=0.0 2023-09-29 02:08:24,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:08:24,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:24,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:08:27,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 02:08:29,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:30,836 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 02:08:32,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 02:08:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:34,754 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.13 vs. limit=6.0 2023-09-29 02:08:35,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:08:35,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 02:08:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:41,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:08:41,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:42,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:42,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:45,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:08:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:08:51,118 INFO [train.py:1039] (3/4) Epoch 7, batch 400, loss[loss=0.1959, simple_loss=0.2649, pruned_loss=0.06351, over 24634.00 frames. ], tot_loss[loss=0.2288, simple_loss=0.2917, pruned_loss=0.08297, over 4062261.21 frames. ], batch size: 60, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:08:51,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:08:51,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 02:08:51,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:08:52,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:08:54,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:08:54,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:08:57,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:08:57,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:01,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 02:09:02,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 02:09:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:04,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 02:09:04,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:10,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:09:10,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:10,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 02:09:11,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:09:12,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:09:12,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:13,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:09:16,113 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 02:09:16,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 02:09:20,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=215213.33333333334, ans=0.125 2023-09-29 02:09:21,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:09:22,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:24,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 02:09:25,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 02:09:26,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=215280.0, ans=15.0 2023-09-29 02:09:28,001 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.42 vs. limit=22.5 2023-09-29 02:09:28,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:09:31,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:34,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=215280.0, ans=0.125 2023-09-29 02:09:37,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215280.0, ans=0.1 2023-09-29 02:09:38,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 02:09:40,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.29 vs. limit=10.0 2023-09-29 02:09:41,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:09:41,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 02:09:45,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:09:47,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:09:47,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 02:09:47,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215346.66666666666, ans=0.1 2023-09-29 02:09:49,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=215346.66666666666, ans=0.0 2023-09-29 02:09:50,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:09:53,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:09:54,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:09:56,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:09:56,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 02:09:59,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:09:59,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 02:10:02,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:10:02,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:10:02,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=215413.33333333334, ans=0.125 2023-09-29 02:10:04,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 02:10:07,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:10:07,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:10:07,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:10:09,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 02:10:09,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:10:11,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:10:11,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:10:11,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 02:10:12,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:10:13,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=215480.0, ans=0.1 2023-09-29 02:10:14,719 INFO [train.py:1039] (3/4) Epoch 7, batch 450, loss[loss=0.2692, simple_loss=0.3204, pruned_loss=0.109, over 23669.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2925, pruned_loss=0.08289, over 4203334.74 frames. ], batch size: 256, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:10:14,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:10:17,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=215480.0, ans=0.1 2023-09-29 02:10:18,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:10:29,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:29,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:10:31,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=215546.66666666666, ans=0.2 2023-09-29 02:10:32,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 02:10:32,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 02:10:37,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:10:37,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=215546.66666666666, ans=0.5 2023-09-29 02:10:38,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:40,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=215546.66666666666, ans=0.125 2023-09-29 02:10:44,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:45,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:10:48,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 02:10:48,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 02:10:50,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 02:10:51,570 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.62 vs. limit=6.0 2023-09-29 02:10:52,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:10:52,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:10:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:10:57,305 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 02:10:57,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 02:10:57,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:10:58,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:11:00,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:11:02,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=215680.0, ans=0.125 2023-09-29 02:11:04,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:11:04,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:11:04,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:11:05,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 02:11:08,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:10,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.891e+02 2.155e+02 2.361e+02 4.169e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 02:11:10,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:11:10,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:11:11,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 02:11:11,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=215680.0, ans=0.125 2023-09-29 02:11:13,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=215680.0, ans=0.2 2023-09-29 02:11:16,112 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.21 vs. limit=10.0 2023-09-29 02:11:16,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:11:18,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 02:11:18,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 02:11:20,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:11:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:11:26,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:30,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:11:30,293 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 02:11:33,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:34,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:11:35,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:35,148 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 02:11:35,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=215813.33333333334, ans=0.07 2023-09-29 02:11:36,832 INFO [train.py:1039] (3/4) Epoch 7, batch 500, loss[loss=0.2547, simple_loss=0.3035, pruned_loss=0.103, over 22746.00 frames. ], tot_loss[loss=0.2292, simple_loss=0.2933, pruned_loss=0.08252, over 4328880.59 frames. ], batch size: 322, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:11:37,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 02:11:37,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:11:41,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:11:44,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 02:11:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:11:47,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:11:48,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:11:49,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:11:49,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=215813.33333333334, ans=0.125 2023-09-29 02:12:02,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:02,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:12:02,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=215880.0, ans=0.125 2023-09-29 02:12:03,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:12:03,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:03,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 02:12:04,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=215880.0, ans=0.0 2023-09-29 02:12:05,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:12:08,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:12:09,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:12:11,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:12:11,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:12:12,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=215946.66666666666, ans=0.125 2023-09-29 02:12:12,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.39 vs. limit=15.0 2023-09-29 02:12:13,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 02:12:14,898 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 02:12:17,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:19,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:20,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=215946.66666666666, ans=22.5 2023-09-29 02:12:21,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:21,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:22,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:12:24,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 02:12:26,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=216013.33333333334, ans=0.125 2023-09-29 02:12:28,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:12:29,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:34,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:12:38,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:12:42,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.27 vs. limit=15.0 2023-09-29 02:12:45,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:50,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 02:12:50,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:50,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:12:51,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 02:12:52,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=216080.0, ans=0.0 2023-09-29 02:12:53,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:12:53,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:12:59,426 INFO [train.py:1039] (3/4) Epoch 7, batch 550, loss[loss=0.2471, simple_loss=0.3185, pruned_loss=0.08783, over 24394.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.2947, pruned_loss=0.08354, over 4418235.36 frames. ], batch size: 77, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:12:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 02:13:02,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 02:13:02,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:02,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 02:13:02,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:13:02,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:04,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:04,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:13:05,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:13:08,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:13:08,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=216146.66666666666, ans=0.125 2023-09-29 02:13:08,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=216146.66666666666, ans=0.1 2023-09-29 02:13:09,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 02:13:09,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:13:17,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:18,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:19,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:21,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:27,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 02:13:29,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 02:13:29,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=216213.33333333334, ans=0.1 2023-09-29 02:13:30,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:13:35,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:13:35,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:36,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:13:40,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:40,104 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 02:13:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:13:43,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:13:45,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:13:45,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=216280.0, ans=0.2 2023-09-29 02:13:47,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:13:47,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:13:49,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:13:49,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 02:13:51,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 02:13:52,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:13:52,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:13:54,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:13:54,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:13:55,678 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.132e+02 2.384e+02 2.779e+02 4.607e+02, threshold=4.767e+02, percent-clipped=1.0 2023-09-29 02:13:56,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:13:59,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:14:00,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.72 vs. limit=10.0 2023-09-29 02:14:01,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:14:02,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:02,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 02:14:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:14:05,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:05,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:14:07,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:08,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:14:08,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:14:13,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 02:14:19,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 02:14:22,409 INFO [train.py:1039] (3/4) Epoch 7, batch 600, loss[loss=0.2124, simple_loss=0.2862, pruned_loss=0.06926, over 24506.00 frames. ], tot_loss[loss=0.2315, simple_loss=0.2955, pruned_loss=0.08373, over 4481684.98 frames. ], batch size: 63, lr: 1.52e-02, grad_scale: 32.0 2023-09-29 02:14:22,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:14:22,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:14:22,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:14:30,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:14:34,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:14:34,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 02:14:37,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:14:37,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=216546.66666666666, ans=0.0 2023-09-29 02:14:39,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:14:39,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=216546.66666666666, ans=0.1 2023-09-29 02:14:42,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:45,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 02:14:45,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:14:48,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=216546.66666666666, ans=0.125 2023-09-29 02:14:50,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 02:14:50,350 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:14:53,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=216613.33333333334, ans=0.2 2023-09-29 02:14:55,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:14:55,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:14:55,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=216613.33333333334, ans=0.125 2023-09-29 02:14:56,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:15:01,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:15:01,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:15:03,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:06,097 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.10 vs. limit=12.0 2023-09-29 02:15:12,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:15:14,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:15:16,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:15:16,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:15:22,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 02:15:27,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:15:27,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:15:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 02:15:33,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:15:37,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 02:15:38,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:15:39,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:15:44,078 INFO [train.py:1039] (3/4) Epoch 7, batch 650, loss[loss=0.2022, simple_loss=0.2759, pruned_loss=0.06424, over 24449.00 frames. ], tot_loss[loss=0.2298, simple_loss=0.2939, pruned_loss=0.08287, over 4537697.97 frames. ], batch size: 63, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:15:45,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:15:47,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:15:48,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:15:49,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=216813.33333333334, ans=0.125 2023-09-29 02:15:51,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:15:51,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:15:54,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 02:15:56,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:16:01,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:16:01,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:07,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 02:16:11,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:11,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:16,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:16,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:16:18,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:19,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:21,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:16:21,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:16:24,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:16:25,710 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 02:16:25,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:25,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:25,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=216946.66666666666, ans=0.035 2023-09-29 02:16:28,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:30,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:30,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:30,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:16:30,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=217013.33333333334, ans=0.2 2023-09-29 02:16:34,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 02:16:34,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:16:34,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:16:36,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:16:37,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:16:38,836 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.233e+02 2.449e+02 2.795e+02 3.907e+02, threshold=4.898e+02, percent-clipped=0.0 2023-09-29 02:16:39,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:16:40,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 02:16:40,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 02:16:42,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:16:42,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:16:42,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:16:45,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:16:51,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:16:51,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:16:52,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:16:54,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:54,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:16:56,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:16:57,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=217080.0, ans=0.0 2023-09-29 02:17:02,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:17:02,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:03,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:03,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:17:05,255 INFO [train.py:1039] (3/4) Epoch 7, batch 700, loss[loss=0.2164, simple_loss=0.2951, pruned_loss=0.06886, over 24677.00 frames. ], tot_loss[loss=0.2289, simple_loss=0.2929, pruned_loss=0.08245, over 4568449.10 frames. ], batch size: 68, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:17:10,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 02:17:10,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 02:17:14,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 02:17:14,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=217146.66666666666, ans=0.0 2023-09-29 02:17:15,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:17:21,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 02:17:24,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:17:27,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:17:29,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:30,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:17:31,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:17:34,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:17:37,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:17:37,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:17:40,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 02:17:45,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 02:17:48,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:17:49,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:17:52,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:17:55,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=217346.66666666666, ans=0.1 2023-09-29 02:17:57,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:17:57,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 02:18:01,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:02,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:18:02,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 02:18:06,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:18:08,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:09,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=217413.33333333334, ans=0.125 2023-09-29 02:18:11,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:18:14,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:18:14,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 02:18:18,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 02:18:20,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 02:18:23,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:25,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:25,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:18:29,128 INFO [train.py:1039] (3/4) Epoch 7, batch 750, loss[loss=0.2352, simple_loss=0.3047, pruned_loss=0.08285, over 24481.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2917, pruned_loss=0.08203, over 4596454.11 frames. ], batch size: 66, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:18:29,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:29,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 02:18:33,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 02:18:33,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 02:18:33,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 02:18:35,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 02:18:35,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 02:18:35,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:18:38,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 02:18:38,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:18:39,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:18:41,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:42,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:44,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:18:44,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:18:48,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:18:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:18:52,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:18:54,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:18:56,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:18:56,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 02:18:57,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:18:58,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:00,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:19:01,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:19:03,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 02:19:03,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:05,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 02:19:05,562 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 02:19:05,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 02:19:05,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:19:07,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 02:19:07,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:19:10,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=217613.33333333334, ans=0.1 2023-09-29 02:19:14,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:19:14,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:14,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:19:16,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=217680.0, ans=0.125 2023-09-29 02:19:17,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:19:18,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=217680.0, ans=0.125 2023-09-29 02:19:19,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:19,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 02:19:21,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:19:23,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 02:19:23,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:19:24,706 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.977e+02 2.216e+02 2.498e+02 3.508e+02, threshold=4.433e+02, percent-clipped=0.0 2023-09-29 02:19:25,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:19:26,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 02:19:28,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:32,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:19:33,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:19:33,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:19:35,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:19:40,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 02:19:41,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:43,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:45,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:19:46,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:19:48,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:48,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:19:51,112 INFO [train.py:1039] (3/4) Epoch 7, batch 800, loss[loss=0.2203, simple_loss=0.2968, pruned_loss=0.07193, over 24648.00 frames. ], tot_loss[loss=0.2288, simple_loss=0.2928, pruned_loss=0.08238, over 4625320.56 frames. ], batch size: 73, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:19:57,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:19:57,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:19:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:19:59,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:00,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:00,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:04,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:05,538 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-09-29 02:20:06,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=217880.0, ans=0.0 2023-09-29 02:20:08,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:09,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:20:12,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 02:20:12,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:14,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:20:14,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:20:15,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:15,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 02:20:15,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:17,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 02:20:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:21,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:20:23,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:20:23,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:20:25,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:25,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:30,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:20:30,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=217946.66666666666, ans=0.125 2023-09-29 02:20:31,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:20:31,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 02:20:33,946 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 02:20:33,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 02:20:35,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:20:35,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:20:37,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:20:39,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:20:44,480 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 02:20:44,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 02:20:44,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:20:47,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:20:51,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:20:55,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:20:56,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 02:20:56,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:21:01,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 02:21:01,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=218080.0, ans=0.125 2023-09-29 02:21:07,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:11,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:21:11,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 02:21:12,919 INFO [train.py:1039] (3/4) Epoch 7, batch 850, loss[loss=0.341, simple_loss=0.368, pruned_loss=0.157, over 19498.00 frames. ], tot_loss[loss=0.2294, simple_loss=0.2934, pruned_loss=0.08272, over 4645747.49 frames. ], batch size: 388, lr: 1.51e-02, grad_scale: 32.0 2023-09-29 02:21:13,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:21:13,706 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.12 vs. limit=15.0 2023-09-29 02:21:15,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:15,361 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:21:16,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 02:21:16,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:18,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:21:19,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:21:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:21:22,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 02:21:23,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 02:21:23,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 02:21:24,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:21:26,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:21:29,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:29,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:21:29,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:21:29,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=218213.33333333334, ans=0.0 2023-09-29 02:21:32,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:33,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:21:33,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 02:21:34,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=218213.33333333334, ans=0.2 2023-09-29 02:21:38,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 02:21:40,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:21:42,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 02:21:45,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 02:21:47,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 02:21:51,207 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 02:21:51,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:21:51,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:21:51,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:21:51,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=218280.0, ans=0.2 2023-09-29 02:21:51,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=218280.0, ans=0.09899494936611666 2023-09-29 02:21:54,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:55,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:21:57,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 02:22:00,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:22:00,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:22:01,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:22:02,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:22:03,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 02:22:05,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 02:22:05,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=218346.66666666666, ans=0.0 2023-09-29 02:22:09,489 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.250e+02 2.603e+02 3.078e+02 4.971e+02, threshold=5.207e+02, percent-clipped=2.0 2023-09-29 02:22:09,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:22:09,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:11,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:22:11,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:12,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:22:16,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:22:18,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:22:19,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=218413.33333333334, ans=0.125 2023-09-29 02:22:20,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:22:21,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:21,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:22:25,424 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.34 vs. limit=12.0 2023-09-29 02:22:28,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:22:29,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:22:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 02:22:30,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:31,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:22:32,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 02:22:34,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=218480.0, ans=0.125 2023-09-29 02:22:35,489 INFO [train.py:1039] (3/4) Epoch 7, batch 900, loss[loss=0.2371, simple_loss=0.2986, pruned_loss=0.08781, over 23367.00 frames. ], tot_loss[loss=0.2305, simple_loss=0.2944, pruned_loss=0.08328, over 4668065.79 frames. ], batch size: 105, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:22:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:22:41,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:22:41,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 02:22:43,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=218480.0, ans=0.125 2023-09-29 02:22:44,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:22:45,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 02:22:48,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 02:22:50,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:22:50,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:22:50,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:22:51,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:22:58,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=218546.66666666666, ans=0.5 2023-09-29 02:23:01,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:01,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:23:01,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:23:02,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=218546.66666666666, ans=0.125 2023-09-29 02:23:04,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:09,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=218613.33333333334, ans=0.125 2023-09-29 02:23:10,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 02:23:12,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:23:16,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:23:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:23:18,511 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 02:23:18,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 02:23:21,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.21 vs. limit=15.0 2023-09-29 02:23:26,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:23:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:23:28,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:23:35,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:35,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:23:37,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 02:23:37,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:23:40,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 02:23:43,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:23:43,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:45,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:23:45,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:23:48,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 02:23:50,117 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 02:23:51,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:23:51,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 02:23:53,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:23:56,777 INFO [train.py:1039] (3/4) Epoch 7, batch 950, loss[loss=0.2364, simple_loss=0.3146, pruned_loss=0.07913, over 24643.00 frames. ], tot_loss[loss=0.2309, simple_loss=0.295, pruned_loss=0.08337, over 4692166.42 frames. ], batch size: 73, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:23:58,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 02:24:03,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:05,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:05,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:06,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:24:08,437 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 02:24:08,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=218813.33333333334, ans=0.2 2023-09-29 02:24:12,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:13,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:15,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:24:15,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 02:24:16,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:24:18,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:19,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 02:24:19,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:24,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:24:24,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:24:25,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 02:24:27,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 02:24:27,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=218946.66666666666, ans=0.125 2023-09-29 02:24:31,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:24:32,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:24:39,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:24:39,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:24:41,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 02:24:43,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:24:43,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:24:43,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=218946.66666666666, ans=0.125 2023-09-29 02:24:44,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:44,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:24:50,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 02:24:50,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:24:53,264 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.008e+02 2.208e+02 2.603e+02 6.954e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 02:24:54,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:24:54,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:24:54,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 02:24:54,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:24:54,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:24:56,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 02:24:59,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:25:01,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:25:08,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:08,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 02:25:09,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 02:25:12,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.36 vs. limit=15.0 2023-09-29 02:25:13,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:25:17,793 INFO [train.py:1039] (3/4) Epoch 7, batch 1000, loss[loss=0.2507, simple_loss=0.2964, pruned_loss=0.1025, over 23735.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2939, pruned_loss=0.08348, over 4682324.47 frames. ], batch size: 164, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:25:18,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.57 vs. limit=22.5 2023-09-29 02:25:19,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 02:25:19,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:19,721 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:25:22,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:25:25,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 02:25:25,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 02:25:30,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:30,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:25:32,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:34,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 02:25:38,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=219213.33333333334, ans=0.125 2023-09-29 02:25:39,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 02:25:41,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 02:25:43,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:25:44,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 02:25:46,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 02:25:46,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 02:25:48,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:49,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:57,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:25:58,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:25:59,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:25:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:25:59,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 02:25:59,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:01,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:26:01,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:26:02,551 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 02:26:06,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 02:26:06,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 02:26:07,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 02:26:10,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=219346.66666666666, ans=0.125 2023-09-29 02:26:11,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:26:18,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:18,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:26:20,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:21,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:26:21,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 02:26:24,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:26:24,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 02:26:24,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 02:26:27,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:26:27,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:26:28,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer_ff2.min_abs, batch_count=219413.33333333334, ans=0.1 2023-09-29 02:26:29,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:26:34,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:26:34,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=219413.33333333334, ans=0.125 2023-09-29 02:26:37,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:26:40,870 INFO [train.py:1039] (3/4) Epoch 7, batch 1050, loss[loss=0.2391, simple_loss=0.2971, pruned_loss=0.09056, over 23458.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.292, pruned_loss=0.08246, over 4690401.40 frames. ], batch size: 120, lr: 1.51e-02, grad_scale: 16.0 2023-09-29 02:26:40,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:26:41,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:26:41,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=219480.0, ans=0.125 2023-09-29 02:26:42,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:26:44,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:26:46,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:26:46,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=219480.0, ans=0.125 2023-09-29 02:26:49,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:26:51,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:26:53,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:26:54,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:26:54,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:26:56,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:26:56,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 02:26:57,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:26:57,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 02:27:00,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:27:00,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 02:27:02,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:27:09,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:27:10,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:27:10,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:27:14,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 02:27:14,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 02:27:15,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:27:16,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=219613.33333333334, ans=0.125 2023-09-29 02:27:19,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 02:27:21,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 02:27:22,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:26,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:27:28,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:27:29,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:27:31,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:27:34,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:27:37,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 02:27:39,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.034e+02 2.317e+02 2.689e+02 3.658e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 02:27:39,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 02:27:39,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 02:27:39,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:40,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:27:41,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=219680.0, ans=0.1 2023-09-29 02:27:42,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 02:27:47,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:27:49,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:27:49,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:27:49,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:49,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:52,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=219746.66666666666, ans=0.125 2023-09-29 02:27:55,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:27:55,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 02:27:56,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:27:56,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 02:27:56,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 02:27:58,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:28:00,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:03,716 INFO [train.py:1039] (3/4) Epoch 7, batch 1100, loss[loss=0.2268, simple_loss=0.2819, pruned_loss=0.08588, over 23789.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.292, pruned_loss=0.08249, over 4678636.52 frames. ], batch size: 212, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:28:05,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:28:09,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.39 vs. limit=15.0 2023-09-29 02:28:10,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:28:10,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:28:11,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:11,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 02:28:14,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=219813.33333333334, ans=0.125 2023-09-29 02:28:15,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:28:18,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 02:28:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:28:22,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=219880.0, ans=0.125 2023-09-29 02:28:24,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=219880.0, ans=0.0 2023-09-29 02:28:25,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:28:25,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 02:28:25,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:28:27,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:28:27,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:28:27,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=219880.0, ans=0.0 2023-09-29 02:28:32,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:28:33,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:28:40,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:28:43,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 02:28:44,502 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 02:28:44,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:47,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:49,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:28:49,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:28:51,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 02:28:52,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:28:52,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:28:52,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:28:53,087 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:28:54,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:28:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 02:29:00,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:29:01,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 02:29:03,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:29:07,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:29:10,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 02:29:10,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 02:29:11,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:12,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=220080.0, ans=0.0 2023-09-29 02:29:13,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:14,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:15,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=220080.0, ans=0.125 2023-09-29 02:29:16,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 02:29:16,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:29:16,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:29:18,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 02:29:18,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:29:19,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 02:29:19,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=220080.0, ans=0.0 2023-09-29 02:29:19,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=220080.0, ans=0.125 2023-09-29 02:29:21,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:29:21,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:29:22,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:29:24,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=220146.66666666666, ans=0.0 2023-09-29 02:29:26,263 INFO [train.py:1039] (3/4) Epoch 7, batch 1150, loss[loss=0.2113, simple_loss=0.29, pruned_loss=0.06627, over 24630.00 frames. ], tot_loss[loss=0.2293, simple_loss=0.2931, pruned_loss=0.08281, over 4688647.42 frames. ], batch size: 73, lr: 1.50e-02, grad_scale: 16.0 2023-09-29 02:29:28,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:30,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:29:34,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:29:34,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:29:35,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 02:29:35,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:35,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=220146.66666666666, ans=0.125 2023-09-29 02:29:38,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 02:29:38,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:38,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:29:45,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 02:29:48,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:29:51,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:29:52,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:29:53,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 02:29:53,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:29:53,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:29:59,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 02:29:59,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:00,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:30:13,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:30:21,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 02:30:22,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:24,042 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 2.131e+02 2.399e+02 2.911e+02 4.367e+02, threshold=4.797e+02, percent-clipped=0.0 2023-09-29 02:30:24,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:30,153 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 02:30:30,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:30:35,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=220413.33333333334, ans=0.2 2023-09-29 02:30:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 02:30:42,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:42,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=220413.33333333334, ans=0.1 2023-09-29 02:30:42,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=220413.33333333334, ans=0.1 2023-09-29 02:30:43,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:30:43,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:30:43,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:30:46,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:30:48,886 INFO [train.py:1039] (3/4) Epoch 7, batch 1200, loss[loss=0.2193, simple_loss=0.2941, pruned_loss=0.07228, over 24664.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.2943, pruned_loss=0.08304, over 4704128.29 frames. ], batch size: 65, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:30:51,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:30:51,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:30:54,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:30:54,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:30:55,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:30:57,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:30:57,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_ff2.min_abs, batch_count=220480.0, ans=0.1 2023-09-29 02:30:58,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:31:00,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:00,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:03,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 02:31:06,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 02:31:09,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=220546.66666666666, ans=0.0 2023-09-29 02:31:10,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:31:13,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:31:14,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=220546.66666666666, ans=0.125 2023-09-29 02:31:16,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:18,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:31:18,907 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 02:31:19,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:20,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=220613.33333333334, ans=0.0 2023-09-29 02:31:27,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 02:31:27,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:31:27,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 02:31:29,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:31:32,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 02:31:38,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 02:31:38,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:31:38,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:31:39,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:40,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:31:41,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:31:41,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:31:41,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:31:43,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 02:31:43,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:31:43,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:31:44,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:31:49,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:31:49,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:31:54,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:31:55,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=220746.66666666666, ans=0.125 2023-09-29 02:31:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:32:00,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 02:32:03,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 02:32:05,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:08,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:32:10,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:32:11,615 INFO [train.py:1039] (3/4) Epoch 7, batch 1250, loss[loss=0.2288, simple_loss=0.2912, pruned_loss=0.08319, over 23544.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.2945, pruned_loss=0.08296, over 4708741.26 frames. ], batch size: 120, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:32:11,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:32:14,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 02:32:17,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:32:19,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:19,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 02:32:22,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:32:23,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-29 02:32:24,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:32:28,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 02:32:28,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:29,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:32:29,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:32,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:32:36,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 02:32:36,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:32:36,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:32:36,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:32:38,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:41,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:32:41,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:32:46,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 02:32:46,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:32:46,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=220946.66666666666, ans=0.125 2023-09-29 02:32:48,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.02 vs. limit=22.5 2023-09-29 02:32:49,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:32:50,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 02:32:52,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:32:52,888 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 02:32:52,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:54,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:32:57,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:33:04,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:33:04,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 02:33:05,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 02:33:05,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 02:33:08,750 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.040e+02 2.260e+02 2.662e+02 4.055e+02, threshold=4.521e+02, percent-clipped=0.0 2023-09-29 02:33:10,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:11,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.82 vs. limit=10.0 2023-09-29 02:33:12,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 02:33:12,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:15,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:33:15,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:33:18,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 02:33:18,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 02:33:20,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:33:20,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:33:20,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:33:22,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 02:33:22,617 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-09-29 02:33:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:26,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:33:27,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:33:30,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:33:32,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:33:32,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 02:33:34,490 INFO [train.py:1039] (3/4) Epoch 7, batch 1300, loss[loss=0.2419, simple_loss=0.3107, pruned_loss=0.08656, over 24303.00 frames. ], tot_loss[loss=0.2297, simple_loss=0.2942, pruned_loss=0.08264, over 4719491.63 frames. ], batch size: 77, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:33:39,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:33:40,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 02:33:42,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:33:43,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:33:46,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:33:47,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 02:33:50,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.79 vs. limit=6.0 2023-09-29 02:33:50,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:33:52,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:33:53,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 02:33:58,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:34:03,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:05,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:06,195 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.17 vs. limit=15.0 2023-09-29 02:34:06,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:34:08,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:09,065 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:34:10,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:34:10,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 02:34:10,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 02:34:18,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:34:18,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:34:18,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 02:34:20,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 02:34:22,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:34:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:34:27,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 02:34:27,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:27,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 02:34:29,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:34:29,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=221346.66666666666, ans=0.125 2023-09-29 02:34:33,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:34:33,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:34:35,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=221346.66666666666, ans=0.125 2023-09-29 02:34:36,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 02:34:37,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=221346.66666666666, ans=0.05 2023-09-29 02:34:39,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 02:34:40,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 02:34:45,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:34:49,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 02:34:50,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:34:52,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=221413.33333333334, ans=0.2 2023-09-29 02:34:54,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=221413.33333333334, ans=0.125 2023-09-29 02:34:56,749 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:34:57,758 INFO [train.py:1039] (3/4) Epoch 7, batch 1350, loss[loss=0.2157, simple_loss=0.2943, pruned_loss=0.06855, over 24612.00 frames. ], tot_loss[loss=0.2304, simple_loss=0.2939, pruned_loss=0.08345, over 4717489.25 frames. ], batch size: 68, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:34:58,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 02:34:59,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=221480.0, ans=0.125 2023-09-29 02:35:02,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:04,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:07,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:35:08,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:35:10,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:35:10,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:15,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:35:15,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=221546.66666666666, ans=0.2 2023-09-29 02:35:17,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 02:35:18,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:20,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:35:22,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 02:35:22,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=221546.66666666666, ans=0.125 2023-09-29 02:35:24,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:35:25,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:35:25,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 02:35:27,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 02:35:29,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 02:35:31,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:31,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 02:35:42,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:50,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.84 vs. limit=12.0 2023-09-29 02:35:51,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:35:51,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:52,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 02:35:55,569 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.214e+02 2.683e+02 3.104e+02 4.290e+02, threshold=5.366e+02, percent-clipped=0.0 2023-09-29 02:35:55,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:35:57,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 02:35:57,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 02:35:59,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:36:01,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:36:03,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 02:36:04,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:36:08,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 02:36:09,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 02:36:11,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=221746.66666666666, ans=0.0 2023-09-29 02:36:17,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 02:36:19,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:36:20,694 INFO [train.py:1039] (3/4) Epoch 7, batch 1400, loss[loss=0.2218, simple_loss=0.2953, pruned_loss=0.07413, over 24465.00 frames. ], tot_loss[loss=0.2288, simple_loss=0.2918, pruned_loss=0.08294, over 4701871.03 frames. ], batch size: 66, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:36:23,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:36:23,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:36:27,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 02:36:31,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 02:36:39,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:36:41,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:36:44,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:36:44,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 02:36:49,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:36:50,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.26 vs. limit=15.0 2023-09-29 02:36:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 02:36:54,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.74 vs. limit=22.5 2023-09-29 02:36:58,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=221946.66666666666, ans=22.5 2023-09-29 02:37:00,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:06,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=221946.66666666666, ans=0.95 2023-09-29 02:37:07,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 02:37:09,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:37:09,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:37:09,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:37:11,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:12,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:37:12,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:37:14,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:37:15,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 02:37:15,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=222013.33333333334, ans=0.05 2023-09-29 02:37:17,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:37:20,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:23,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:37:32,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 02:37:33,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 02:37:33,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=222080.0, ans=0.04949747468305833 2023-09-29 02:37:33,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=222080.0, ans=0.2 2023-09-29 02:37:35,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:37:36,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 02:37:38,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:38,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:37:43,603 INFO [train.py:1039] (3/4) Epoch 7, batch 1450, loss[loss=0.2471, simple_loss=0.3047, pruned_loss=0.09476, over 23823.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2914, pruned_loss=0.08214, over 4709829.14 frames. ], batch size: 212, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:37:43,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:37:46,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:37:46,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:46,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 02:37:47,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=222146.66666666666, ans=0.125 2023-09-29 02:37:51,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:37:53,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:37:54,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:37:54,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 02:37:56,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:37:56,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 02:37:58,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:37:59,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:37:59,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 02:38:00,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:00,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:38:01,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 02:38:01,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:01,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:38:04,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:08,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:11,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:38:11,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:38:11,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=222213.33333333334, ans=0.125 2023-09-29 02:38:14,203 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.90 vs. limit=15.0 2023-09-29 02:38:14,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:38:14,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:18,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:38:18,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:38:18,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:38:20,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:23,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 02:38:26,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:38:29,635 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 02:38:31,324 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 02:38:32,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:32,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:38:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:36,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 02:38:36,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=222346.66666666666, ans=0.0 2023-09-29 02:38:39,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:41,368 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 2.076e+02 2.226e+02 2.557e+02 3.542e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 02:38:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 02:38:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 02:38:43,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=222346.66666666666, ans=0.1 2023-09-29 02:38:44,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:38:47,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:38:49,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:38:51,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 02:38:53,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 02:38:53,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 02:38:55,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:38:56,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:39:06,124 INFO [train.py:1039] (3/4) Epoch 7, batch 1500, loss[loss=0.2193, simple_loss=0.2844, pruned_loss=0.07714, over 23772.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2919, pruned_loss=0.08207, over 4721358.42 frames. ], batch size: 232, lr: 1.50e-02, grad_scale: 32.0 2023-09-29 02:39:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 02:39:09,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:39:10,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:39:11,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:11,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:13,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:39:13,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 02:39:13,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=222480.0, ans=0.125 2023-09-29 02:39:15,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=222480.0, ans=0.125 2023-09-29 02:39:16,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:39:16,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:39:16,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:39:18,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:39:19,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:39:21,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:23,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=222546.66666666666, ans=0.125 2023-09-29 02:39:25,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:39:27,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 02:39:27,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:39:27,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:39:29,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:32,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 02:39:35,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 02:39:38,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:39:38,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 02:39:41,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:39:43,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:39:45,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:39:45,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:39:45,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 02:39:46,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:39:46,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:48,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 02:39:50,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:39:56,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:39:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 02:40:00,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=222680.0, ans=0.1 2023-09-29 02:40:03,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 02:40:05,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:40:09,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 02:40:09,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:09,909 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 02:40:11,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:13,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:13,131 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 02:40:14,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:40:19,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 02:40:21,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:21,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=222746.66666666666, ans=0.0 2023-09-29 02:40:24,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:24,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:24,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:40:25,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:40:25,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:40:27,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 02:40:28,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 02:40:28,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=222813.33333333334, ans=0.025 2023-09-29 02:40:29,314 INFO [train.py:1039] (3/4) Epoch 7, batch 1550, loss[loss=0.207, simple_loss=0.2779, pruned_loss=0.06808, over 24455.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2923, pruned_loss=0.08183, over 4730794.23 frames. ], batch size: 63, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:40:29,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:40:30,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 02:40:31,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 02:40:32,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:36,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:36,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:40:36,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:40:37,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:38,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=222813.33333333334, ans=0.1 2023-09-29 02:40:39,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:40:40,986 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 02:40:42,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:42,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:40:43,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 02:40:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:40:45,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 02:40:47,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:40:48,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 02:40:48,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 02:40:48,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 02:40:50,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:40:51,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:40:56,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:40:59,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 02:40:59,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 02:41:07,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:11,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=222946.66666666666, ans=0.2 2023-09-29 02:41:13,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:41:13,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 02:41:13,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:41:13,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 02:41:19,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 02:41:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:24,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:41:24,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:41:25,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:41:25,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 02:41:25,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:27,785 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.115e+02 2.414e+02 2.837e+02 4.599e+02, threshold=4.828e+02, percent-clipped=1.0 2023-09-29 02:41:28,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=223013.33333333334, ans=0.125 2023-09-29 02:41:29,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:41:29,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:30,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:41:30,888 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 02:41:33,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:40,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 02:41:45,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:45,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:41:47,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 02:41:48,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:41:50,270 INFO [train.py:1039] (3/4) Epoch 7, batch 1600, loss[loss=0.2133, simple_loss=0.2834, pruned_loss=0.07155, over 24684.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2924, pruned_loss=0.08193, over 4724182.56 frames. ], batch size: 65, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:41:50,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:41:50,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:41:50,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:41:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:41:54,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:41:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 02:41:55,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=223146.66666666666, ans=0.04949747468305833 2023-09-29 02:41:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 02:41:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 02:41:59,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:03,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 02:42:03,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:42:06,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:42:11,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:42:15,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 02:42:17,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:42:19,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 02:42:19,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:19,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 02:42:27,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 02:42:27,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=223280.0, ans=0.125 2023-09-29 02:42:33,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:35,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 02:42:35,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:42:37,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:42:37,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:42:40,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 02:42:43,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 02:42:46,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:42:46,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:46,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:42:48,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:42:50,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:42:52,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:42:53,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:42:59,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=223413.33333333334, ans=0.125 2023-09-29 02:43:00,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:02,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:05,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 02:43:05,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:43:06,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 02:43:11,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:13,081 INFO [train.py:1039] (3/4) Epoch 7, batch 1650, loss[loss=0.2148, simple_loss=0.2916, pruned_loss=0.06899, over 24675.00 frames. ], tot_loss[loss=0.229, simple_loss=0.2933, pruned_loss=0.08231, over 4726629.68 frames. ], batch size: 68, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:43:14,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:43:14,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:43:14,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 02:43:14,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 02:43:14,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 02:43:14,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 02:43:19,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:43:21,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:22,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:43:22,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:43:26,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:43:28,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 02:43:28,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=223546.66666666666, ans=0.0 2023-09-29 02:43:30,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:43:30,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:43:30,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:43:30,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:43:32,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 02:43:33,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 02:43:39,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:43:41,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:43:42,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.18 vs. limit=15.0 2023-09-29 02:43:43,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=223546.66666666666, ans=0.125 2023-09-29 02:43:47,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=223613.33333333334, ans=0.04949747468305833 2023-09-29 02:43:48,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 02:43:50,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:43:51,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 02:43:56,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:43:59,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:43:59,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:43:59,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:01,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:44:01,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:05,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:05,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:07,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:08,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:10,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:44:13,135 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.058e+02 2.405e+02 2.744e+02 4.179e+02, threshold=4.810e+02, percent-clipped=0.0 2023-09-29 02:44:13,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:44:13,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 02:44:16,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:44:18,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 02:44:18,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 02:44:18,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 02:44:19,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:44:21,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:44:21,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:22,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:44:22,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 02:44:26,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:44:27,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:44:27,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:29,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 02:44:34,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:44:34,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:44:35,909 INFO [train.py:1039] (3/4) Epoch 7, batch 1700, loss[loss=0.209, simple_loss=0.2475, pruned_loss=0.08527, over 19173.00 frames. ], tot_loss[loss=0.2286, simple_loss=0.2929, pruned_loss=0.08216, over 4726681.05 frames. ], batch size: 388, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:44:35,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 02:44:38,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:44:38,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:44:38,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:38,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=223813.33333333334, ans=0.125 2023-09-29 02:44:39,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:44:39,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:44:39,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 02:44:42,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:44:44,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=223813.33333333334, ans=0.125 2023-09-29 02:44:51,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:44:52,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:44:52,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=223880.0, ans=0.0 2023-09-29 02:44:58,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:44:59,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:44:59,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:45:00,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:04,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=223880.0, ans=0.125 2023-09-29 02:45:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 02:45:07,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:45:07,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:08,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:45:10,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 02:45:12,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 02:45:13,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 02:45:15,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:16,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 02:45:17,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:45:27,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:29,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:30,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:45:32,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=224013.33333333334, ans=0.125 2023-09-29 02:45:33,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:45:33,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 02:45:33,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:45:35,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:35,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 02:45:36,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:45:36,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:36,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:45:36,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:45:41,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:45:41,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:45:41,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:45:41,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:45:43,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:48,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:50,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 02:45:52,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:45:52,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:45:55,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 02:45:58,956 INFO [train.py:1039] (3/4) Epoch 7, batch 1750, loss[loss=0.2369, simple_loss=0.3077, pruned_loss=0.08304, over 24078.00 frames. ], tot_loss[loss=0.2271, simple_loss=0.2914, pruned_loss=0.0814, over 4729202.99 frames. ], batch size: 86, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:46:02,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:03,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:05,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 02:46:05,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 02:46:06,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:46:08,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=224146.66666666666, ans=0.125 2023-09-29 02:46:09,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:46:09,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:14,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 02:46:17,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:46:19,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 02:46:21,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:21,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:46:25,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:46:26,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 02:46:27,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:46:28,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 02:46:28,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=224213.33333333334, ans=0.125 2023-09-29 02:46:36,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:46:40,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=224280.0, ans=0.04949747468305833 2023-09-29 02:46:41,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:46:41,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:44,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:44,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:46:46,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:46:48,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:46:51,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:52,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:46:52,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 02:46:55,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:46:58,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 02:46:59,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:47:00,213 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.434e+02 2.137e+02 2.415e+02 2.786e+02 3.944e+02, threshold=4.830e+02, percent-clipped=0.0 2023-09-29 02:47:00,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:01,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:47:05,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:47:06,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 02:47:08,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:09,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:47:13,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:17,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:17,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:47:18,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=224413.33333333334, ans=0.125 2023-09-29 02:47:19,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 02:47:19,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:19,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=224480.0, ans=0.0 2023-09-29 02:47:21,252 INFO [train.py:1039] (3/4) Epoch 7, batch 1800, loss[loss=0.208, simple_loss=0.2836, pruned_loss=0.06617, over 24476.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2906, pruned_loss=0.08123, over 4728190.67 frames. ], batch size: 66, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:47:21,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:47:21,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:21,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 02:47:21,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:47:22,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:47:24,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:47:26,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:47:28,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:47:29,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:47:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:47:35,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:47:38,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:41,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:41,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:43,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:47:44,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.80 vs. limit=15.0 2023-09-29 02:47:46,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:47:46,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 02:47:46,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:49,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:47:52,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 02:47:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 02:47:55,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 02:47:56,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:47:57,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:47:57,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:47:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:48:06,749 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 02:48:08,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:48:11,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:12,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 02:48:14,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 02:48:14,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 02:48:16,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:48:16,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:48:20,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=224680.0, ans=0.0 2023-09-29 02:48:20,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=224680.0, ans=0.125 2023-09-29 02:48:22,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 02:48:26,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:48:27,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 02:48:27,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:48:27,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:29,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:48:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 02:48:32,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:48:32,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:48:36,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 02:48:36,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:48:38,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:38,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:48:38,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:48:40,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:48:45,582 INFO [train.py:1039] (3/4) Epoch 7, batch 1850, loss[loss=0.2462, simple_loss=0.3055, pruned_loss=0.09342, over 23432.00 frames. ], tot_loss[loss=0.227, simple_loss=0.291, pruned_loss=0.08145, over 4716478.20 frames. ], batch size: 285, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:48:45,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:48:45,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:48:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:48:48,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:48:50,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=224813.33333333334, ans=0.0 2023-09-29 02:48:55,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:48:56,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 02:48:59,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 02:49:02,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 02:49:04,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=224880.0, ans=0.0 2023-09-29 02:49:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:07,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 02:49:07,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 02:49:08,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=224880.0, ans=15.0 2023-09-29 02:49:11,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=224880.0, ans=0.125 2023-09-29 02:49:14,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=224880.0, ans=0.1 2023-09-29 02:49:19,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:49:21,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 02:49:25,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:49:25,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:49:28,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 02:49:28,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:29,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:49:31,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:49:33,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 02:49:36,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:49:40,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:49:41,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:49:41,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:49:41,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:49:42,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.24 vs. limit=22.5 2023-09-29 02:49:44,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:46,012 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.718e+02 2.062e+02 2.233e+02 2.588e+02 4.432e+02, threshold=4.466e+02, percent-clipped=0.0 2023-09-29 02:49:46,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:49:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 02:49:51,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:49:55,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:49:57,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 02:49:57,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 02:49:57,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 02:49:58,773 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 02:50:02,053 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 02:50:03,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:50:03,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:50:03,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:03,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,127 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 02:50:05,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:50:05,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:05,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:50:08,216 INFO [train.py:1039] (3/4) Epoch 7, batch 1900, loss[loss=0.1963, simple_loss=0.269, pruned_loss=0.06185, over 19714.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.2911, pruned_loss=0.08132, over 4715796.05 frames. ], batch size: 43, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:50:08,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:50:09,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:50:09,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 02:50:13,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:13,471 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 02:50:13,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 02:50:14,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:21,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:50:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 02:50:24,825 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 02:50:26,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 02:50:27,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 02:50:27,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:50:27,976 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 02:50:29,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 02:50:33,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 02:50:34,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=225213.33333333334, ans=0.125 2023-09-29 02:50:36,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:50:39,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 02:50:42,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 02:50:46,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=225280.0, ans=0.125 2023-09-29 02:50:54,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 02:50:55,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 02:50:55,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:50:56,790 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.50 vs. limit=6.0 2023-09-29 02:50:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 02:50:57,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 02:50:57,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 02:50:57,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 02:50:57,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:02,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 02:51:05,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:51:07,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=225346.66666666666, ans=0.1 2023-09-29 02:51:08,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:08,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 02:51:12,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:51:12,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=225413.33333333334, ans=0.05 2023-09-29 02:51:15,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 02:51:17,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:22,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 02:51:22,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:51:22,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:51:24,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:51:26,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 02:51:27,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:51:27,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:51:30,587 INFO [train.py:1039] (3/4) Epoch 7, batch 1950, loss[loss=0.2124, simple_loss=0.2699, pruned_loss=0.07748, over 23699.00 frames. ], tot_loss[loss=0.2281, simple_loss=0.2918, pruned_loss=0.08216, over 4720447.74 frames. ], batch size: 149, lr: 1.49e-02, grad_scale: 16.0 2023-09-29 02:51:30,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:30,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:51:34,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:51:34,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:51:34,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 02:51:36,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:51:40,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:42,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:51:44,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:44,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:51:45,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 02:51:45,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:51:47,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:47,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:51:50,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:51:50,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:51:50,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:51:53,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:51:55,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:51:57,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:51:57,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 02:51:57,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:00,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:05,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:52:05,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:05,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 02:52:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 02:52:07,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:52:07,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:52:07,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:12,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:52:19,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 02:52:20,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:52:20,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:52:22,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 02:52:23,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:52:27,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:52:28,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:52:30,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:32,069 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.833e+02 2.356e+02 2.885e+02 3.538e+02 5.665e+02, threshold=5.770e+02, percent-clipped=6.0 2023-09-29 02:52:37,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:37,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=225746.66666666666, ans=0.0 2023-09-29 02:52:38,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:40,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.21 vs. limit=15.0 2023-09-29 02:52:42,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:52:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:47,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:52:47,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:52:49,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 02:52:49,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 02:52:49,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:52:51,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 02:52:53,928 INFO [train.py:1039] (3/4) Epoch 7, batch 2000, loss[loss=0.2434, simple_loss=0.304, pruned_loss=0.09136, over 23463.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.2924, pruned_loss=0.08228, over 4726584.95 frames. ], batch size: 120, lr: 1.49e-02, grad_scale: 32.0 2023-09-29 02:52:54,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:52:57,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:52:58,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:52:58,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:01,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:53:03,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:03,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=225813.33333333334, ans=0.2 2023-09-29 02:53:06,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.42 vs. limit=6.0 2023-09-29 02:53:06,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 02:53:08,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 02:53:10,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:53:13,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 02:53:15,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 02:53:15,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:53:17,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:53:18,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 02:53:22,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:24,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:25,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 02:53:25,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 02:53:28,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 02:53:28,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:31,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:53:31,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 02:53:31,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:33,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:34,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:35,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=225946.66666666666, ans=0.1 2023-09-29 02:53:36,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 02:53:37,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 02:53:37,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:53:37,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:53:43,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:45,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:53:45,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:47,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:53:49,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:53:49,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:49,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=226013.33333333334, ans=0.125 2023-09-29 02:53:50,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:53:50,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:53:52,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:53:56,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:53:56,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 02:54:02,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:54:03,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:03,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=226080.0, ans=0.125 2023-09-29 02:54:08,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:08,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:54:11,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:15,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:15,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:54:16,479 INFO [train.py:1039] (3/4) Epoch 7, batch 2050, loss[loss=0.2303, simple_loss=0.2722, pruned_loss=0.09416, over 22657.00 frames. ], tot_loss[loss=0.228, simple_loss=0.2919, pruned_loss=0.08203, over 4729619.24 frames. ], batch size: 322, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 02:54:16,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:54:19,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:19,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:23,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:54:23,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:30,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:54:31,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:54:31,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:54:33,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:54:35,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 02:54:35,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:54:35,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=226213.33333333334, ans=0.1 2023-09-29 02:54:36,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:54:38,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:54:40,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=226213.33333333334, ans=0.0 2023-09-29 02:54:47,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:47,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:51,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 02:54:51,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=226280.0, ans=0.125 2023-09-29 02:54:53,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:54:54,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 02:54:54,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:54:56,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:54:59,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=226280.0, ans=0.0 2023-09-29 02:55:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:00,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 02:55:00,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:55:02,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:55:03,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:55:04,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 02:55:07,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:10,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 02:55:12,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 02:55:14,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:55:15,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=226346.66666666666, ans=0.1 2023-09-29 02:55:19,287 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.164e+02 2.389e+02 2.987e+02 5.025e+02, threshold=4.777e+02, percent-clipped=0.0 2023-09-29 02:55:19,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:26,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:55:27,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=226413.33333333334, ans=0.1 2023-09-29 02:55:28,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 02:55:33,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:55:36,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 02:55:38,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 02:55:39,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=226480.0, ans=0.2 2023-09-29 02:55:40,153 INFO [train.py:1039] (3/4) Epoch 7, batch 2100, loss[loss=0.2555, simple_loss=0.31, pruned_loss=0.1005, over 23298.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2904, pruned_loss=0.08144, over 4710329.06 frames. ], batch size: 119, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:55:41,937 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 02:55:41,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:42,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:55:42,730 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.93 vs. limit=15.0 2023-09-29 02:55:43,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:55:43,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:55:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 02:55:43,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 02:55:46,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 02:55:49,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 02:55:49,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:55:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:55:54,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:55:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 02:55:56,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 02:55:56,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 02:55:56,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 02:55:58,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:55:59,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:55:59,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 02:56:00,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 02:56:05,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 02:56:05,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 02:56:08,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:56:08,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:56:13,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:56:13,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 02:56:14,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:14,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 02:56:17,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 02:56:17,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:17,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 02:56:17,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 02:56:19,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 02:56:22,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:56:23,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:56:26,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.87 vs. limit=15.0 2023-09-29 02:56:27,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:28,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 02:56:30,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:32,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:32,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 02:56:32,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:32,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:56:32,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=226680.0, ans=0.1 2023-09-29 02:56:34,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:56:34,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 02:56:36,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 02:56:36,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 02:56:41,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=226680.0, ans=0.125 2023-09-29 02:56:42,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 02:56:45,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 02:56:47,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 02:56:52,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:55,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 02:56:56,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:56:56,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:56:56,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 02:56:56,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:56:58,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:56:58,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:56:59,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 02:56:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:01,350 INFO [train.py:1039] (3/4) Epoch 7, batch 2150, loss[loss=0.2121, simple_loss=0.2887, pruned_loss=0.06781, over 24472.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2903, pruned_loss=0.08132, over 4714854.55 frames. ], batch size: 66, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:57:02,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 02:57:04,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 02:57:04,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:07,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:57:07,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 02:57:07,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 02:57:07,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:57:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 02:57:15,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:15,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:18,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:57:18,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:20,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:57:23,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:23,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 02:57:23,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 02:57:26,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:27,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 02:57:32,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:34,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 02:57:35,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:35,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:35,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:36,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:57:37,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:57:37,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 02:57:38,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:57:41,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 02:57:42,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 02:57:44,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:44,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:46,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 02:57:46,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:57:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:57:49,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:57:51,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:57:51,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 02:57:53,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 02:57:54,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.86 vs. limit=12.0 2023-09-29 02:57:56,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:57:56,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:57:58,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:58:00,530 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.39 vs. limit=15.0 2023-09-29 02:58:01,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 02:58:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:01,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:01,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 02:58:04,433 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.690e+02 2.076e+02 2.402e+02 2.714e+02 3.938e+02, threshold=4.805e+02, percent-clipped=0.0 2023-09-29 02:58:04,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 02:58:04,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 02:58:06,016 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 02:58:06,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:06,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:58:07,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 02:58:07,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 02:58:07,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 02:58:07,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 02:58:07,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 02:58:09,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 02:58:10,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:12,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:58:12,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 02:58:12,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 02:58:15,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:15,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:16,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=227080.0, ans=0.2 2023-09-29 02:58:23,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 02:58:24,647 INFO [train.py:1039] (3/4) Epoch 7, batch 2200, loss[loss=0.2164, simple_loss=0.2946, pruned_loss=0.06912, over 24375.00 frames. ], tot_loss[loss=0.2265, simple_loss=0.2905, pruned_loss=0.08126, over 4711143.75 frames. ], batch size: 74, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:58:24,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 02:58:29,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:58:34,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:36,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 02:58:36,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:58:37,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 02:58:38,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=227146.66666666666, ans=0.125 2023-09-29 02:58:39,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:58:39,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 02:58:39,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 02:58:41,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=227213.33333333334, ans=0.125 2023-09-29 02:58:44,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 02:58:45,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 02:58:53,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 02:58:58,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:58:58,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:58:58,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 02:59:03,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 02:59:03,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 02:59:04,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=227280.0, ans=0.0 2023-09-29 02:59:05,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 02:59:08,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:08,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 02:59:08,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=227280.0, ans=0.125 2023-09-29 02:59:11,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 02:59:11,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=227280.0, ans=0.0 2023-09-29 02:59:13,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=227346.66666666666, ans=0.125 2023-09-29 02:59:14,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:14,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=227346.66666666666, ans=0.125 2023-09-29 02:59:15,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 02:59:17,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:18,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=227346.66666666666, ans=0.0 2023-09-29 02:59:19,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 02:59:20,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:21,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 02:59:24,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:24,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 02:59:24,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 02:59:27,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 02:59:28,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 02:59:28,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:28,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 02:59:30,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 02:59:31,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 02:59:33,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 02:59:35,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=227413.33333333334, ans=0.125 2023-09-29 02:59:38,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 02:59:38,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 02:59:40,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 02:59:41,787 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 02:59:43,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 02:59:43,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 02:59:45,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 02:59:45,170 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 02:59:46,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:48,147 INFO [train.py:1039] (3/4) Epoch 7, batch 2250, loss[loss=0.2043, simple_loss=0.2788, pruned_loss=0.06488, over 24505.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.2904, pruned_loss=0.08095, over 4722892.93 frames. ], batch size: 63, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 02:59:48,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 02:59:48,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 02:59:51,327 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 02:59:51,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 02:59:54,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:02,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:00:04,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:00:08,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:08,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=227546.66666666666, ans=0.125 2023-09-29 03:00:08,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=227546.66666666666, ans=0.1 2023-09-29 03:00:10,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:11,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:00:12,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=227546.66666666666, ans=0.125 2023-09-29 03:00:13,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 03:00:13,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:14,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:00:16,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 03:00:18,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:00:18,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:19,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:00:25,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:27,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:00:27,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:00:28,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 03:00:30,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:00:30,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=227613.33333333334, ans=0.05 2023-09-29 03:00:31,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:00:35,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:37,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:00:39,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:00:39,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:00:41,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:00:42,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:00:47,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:00:50,695 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.687e+02 2.075e+02 2.318e+02 2.667e+02 5.695e+02, threshold=4.636e+02, percent-clipped=1.0 2023-09-29 03:00:50,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:00:55,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:00:56,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:00:56,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:01:01,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:01:01,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=227746.66666666666, ans=0.2 2023-09-29 03:01:02,583 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.79 vs. limit=22.5 2023-09-29 03:01:03,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:01:03,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 03:01:04,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:04,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:01:04,960 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:01:08,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 03:01:09,684 INFO [train.py:1039] (3/4) Epoch 7, batch 2300, loss[loss=0.2325, simple_loss=0.3081, pruned_loss=0.07846, over 24018.00 frames. ], tot_loss[loss=0.2268, simple_loss=0.2913, pruned_loss=0.08116, over 4724722.62 frames. ], batch size: 80, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:01:13,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:01:13,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:15,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=227813.33333333334, ans=0.1 2023-09-29 03:01:19,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:01:20,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:01:23,502 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 03:01:25,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:31,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:01:31,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:01:31,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:01:32,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:32,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 03:01:34,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:01:36,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:01:37,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:01:40,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:01:44,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:01:50,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:01:51,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=227946.66666666666, ans=0.0 2023-09-29 03:01:56,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:01:56,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:01:59,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:02:00,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.15 vs. limit=22.5 2023-09-29 03:02:01,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:04,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:02:06,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:02:06,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:02:06,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 03:02:09,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:02:09,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:11,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:02:11,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:12,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:02:12,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:02:12,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 03:02:12,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:02:12,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:14,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 03:02:21,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:02:25,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:02:30,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:02:30,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:02:30,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:02:32,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:02:32,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:02:33,863 INFO [train.py:1039] (3/4) Epoch 7, batch 2350, loss[loss=0.2009, simple_loss=0.2773, pruned_loss=0.06219, over 24639.00 frames. ], tot_loss[loss=0.2276, simple_loss=0.2917, pruned_loss=0.08171, over 4726042.90 frames. ], batch size: 73, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:02:33,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:02:35,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 03:02:36,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.19 vs. limit=12.0 2023-09-29 03:02:40,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=228146.66666666666, ans=0.125 2023-09-29 03:02:41,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:02:41,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 03:02:48,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 03:02:51,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:02:54,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:02:54,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:02:54,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:02:58,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 03:03:01,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:03:08,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 03:03:09,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:03:12,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:03:12,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:03:14,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:03:15,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 03:03:17,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:03:18,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:03:18,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:18,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:03:21,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:03:25,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 03:03:26,197 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.99 vs. limit=15.0 2023-09-29 03:03:27,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:03:30,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:03:30,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:03:31,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 03:03:31,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:03:34,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 03:03:34,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:03:36,034 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.092e+02 2.385e+02 2.952e+02 4.935e+02, threshold=4.770e+02, percent-clipped=1.0 2023-09-29 03:03:39,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 03:03:44,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 03:03:46,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:03:46,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:03:46,095 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 03:03:46,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 03:03:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 03:03:49,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=228413.33333333334, ans=0.125 2023-09-29 03:03:50,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:03:51,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=228413.33333333334, ans=0.5 2023-09-29 03:03:55,182 INFO [train.py:1039] (3/4) Epoch 7, batch 2400, loss[loss=0.2427, simple_loss=0.2924, pruned_loss=0.09647, over 23617.00 frames. ], tot_loss[loss=0.2279, simple_loss=0.2914, pruned_loss=0.08214, over 4718984.21 frames. ], batch size: 256, lr: 1.48e-02, grad_scale: 32.0 2023-09-29 03:03:55,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=228480.0, ans=0.125 2023-09-29 03:03:55,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=228480.0, ans=0.0 2023-09-29 03:03:57,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:03:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:04:02,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:04:03,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 03:04:05,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 03:04:12,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:04:12,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:16,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 03:04:16,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:04:17,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:17,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 03:04:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:27,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 03:04:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:04:34,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 03:04:38,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:04:39,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=228613.33333333334, ans=0.125 2023-09-29 03:04:40,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:04:44,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:04:44,895 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:04:46,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 03:04:46,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:04:48,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.59 vs. limit=15.0 2023-09-29 03:04:48,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.88 vs. limit=15.0 2023-09-29 03:04:52,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:04:54,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:04:56,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:04:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:04:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:04:58,577 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.20 vs. limit=15.0 2023-09-29 03:04:59,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:04:59,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:04:59,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:00,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:05:03,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:04,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:05:04,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 03:05:06,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 03:05:06,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=228746.66666666666, ans=0.125 2023-09-29 03:05:08,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=228746.66666666666, ans=0.125 2023-09-29 03:05:09,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:05:09,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:05:11,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 03:05:11,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 03:05:12,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 03:05:12,023 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 03:05:12,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 03:05:13,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:05:16,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:16,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:16,883 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 03:05:18,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:18,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:05:19,798 INFO [train.py:1039] (3/4) Epoch 7, batch 2450, loss[loss=0.226, simple_loss=0.2865, pruned_loss=0.08279, over 23502.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.29, pruned_loss=0.0816, over 4712675.78 frames. ], batch size: 134, lr: 1.48e-02, grad_scale: 16.0 2023-09-29 03:05:23,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:05:24,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:05:29,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:29,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:05:29,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 03:05:36,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:05:36,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:40,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:05:40,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:05:41,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:05:41,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 03:05:46,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:05:47,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:05:49,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:05:51,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:05:52,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:52,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:05:53,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.32 vs. limit=15.0 2023-09-29 03:05:54,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:05:55,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 03:05:57,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:05:59,119 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.37 vs. limit=15.0 2023-09-29 03:06:01,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=228946.66666666666, ans=0.125 2023-09-29 03:06:06,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:07,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=229013.33333333334, ans=0.2 2023-09-29 03:06:08,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:06:08,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:09,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:06:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:11,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:06:12,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 03:06:15,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:06:17,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:06:20,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:06:20,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:06:23,664 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 2.159e+02 2.588e+02 3.094e+02 4.619e+02, threshold=5.175e+02, percent-clipped=0.0 2023-09-29 03:06:24,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:06:24,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 03:06:26,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:06:26,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:06:26,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 03:06:28,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:06:29,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:06:32,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:06:34,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=229080.0, ans=0.0 2023-09-29 03:06:36,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:06:36,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:06:39,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 03:06:41,318 INFO [train.py:1039] (3/4) Epoch 7, batch 2500, loss[loss=0.2352, simple_loss=0.3032, pruned_loss=0.08362, over 23711.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2892, pruned_loss=0.08116, over 4711804.73 frames. ], batch size: 85, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:06:41,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:06:49,277 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.48 vs. limit=6.0 2023-09-29 03:06:49,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:06:59,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:07:01,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:07:01,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:07:01,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 03:07:01,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=229213.33333333334, ans=0.1 2023-09-29 03:07:09,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:07:09,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:10,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:07:10,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:07:12,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 03:07:13,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:15,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 03:07:15,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:15,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 03:07:15,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:17,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=229280.0, ans=0.2 2023-09-29 03:07:22,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:07:24,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:07:27,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:07:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 03:07:29,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:07:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:34,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:37,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:07:39,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.44 vs. limit=22.5 2023-09-29 03:07:40,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:07:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:07:47,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 03:07:47,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:07:47,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:07:50,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:07:50,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:07:52,934 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 03:07:52,935 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 03:07:52,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 03:07:53,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=229413.33333333334, ans=0.125 2023-09-29 03:07:56,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:07:58,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 03:07:58,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 03:07:59,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:08:00,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 03:08:04,190 INFO [train.py:1039] (3/4) Epoch 7, batch 2550, loss[loss=0.2383, simple_loss=0.3098, pruned_loss=0.08345, over 24298.00 frames. ], tot_loss[loss=0.2256, simple_loss=0.2891, pruned_loss=0.08107, over 4719022.86 frames. ], batch size: 74, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:08:04,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 03:08:07,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:09,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=229480.0, ans=0.1 2023-09-29 03:08:10,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:08:10,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:08:10,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=229480.0, ans=0.025 2023-09-29 03:08:11,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:08:13,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 03:08:13,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:08:17,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 03:08:19,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:08:21,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:24,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:08:24,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 03:08:24,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:08:26,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:26,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:29,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:08:31,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 03:08:31,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:08:31,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:31,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 03:08:44,290 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.19 vs. limit=15.0 2023-09-29 03:08:45,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:08:49,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:08:50,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=229613.33333333334, ans=0.125 2023-09-29 03:08:51,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:08:51,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:08:51,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:08:51,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=229680.0, ans=0.125 2023-09-29 03:08:58,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:08:59,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:09:01,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:09:01,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:09:02,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:09:02,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:09:06,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:06,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:06,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=229680.0, ans=0.125 2023-09-29 03:09:07,775 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.133e+02 2.441e+02 2.869e+02 4.948e+02, threshold=4.883e+02, percent-clipped=0.0 2023-09-29 03:09:11,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:09:11,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 03:09:11,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:09:13,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:09:13,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:09:15,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:09:17,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:23,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:09:26,269 INFO [train.py:1039] (3/4) Epoch 7, batch 2600, loss[loss=0.2298, simple_loss=0.2947, pruned_loss=0.08244, over 23348.00 frames. ], tot_loss[loss=0.2273, simple_loss=0.291, pruned_loss=0.08182, over 4714166.74 frames. ], batch size: 105, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:09:26,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:09:27,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.20 vs. limit=6.0 2023-09-29 03:09:28,513 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 03:09:31,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 03:09:31,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:09:31,670 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 03:09:33,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 03:09:34,561 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 03:09:37,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:09:37,716 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 03:09:39,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 03:09:42,772 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 03:09:43,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:09:45,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 03:09:46,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 03:09:49,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:09:49,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 03:09:52,167 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:09:53,323 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 03:09:53,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 03:09:58,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=229946.66666666666, ans=0.0 2023-09-29 03:09:59,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:09:59,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:01,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:01,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 03:10:03,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:10:04,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.93 vs. limit=15.0 2023-09-29 03:10:08,077 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 03:10:14,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:16,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:17,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 03:10:19,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:19,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:10:19,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 03:10:21,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=230013.33333333334, ans=0.0 2023-09-29 03:10:22,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:10:22,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:10:24,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:26,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=230013.33333333334, ans=0.0 2023-09-29 03:10:29,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 03:10:29,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:10:30,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:10:35,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:10:37,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:10:37,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 03:10:37,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:10:40,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:10:42,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:10:46,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 03:10:48,337 INFO [train.py:1039] (3/4) Epoch 7, batch 2650, loss[loss=0.2532, simple_loss=0.3092, pruned_loss=0.09856, over 23442.00 frames. ], tot_loss[loss=0.2267, simple_loss=0.2911, pruned_loss=0.0812, over 4720810.89 frames. ], batch size: 93, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:10:48,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:50,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:10:54,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 03:10:54,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:10:55,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:10:57,395 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 03:10:57,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:10:59,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:03,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:11:05,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:11:08,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:11:09,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 03:11:09,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:11:09,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:11:13,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 03:11:15,156 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 03:11:18,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:18,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 03:11:19,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:20,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=230280.0, ans=0.125 2023-09-29 03:11:21,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 03:11:26,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:11:26,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:26,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:28,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=230280.0, ans=0.2 2023-09-29 03:11:31,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 03:11:31,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 03:11:35,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:11:38,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 03:11:39,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:11:41,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:11:41,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:11:41,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:41,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:11:44,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:11:46,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:48,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:11:48,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:11:49,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:11:51,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:51,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:11:52,730 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.070e+02 2.281e+02 2.771e+02 4.083e+02, threshold=4.562e+02, percent-clipped=0.0 2023-09-29 03:11:52,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:11:54,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:11:54,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:11:56,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=230413.33333333334, ans=0.125 2023-09-29 03:11:58,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:00,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:12:00,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:00,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 03:12:05,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:07,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:07,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:08,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:08,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:12:10,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:11,463 INFO [train.py:1039] (3/4) Epoch 7, batch 2700, loss[loss=0.2312, simple_loss=0.309, pruned_loss=0.07673, over 24551.00 frames. ], tot_loss[loss=0.2277, simple_loss=0.2924, pruned_loss=0.08149, over 4729433.52 frames. ], batch size: 71, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:12:11,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=230480.0, ans=0.0 2023-09-29 03:12:13,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:13,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 03:12:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:12:18,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:12:21,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:12:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:21,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:22,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:12:22,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:12:22,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:12:22,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:12:24,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 03:12:24,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:12:27,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:12:28,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:12:29,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:12:32,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:12:32,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 03:12:33,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:12:37,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=230546.66666666666, ans=0.0 2023-09-29 03:12:39,231 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.28 vs. limit=15.0 2023-09-29 03:12:42,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:12:42,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:12:47,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:12:47,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:12:47,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:12:47,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:12:51,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:12:54,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:12:54,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:12:54,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:12:58,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:12:58,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:13:01,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=230680.0, ans=0.125 2023-09-29 03:13:06,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.88 vs. limit=22.5 2023-09-29 03:13:08,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:13:08,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:13,340 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.39 vs. limit=22.5 2023-09-29 03:13:13,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:13:13,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:16,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:16,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=230746.66666666666, ans=0.0 2023-09-29 03:13:17,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:19,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:13:20,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:22,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:13:22,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:25,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:13:26,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:26,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:13:30,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 03:13:30,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:32,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:13:32,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 03:13:34,024 INFO [train.py:1039] (3/4) Epoch 7, batch 2750, loss[loss=0.2058, simple_loss=0.2524, pruned_loss=0.07959, over 23584.00 frames. ], tot_loss[loss=0.2285, simple_loss=0.293, pruned_loss=0.082, over 4717981.53 frames. ], batch size: 256, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:13:34,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 03:13:34,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:38,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:13:38,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:13:39,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=10.70 vs. limit=15.0 2023-09-29 03:13:41,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:41,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:13:42,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:46,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:13:47,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:13:47,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:13:47,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:47,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 03:13:47,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:13:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:13:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 03:13:57,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:13:57,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:13:57,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:13:57,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:13:58,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:00,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:14:00,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=230880.0, ans=0.125 2023-09-29 03:14:01,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:01,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:08,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:14:08,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:14:08,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:14:09,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:10,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:14:16,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:14:19,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:14:19,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:23,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:14:23,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:14:25,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:14:31,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:14:31,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:14:31,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 03:14:36,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:14:36,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 03:14:38,314 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 2.012e+02 2.382e+02 2.660e+02 4.649e+02, threshold=4.763e+02, percent-clipped=1.0 2023-09-29 03:14:41,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:14:42,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=231080.0, ans=0.1 2023-09-29 03:14:44,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:14:44,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 03:14:45,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=231080.0, ans=0.0 2023-09-29 03:14:46,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:14:50,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:14:50,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 03:14:50,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:14:54,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:14:55,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:14:55,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:14:55,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 03:14:55,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:14:57,146 INFO [train.py:1039] (3/4) Epoch 7, batch 2800, loss[loss=0.2236, simple_loss=0.2976, pruned_loss=0.07485, over 24655.00 frames. ], tot_loss[loss=0.2275, simple_loss=0.2912, pruned_loss=0.08192, over 4698893.61 frames. ], batch size: 65, lr: 1.47e-02, grad_scale: 32.0 2023-09-29 03:14:57,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:00,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:00,206 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 03:15:00,208 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 03:15:03,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:15:06,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:15:08,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=231146.66666666666, ans=0.0 2023-09-29 03:15:09,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:15:10,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.89 vs. limit=12.0 2023-09-29 03:15:12,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 03:15:15,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:15:17,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 03:15:17,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:19,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:15:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:23,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:23,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:15:23,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:15:23,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=231213.33333333334, ans=0.2 2023-09-29 03:15:25,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:15:34,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:15:37,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:15:40,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:40,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:15:42,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:15:48,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:15:48,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 03:15:48,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:50,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:15:50,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:15:53,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:15:55,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:15:59,090 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.18 vs. limit=6.0 2023-09-29 03:15:59,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:16:01,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:16:04,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:16:04,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:16:04,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:16:05,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:16:05,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 03:16:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:16:08,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:08,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 03:16:10,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:10,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:16:10,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:16:10,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=231413.33333333334, ans=0.0 2023-09-29 03:16:11,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 03:16:17,966 INFO [train.py:1039] (3/4) Epoch 7, batch 2850, loss[loss=0.2591, simple_loss=0.3061, pruned_loss=0.1061, over 22761.00 frames. ], tot_loss[loss=0.227, simple_loss=0.2908, pruned_loss=0.0816, over 4702335.63 frames. ], batch size: 322, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:16:18,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:16:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:16:19,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:16:21,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:25,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:16:25,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:16:25,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:16:29,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:16:29,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:16:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:16:31,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 03:16:38,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 03:16:38,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:16:40,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 03:16:41,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:46,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 03:16:46,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 03:16:47,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:16:58,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=231613.33333333334, ans=0.125 2023-09-29 03:17:00,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:00,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:17:02,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:17:02,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:17:02,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:17:02,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=231613.33333333334, ans=0.2 2023-09-29 03:17:04,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:17:04,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 03:17:06,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:17:07,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:08,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:09,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:12,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:13,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:16,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:17:19,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:17:19,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:21,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:24,052 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.041e+02 2.225e+02 2.602e+02 4.724e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 03:17:24,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:17:28,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:17:30,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 03:17:30,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 03:17:31,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:17:33,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:33,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 03:17:35,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:17:35,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:17:35,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:35,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:17:35,468 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 03:17:35,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.max_positive, batch_count=231746.66666666666, ans=0.95 2023-09-29 03:17:36,218 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.08 vs. limit=15.0 2023-09-29 03:17:37,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 03:17:37,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:17:37,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:17:40,487 INFO [train.py:1039] (3/4) Epoch 7, batch 2900, loss[loss=0.2448, simple_loss=0.3143, pruned_loss=0.08761, over 23776.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2904, pruned_loss=0.08073, over 4718753.58 frames. ], batch size: 85, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:17:40,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=231813.33333333334, ans=0.125 2023-09-29 03:17:42,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:17:42,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:17:43,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:17:44,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 03:17:49,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:17:49,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 03:17:49,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 03:17:51,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:17:51,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:17:54,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:17:54,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=11.55 vs. limit=15.0 2023-09-29 03:17:55,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:17:56,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=231880.0, ans=0.125 2023-09-29 03:18:00,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:18:00,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:18:02,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:18:03,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 03:18:03,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:18:05,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:07,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 03:18:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 03:18:11,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:18:11,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 03:18:12,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:18:14,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:18:14,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 03:18:17,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:18:17,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:24,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:18:26,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:27,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 03:18:27,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 03:18:28,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:18:32,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:18:34,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 03:18:35,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:18:42,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:18:51,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:18:51,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:18:52,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 03:18:56,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:18:56,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 03:18:56,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:18:58,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:19:01,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=232080.0, ans=0.2 2023-09-29 03:19:02,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:19:04,276 INFO [train.py:1039] (3/4) Epoch 7, batch 2950, loss[loss=0.2243, simple_loss=0.2859, pruned_loss=0.08133, over 23310.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2907, pruned_loss=0.08098, over 4717083.08 frames. ], batch size: 93, lr: 1.47e-02, grad_scale: 16.0 2023-09-29 03:19:04,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 03:19:05,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:05,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:07,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:09,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:19:10,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 03:19:11,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 03:19:12,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:19:12,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:19:19,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:22,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:24,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:19:25,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:29,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:19:29,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:19:30,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:19:32,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:19:35,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 03:19:39,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 03:19:40,649 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 03:19:40,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:19:42,337 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 03:19:43,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 03:19:43,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:19:45,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:19:45,781 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 03:19:45,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:19:50,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 03:19:50,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:19:51,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:19:54,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:57,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:19:58,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:19:58,545 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 03:19:58,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:19:58,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 03:20:06,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:08,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:09,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 03:20:09,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:20:11,823 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.080e+02 2.429e+02 2.872e+02 4.397e+02, threshold=4.858e+02, percent-clipped=0.0 2023-09-29 03:20:12,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 03:20:15,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:16,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:20:16,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:20:18,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:20:19,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:20:21,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:20:23,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:23,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:20:23,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:20:23,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:20:24,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:20:26,225 INFO [train.py:1039] (3/4) Epoch 7, batch 3000, loss[loss=0.2245, simple_loss=0.281, pruned_loss=0.08402, over 23635.00 frames. ], tot_loss[loss=0.226, simple_loss=0.2912, pruned_loss=0.08045, over 4713920.28 frames. ], batch size: 120, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:20:26,226 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 03:20:40,696 INFO [train.py:1071] (3/4) Epoch 7, validation: loss=0.3621, simple_loss=0.3045, pruned_loss=0.2099, over 1125622.00 frames. 2023-09-29 03:20:40,697 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 03:20:40,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:40,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 03:20:41,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=232480.0, ans=0.125 2023-09-29 03:20:42,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:20:45,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:20:47,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:20:50,575 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 03:20:50,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 03:20:52,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:20:54,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:20:54,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 03:20:54,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:01,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:21:08,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=232546.66666666666, ans=0.1 2023-09-29 03:21:11,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:21:16,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=232613.33333333334, ans=0.1 2023-09-29 03:21:18,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 03:21:18,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:21:21,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:21:23,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:21:23,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:21:24,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:24,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 03:21:28,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 03:21:30,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:21:30,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:21:31,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:21:31,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:33,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:33,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:21:36,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:21:38,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:21:38,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:21:40,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:21:42,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 03:21:43,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:21:43,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:21:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:21:46,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:46,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:21:48,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 03:21:48,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 03:21:48,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:21:50,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 03:21:50,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:21:51,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 03:21:51,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=232746.66666666666, ans=0.05 2023-09-29 03:21:53,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=232746.66666666666, ans=0.0 2023-09-29 03:21:55,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:21:57,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:21:57,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 03:21:57,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 03:21:57,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:21:58,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:22:00,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:22:01,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:22:01,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:01,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:22:02,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=232813.33333333334, ans=0.125 2023-09-29 03:22:03,354 INFO [train.py:1039] (3/4) Epoch 7, batch 3050, loss[loss=0.3367, simple_loss=0.3656, pruned_loss=0.1539, over 19728.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2925, pruned_loss=0.08156, over 4698816.90 frames. ], batch size: 388, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:22:05,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 03:22:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:11,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:11,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:22:14,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 03:22:21,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=232880.0, ans=0.0 2023-09-29 03:22:24,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 03:22:25,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 03:22:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:31,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:22:31,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=232880.0, ans=0.125 2023-09-29 03:22:35,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:35,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:36,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:39,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:22:39,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:22:39,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:41,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:22:41,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:22:42,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:44,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:22:44,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=232946.66666666666, ans=0.2 2023-09-29 03:22:46,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:22:46,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 03:22:48,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:22:48,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:22:52,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:22:53,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:22:53,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:22:54,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:00,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:23:00,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:09,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:09,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:09,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:23:11,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.044e+02 2.286e+02 2.664e+02 4.744e+02, threshold=4.572e+02, percent-clipped=0.0 2023-09-29 03:23:11,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:11,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:23:12,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:23:14,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 03:23:14,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:23:15,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:17,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 03:23:20,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:25,321 INFO [train.py:1039] (3/4) Epoch 7, batch 3100, loss[loss=0.2019, simple_loss=0.2742, pruned_loss=0.06478, over 24490.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.291, pruned_loss=0.08035, over 4710226.00 frames. ], batch size: 63, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:23:27,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:23:27,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:23:30,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:23:31,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 03:23:35,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 03:23:36,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 03:23:38,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:23:43,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:23:43,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:45,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:23:48,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:23:53,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 03:23:58,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:23:59,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:23:59,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:23:59,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:01,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:24:02,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=233280.0, ans=0.0 2023-09-29 03:24:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:24:04,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 03:24:04,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:24:05,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:07,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 03:24:09,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:24:12,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=233280.0, ans=0.125 2023-09-29 03:24:13,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:24:14,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 03:24:15,109 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:24:16,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 03:24:18,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:18,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:24:20,367 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:24:21,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:21,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:21,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:24:22,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=233346.66666666666, ans=0.0 2023-09-29 03:24:23,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:24:23,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:24:24,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:24:26,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:24:26,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:26,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:24:29,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:24:29,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 03:24:32,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:24:33,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 03:24:33,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:24:33,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:35,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 03:24:35,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=233413.33333333334, ans=0.125 2023-09-29 03:24:39,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=233413.33333333334, ans=0.125 2023-09-29 03:24:47,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 03:24:48,343 INFO [train.py:1039] (3/4) Epoch 7, batch 3150, loss[loss=0.2221, simple_loss=0.2812, pruned_loss=0.08148, over 23471.00 frames. ], tot_loss[loss=0.2262, simple_loss=0.2907, pruned_loss=0.08086, over 4709639.64 frames. ], batch size: 134, lr: 1.46e-02, grad_scale: 8.0 2023-09-29 03:24:50,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:50,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:24:51,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:24:51,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:24:53,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 03:24:54,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=233480.0, ans=0.2 2023-09-29 03:24:55,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:24:55,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:24:57,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 03:24:58,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:00,812 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 03:25:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 03:25:04,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:05,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 03:25:06,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 03:25:08,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 03:25:08,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 03:25:08,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 03:25:08,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:08,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:08,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:25:10,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 03:25:14,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:14,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:25:15,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:25:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 03:25:23,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:25:26,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:25:27,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:25:27,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 03:25:31,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 03:25:32,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:25:32,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:25:32,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:25:33,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:33,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:25:35,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:25:36,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:25:38,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 03:25:39,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:25:39,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:41,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:25:41,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:25:42,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 03:25:42,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:44,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 03:25:44,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:44,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 03:25:46,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 03:25:48,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:25:49,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:25:51,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 03:25:53,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 03:25:53,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:25:56,378 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 2.159e+02 2.421e+02 2.808e+02 3.931e+02, threshold=4.841e+02, percent-clipped=0.0 2023-09-29 03:25:56,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:25:58,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:25:58,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:25:58,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=233746.66666666666, ans=0.0 2023-09-29 03:26:01,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=233746.66666666666, ans=0.125 2023-09-29 03:26:04,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:26:04,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:07,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 03:26:08,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=7.33 vs. limit=12.0 2023-09-29 03:26:10,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.72 vs. limit=15.0 2023-09-29 03:26:11,313 INFO [train.py:1039] (3/4) Epoch 7, batch 3200, loss[loss=0.2321, simple_loss=0.2914, pruned_loss=0.08634, over 23255.00 frames. ], tot_loss[loss=0.2252, simple_loss=0.2899, pruned_loss=0.08029, over 4716033.96 frames. ], batch size: 93, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:26:14,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:26:14,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 03:26:18,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:20,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:26:20,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 03:26:24,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:26:27,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:26:30,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:26:30,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=233880.0, ans=0.125 2023-09-29 03:26:39,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:26:48,876 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.49 vs. limit=22.5 2023-09-29 03:26:51,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 03:26:51,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:26:54,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 03:26:54,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=233946.66666666666, ans=0.09899494936611666 2023-09-29 03:26:56,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:27:01,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:27:01,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:27:01,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:27:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 03:27:06,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 03:27:06,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=234013.33333333334, ans=0.0 2023-09-29 03:27:07,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 03:27:10,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 03:27:12,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:27:19,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:19,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:27:19,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:20,908 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 03:27:20,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:27:22,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=234080.0, ans=0.2 2023-09-29 03:27:26,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:26,722 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.83 vs. limit=15.0 2023-09-29 03:27:27,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 03:27:28,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 03:27:30,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 03:27:32,180 INFO [train.py:1039] (3/4) Epoch 7, batch 3250, loss[loss=0.2316, simple_loss=0.2874, pruned_loss=0.08784, over 23745.00 frames. ], tot_loss[loss=0.2258, simple_loss=0.2901, pruned_loss=0.08073, over 4712785.62 frames. ], batch size: 212, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:27:32,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 03:27:33,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:27:34,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=234146.66666666666, ans=0.125 2023-09-29 03:27:36,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:27:36,985 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 03:27:38,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:27:38,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:27:40,053 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 03:27:43,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:27:45,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=234146.66666666666, ans=0.125 2023-09-29 03:27:46,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:27:52,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=234213.33333333334, ans=0.125 2023-09-29 03:27:53,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:27:53,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 03:27:53,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:27:54,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:27:54,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:27:56,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:27:56,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:27:56,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=234213.33333333334, ans=0.125 2023-09-29 03:28:00,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:28:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:02,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:04,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:06,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=234280.0, ans=0.07 2023-09-29 03:28:08,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:28:09,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:10,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:28:12,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:28:12,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:28:12,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:12,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=234280.0, ans=0.0 2023-09-29 03:28:17,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 03:28:17,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:28:17,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:28:18,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:18,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:28:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:28:34,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=234346.66666666666, ans=0.0 2023-09-29 03:28:35,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:35,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:35,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 03:28:35,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:28:35,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:28:37,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:28:38,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.002e+02 2.212e+02 2.720e+02 4.684e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 03:28:38,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 03:28:38,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 03:28:39,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:28:41,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:28:42,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:44,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 03:28:44,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:28:48,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:28:48,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:28:50,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 03:28:50,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:28:52,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:28:52,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 03:28:53,363 INFO [train.py:1039] (3/4) Epoch 7, batch 3300, loss[loss=0.2024, simple_loss=0.2789, pruned_loss=0.06294, over 24639.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.2913, pruned_loss=0.08095, over 4716464.34 frames. ], batch size: 65, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:28:55,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:28:55,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 03:28:58,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 03:29:00,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 03:29:00,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:00,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=234480.0, ans=0.125 2023-09-29 03:29:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:29:05,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:29:06,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:07,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 03:29:08,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:29:10,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:12,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:29:16,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 03:29:18,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=15.0 2023-09-29 03:29:18,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:20,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:20,515 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 03:29:23,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:29:23,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:29:23,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:29:23,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:29:23,626 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 03:29:28,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:28,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:29:30,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:30,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 03:29:31,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 03:29:31,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:32,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=234613.33333333334, ans=0.125 2023-09-29 03:29:34,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:29:35,014 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 03:29:36,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=234613.33333333334, ans=0.125 2023-09-29 03:29:38,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 03:29:38,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:29:42,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 03:29:45,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:29:48,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:29:48,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:29:53,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:29:53,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:29:53,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:29:54,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:29:56,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:29:58,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:29:58,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:29:59,976 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 03:30:00,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 03:30:03,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:30:04,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:04,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:06,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:30:06,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:07,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:30:07,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:07,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:30:09,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:30:11,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:30:15,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 03:30:15,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:16,645 INFO [train.py:1039] (3/4) Epoch 7, batch 3350, loss[loss=0.1982, simple_loss=0.272, pruned_loss=0.06214, over 24648.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2913, pruned_loss=0.08074, over 4723344.09 frames. ], batch size: 65, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:30:16,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:18,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:30:18,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:30:19,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=234813.33333333334, ans=0.2 2023-09-29 03:30:20,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:23,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:30:23,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:26,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:30:28,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:28,783 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.33 vs. limit=15.0 2023-09-29 03:30:29,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:30:31,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:31,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=234880.0, ans=0.125 2023-09-29 03:30:34,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:30:34,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:30:36,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 03:30:38,122 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 03:30:39,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:30:43,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 03:30:43,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 03:30:46,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:30:46,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:30:46,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:30:47,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 03:30:47,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:47,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:30:49,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:51,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:52,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:30:52,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:30:56,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:30:59,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:30:59,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:02,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:31:04,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:07,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:07,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:10,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:12,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 03:31:12,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:31:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 03:31:14,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:31:15,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 03:31:17,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:17,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:31:17,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=235013.33333333334, ans=0.04949747468305833 2023-09-29 03:31:20,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=235080.0, ans=0.125 2023-09-29 03:31:23,276 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.937e+02 2.206e+02 2.498e+02 3.654e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 03:31:27,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:27,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 03:31:28,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:29,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:31:30,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:31:35,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:38,485 INFO [train.py:1039] (3/4) Epoch 7, batch 3400, loss[loss=0.1986, simple_loss=0.271, pruned_loss=0.06304, over 24457.00 frames. ], tot_loss[loss=0.2272, simple_loss=0.2925, pruned_loss=0.08091, over 4742446.10 frames. ], batch size: 58, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:31:38,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 03:31:38,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:31:38,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:31:40,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:31:42,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 03:31:43,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:31:43,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 03:31:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:31:45,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:31:48,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:31:48,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 03:31:51,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 03:31:52,015 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 03:31:52,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:31:56,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:31:56,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:31:56,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:31:58,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:32:03,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:05,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 03:32:12,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:32:14,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:15,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:17,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 03:32:17,758 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.34 vs. limit=15.0 2023-09-29 03:32:22,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:32:25,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 03:32:30,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:32:31,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 03:32:31,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:32:33,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:32:33,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:32:35,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:32:38,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:32:43,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:32:43,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:32:51,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:32:52,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 03:32:52,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=235413.33333333334, ans=0.1 2023-09-29 03:32:57,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:33:00,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 03:33:01,932 INFO [train.py:1039] (3/4) Epoch 7, batch 3450, loss[loss=0.2016, simple_loss=0.2702, pruned_loss=0.06648, over 24306.00 frames. ], tot_loss[loss=0.2264, simple_loss=0.2916, pruned_loss=0.08054, over 4749453.61 frames. ], batch size: 56, lr: 1.46e-02, grad_scale: 16.0 2023-09-29 03:33:06,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 03:33:08,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:09,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:33:10,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 03:33:10,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:33:13,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:33:18,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:33:18,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:20,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:33:20,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:20,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=235546.66666666666, ans=0.125 2023-09-29 03:33:23,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:33:30,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 03:33:35,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 03:33:35,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 03:33:35,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:33:38,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:33:42,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 03:33:42,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=235613.33333333334, ans=0.125 2023-09-29 03:33:43,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:33:48,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:33:50,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:33:50,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:33:51,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:33:54,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 03:33:54,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:33:55,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=235680.0, ans=0.125 2023-09-29 03:33:57,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:34:00,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:03,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 03:34:05,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=235680.0, ans=0.0 2023-09-29 03:34:06,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:34:09,684 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 1.982e+02 2.278e+02 2.656e+02 4.314e+02, threshold=4.555e+02, percent-clipped=0.0 2023-09-29 03:34:11,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:34:13,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:16,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:16,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=235746.66666666666, ans=0.1 2023-09-29 03:34:21,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:21,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:34:21,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:34:23,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:34:24,935 INFO [train.py:1039] (3/4) Epoch 7, batch 3500, loss[loss=0.2371, simple_loss=0.2612, pruned_loss=0.1065, over 19376.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2903, pruned_loss=0.0803, over 4733306.28 frames. ], batch size: 389, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:34:26,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:28,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=235813.33333333334, ans=0.0 2023-09-29 03:34:30,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:34:30,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 03:34:32,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 03:34:35,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten.whitening_limit, batch_count=235813.33333333334, ans=22.5 2023-09-29 03:34:35,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:34:37,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:34:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 03:34:42,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:34:44,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:34:44,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:34:44,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:34:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:34:47,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:47,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:34:47,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 03:34:49,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:50,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:34:52,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:34:56,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:34:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 03:34:57,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:34:59,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=235946.66666666666, ans=0.125 2023-09-29 03:35:01,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:35:04,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:35:05,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:07,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:35:09,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:12,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 03:35:12,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 03:35:14,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 03:35:14,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:35:15,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:16,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:17,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:35:21,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:35:21,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:35:27,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:35:28,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 03:35:28,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 03:35:28,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:35:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:32,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:35,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:38,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 03:35:38,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:35:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:35:42,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 03:35:44,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 03:35:47,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:35:47,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:35:47,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:35:47,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:35:49,010 INFO [train.py:1039] (3/4) Epoch 7, batch 3550, loss[loss=0.2412, simple_loss=0.3039, pruned_loss=0.08926, over 23346.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.289, pruned_loss=0.07941, over 4732132.16 frames. ], batch size: 93, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:35:50,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:36:00,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:03,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 03:36:07,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:36:08,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:08,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=236213.33333333334, ans=0.125 2023-09-29 03:36:10,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:36:10,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:36:14,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:15,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:36:15,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:15,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:36:16,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:36:19,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=236213.33333333334, ans=0.0 2023-09-29 03:36:22,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:36:22,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:36:23,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:23,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:36:23,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:36:23,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 03:36:25,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:26,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:36:28,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 03:36:32,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=236280.0, ans=6.0 2023-09-29 03:36:34,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:34,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:36:36,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:36:38,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 03:36:40,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:36:41,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 03:36:41,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:36:43,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:36:43,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:36:46,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 03:36:50,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:54,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:36:56,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 03:36:56,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:36:57,955 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.050e+02 2.243e+02 2.592e+02 3.943e+02, threshold=4.485e+02, percent-clipped=0.0 2023-09-29 03:36:59,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:37:01,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 03:37:07,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 03:37:07,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:08,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:37:08,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=236413.33333333334, ans=0.1 2023-09-29 03:37:10,909 INFO [train.py:1039] (3/4) Epoch 7, batch 3600, loss[loss=0.2237, simple_loss=0.2839, pruned_loss=0.08176, over 23718.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.289, pruned_loss=0.0797, over 4740484.61 frames. ], batch size: 179, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:37:10,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:11,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:37:13,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:37:15,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=236480.0, ans=0.09899494936611666 2023-09-29 03:37:16,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:18,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:18,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:37:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:37:21,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:21,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 03:37:25,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:37:26,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:26,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=236546.66666666666, ans=0.0 2023-09-29 03:37:26,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=236546.66666666666, ans=0.125 2023-09-29 03:37:29,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:32,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:33,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:37:34,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:37:34,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 03:37:34,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:37:35,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.79 vs. limit=10.0 2023-09-29 03:37:38,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:37:38,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:37:41,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:37:42,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:37:42,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:37:45,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 03:37:45,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=236613.33333333334, ans=0.1 2023-09-29 03:37:51,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:37:54,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:37:54,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 03:37:57,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:38:02,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:05,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:09,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=236680.0, ans=0.2 2023-09-29 03:38:11,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:38:12,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:38:12,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 03:38:13,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 03:38:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 03:38:15,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:38:17,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:38:18,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 03:38:20,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:38:20,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:22,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 03:38:23,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 03:38:27,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:38:28,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 03:38:32,939 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=15.0 2023-09-29 03:38:34,830 INFO [train.py:1039] (3/4) Epoch 7, batch 3650, loss[loss=0.2595, simple_loss=0.315, pruned_loss=0.1021, over 23765.00 frames. ], tot_loss[loss=0.2259, simple_loss=0.2905, pruned_loss=0.08069, over 4734974.16 frames. ], batch size: 212, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:38:34,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 03:38:36,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:38:39,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 03:38:42,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 03:38:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:38:46,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:38:47,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:38:51,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 03:38:51,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:38:51,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 03:38:51,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:38:51,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=236880.0, ans=0.125 2023-09-29 03:38:53,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:38:53,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 03:38:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:38:55,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:38:57,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:38:58,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:39:00,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 03:39:02,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 03:39:04,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:39:05,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 03:39:05,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:05,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:39:06,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=236880.0, ans=0.07 2023-09-29 03:39:10,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=236946.66666666666, ans=0.125 2023-09-29 03:39:10,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=236946.66666666666, ans=0.125 2023-09-29 03:39:13,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:39:13,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=236946.66666666666, ans=0.1 2023-09-29 03:39:16,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:16,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:39:17,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:39:19,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:39:21,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:39:24,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:26,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:26,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:39:28,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:39:28,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:39:28,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=237013.33333333334, ans=0.1 2023-09-29 03:39:30,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:36,526 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 03:39:40,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=237080.0, ans=0.125 2023-09-29 03:39:41,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:39:41,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:39:43,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:39:43,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:44,557 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.001e+02 2.357e+02 2.750e+02 4.366e+02, threshold=4.713e+02, percent-clipped=0.0 2023-09-29 03:39:44,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:39:46,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:47,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 03:39:47,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:51,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:39:52,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:39:52,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:39:56,981 INFO [train.py:1039] (3/4) Epoch 7, batch 3700, loss[loss=0.2105, simple_loss=0.2929, pruned_loss=0.06404, over 24641.00 frames. ], tot_loss[loss=0.2269, simple_loss=0.2916, pruned_loss=0.08109, over 4737336.68 frames. ], batch size: 73, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:39:57,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:39:57,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 03:39:57,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:39:57,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:39:58,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 03:40:01,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=237146.66666666666, ans=0.125 2023-09-29 03:40:02,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 03:40:04,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:06,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:07,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:40:07,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:40:09,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:40:10,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:12,735 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 03:40:20,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:40:21,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:40:23,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:40:23,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 03:40:23,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:29,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:29,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=237280.0, ans=0.125 2023-09-29 03:40:30,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 03:40:30,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:31,418 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.15 vs. limit=15.0 2023-09-29 03:40:32,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:40:35,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:40:35,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:40:37,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:40:42,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:40:42,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 03:40:42,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:40:43,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 03:40:50,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:40:50,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:40:53,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:40:53,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=237346.66666666666, ans=0.0 2023-09-29 03:40:55,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 03:40:58,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:40:58,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:40:58,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:40:58,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:01,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:41:02,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 03:41:03,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 03:41:04,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:41:04,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:07,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:41:07,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:41:11,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:41:14,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:41:17,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:41:18,640 INFO [train.py:1039] (3/4) Epoch 7, batch 3750, loss[loss=0.2021, simple_loss=0.27, pruned_loss=0.06713, over 18988.00 frames. ], tot_loss[loss=0.2278, simple_loss=0.2929, pruned_loss=0.08134, over 4738090.98 frames. ], batch size: 41, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:41:18,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 03:41:20,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:41:21,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.03 vs. limit=10.0 2023-09-29 03:41:23,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:41:23,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 03:41:25,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:41:27,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:27,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:41:30,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:41:33,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:37,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 03:41:38,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=237546.66666666666, ans=0.125 2023-09-29 03:41:39,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:41:40,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:41:43,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:41:45,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 03:41:47,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:41:48,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:49,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:41:54,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 03:41:57,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 03:41:59,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:41:59,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:42:01,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:06,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:07,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 03:42:10,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 03:42:14,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:17,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:42:19,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:42:20,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:42:21,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=237680.0, ans=0.1 2023-09-29 03:42:27,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.086e+02 2.277e+02 2.555e+02 3.671e+02, threshold=4.554e+02, percent-clipped=0.0 2023-09-29 03:42:27,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 03:42:29,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 03:42:32,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:42:32,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:42:34,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=237746.66666666666, ans=0.125 2023-09-29 03:42:35,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 03:42:40,461 INFO [train.py:1039] (3/4) Epoch 7, batch 3800, loss[loss=0.2229, simple_loss=0.276, pruned_loss=0.08495, over 23811.00 frames. ], tot_loss[loss=0.2284, simple_loss=0.2925, pruned_loss=0.08216, over 4725722.82 frames. ], batch size: 195, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:42:45,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:42:49,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:49,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 03:42:51,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 03:42:52,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=237813.33333333334, ans=0.125 2023-09-29 03:42:53,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:42:55,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:42:55,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:42:55,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=237880.0, ans=0.1 2023-09-29 03:42:56,168 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.56 vs. limit=15.0 2023-09-29 03:42:56,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 03:42:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:42:58,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:42:58,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=237880.0, ans=0.0 2023-09-29 03:43:01,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:43:01,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:43:01,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:03,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 03:43:06,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 03:43:08,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:43:09,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:12,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=237946.66666666666, ans=0.0 2023-09-29 03:43:13,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:43:14,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:43:16,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:43:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:19,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:20,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:43:26,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:43:26,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 03:43:27,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:32,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:43:32,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=238013.33333333334, ans=0.0 2023-09-29 03:43:41,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:43:44,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 03:43:46,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 03:43:48,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:43:49,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:43:51,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:52,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 03:43:56,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 03:43:56,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 03:43:56,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:43:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:44:03,070 INFO [train.py:1039] (3/4) Epoch 7, batch 3850, loss[loss=0.2298, simple_loss=0.2988, pruned_loss=0.08041, over 24032.00 frames. ], tot_loss[loss=0.2271, simple_loss=0.2911, pruned_loss=0.08151, over 4725694.25 frames. ], batch size: 86, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:44:03,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:44:04,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:44:11,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:44:11,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 03:44:12,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:44:15,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:18,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 03:44:20,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:23,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 03:44:24,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 03:44:28,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=238213.33333333334, ans=0.0 2023-09-29 03:44:29,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:31,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:44:34,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:35,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:44:37,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.66 vs. limit=15.0 2023-09-29 03:44:39,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:39,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:44:40,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:44:40,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:44:41,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:44:44,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:44,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:44:46,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 03:44:46,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 03:44:48,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:44:48,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:52,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:44:52,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 03:44:55,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=238346.66666666666, ans=0.07 2023-09-29 03:44:56,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 03:44:58,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:44:59,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 03:45:01,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 03:45:07,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:08,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:45:13,053 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.089e+02 2.371e+02 2.859e+02 5.421e+02, threshold=4.742e+02, percent-clipped=3.0 2023-09-29 03:45:13,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:13,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 03:45:15,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 03:45:18,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:18,603 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.02 vs. limit=15.0 2023-09-29 03:45:19,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:19,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:45:19,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 03:45:21,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:23,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:45:23,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 03:45:24,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:45:26,770 INFO [train.py:1039] (3/4) Epoch 7, batch 3900, loss[loss=0.1956, simple_loss=0.2338, pruned_loss=0.07871, over 19229.00 frames. ], tot_loss[loss=0.2257, simple_loss=0.2896, pruned_loss=0.08093, over 4709958.60 frames. ], batch size: 388, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:45:28,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 03:45:28,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:28,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:29,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.71 vs. limit=15.0 2023-09-29 03:45:29,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:45:30,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:31,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:45:33,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:45:33,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:45:33,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:45:33,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 03:45:34,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:39,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:39,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:41,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:45:41,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:45:43,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:45:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:46,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:45:47,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 03:45:47,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:45:50,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 03:45:50,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:45:50,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 03:45:52,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 03:45:57,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:45:59,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:46:00,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:46:01,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:03,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=238613.33333333334, ans=0.0 2023-09-29 03:46:04,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=238613.33333333334, ans=0.0 2023-09-29 03:46:05,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:46:07,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:46:12,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:46:12,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:13,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:46:19,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:20,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:46:25,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 03:46:26,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:46:37,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:46:39,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:41,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 03:46:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 03:46:41,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 03:46:42,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 03:46:44,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:46:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 03:46:48,673 INFO [train.py:1039] (3/4) Epoch 7, batch 3950, loss[loss=0.2248, simple_loss=0.3028, pruned_loss=0.07336, over 24436.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.289, pruned_loss=0.07974, over 4708085.77 frames. ], batch size: 69, lr: 1.45e-02, grad_scale: 16.0 2023-09-29 03:46:52,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:46:54,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 03:46:55,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:46:57,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:47:00,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:47:02,258 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:47:04,017 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 03:47:06,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:06,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 03:47:07,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 03:47:07,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:11,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:11,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:47:11,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:47:14,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 03:47:17,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:47:17,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:47:17,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:47:18,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:47:18,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 03:47:27,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=238946.66666666666, ans=0.04949747468305833 2023-09-29 03:47:28,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=238946.66666666666, ans=0.125 2023-09-29 03:47:30,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:47:31,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:47:36,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 03:47:44,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 03:47:44,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 03:47:45,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:47:45,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:47:51,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:47:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 03:47:52,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:47:53,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:47:53,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 03:47:57,667 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 2.025e+02 2.218e+02 2.611e+02 3.934e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 03:47:57,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:47:58,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:48:01,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 03:48:01,973 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 03:48:11,837 INFO [train.py:1039] (3/4) Epoch 7, batch 4000, loss[loss=0.2121, simple_loss=0.2847, pruned_loss=0.06978, over 24470.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2893, pruned_loss=0.07945, over 4714720.49 frames. ], batch size: 63, lr: 1.45e-02, grad_scale: 32.0 2023-09-29 03:48:12,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:18,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:24,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:24,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:48:25,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=239146.66666666666, ans=0.125 2023-09-29 03:48:26,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:48:26,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 03:48:27,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:48:27,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 03:48:27,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:48:27,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 03:48:30,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:48:33,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:48:33,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:48:34,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:48:34,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=239213.33333333334, ans=0.0 2023-09-29 03:48:36,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:48:36,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 03:48:39,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:48:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 03:48:41,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:48:43,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:48:46,644 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 03:48:48,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:48:49,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:48:56,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=239280.0, ans=0.0 2023-09-29 03:48:57,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 03:48:57,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:48:59,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=239346.66666666666, ans=0.0 2023-09-29 03:49:00,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:49:02,084 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 03:49:03,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:49:03,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 03:49:03,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:05,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:06,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:49:08,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:49:09,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:49:09,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:49:11,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 03:49:11,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:49:12,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=239346.66666666666, ans=0.07 2023-09-29 03:49:13,305 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 03:49:18,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:49:20,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 03:49:22,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:49:22,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:23,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:49:25,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:49:30,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:49:30,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 03:49:31,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=239413.33333333334, ans=0.0 2023-09-29 03:49:33,626 INFO [train.py:1039] (3/4) Epoch 7, batch 4050, loss[loss=0.2621, simple_loss=0.3048, pruned_loss=0.1097, over 22884.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2897, pruned_loss=0.07979, over 4723657.62 frames. ], batch size: 322, lr: 1.44e-02, grad_scale: 32.0 2023-09-29 03:49:33,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:49:33,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:49:33,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 03:49:36,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:49:38,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:42,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:49:46,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:49:46,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 03:49:50,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:49:51,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:49:56,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:49:57,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=239546.66666666666, ans=0.2 2023-09-29 03:49:58,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:50:00,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 03:50:03,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 03:50:03,793 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 03:50:05,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:50:10,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=239613.33333333334, ans=10.0 2023-09-29 03:50:12,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 03:50:13,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:15,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:18,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:50:19,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:50:19,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:50:20,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.52 vs. limit=15.0 2023-09-29 03:50:23,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:50:26,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 03:50:26,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:50:28,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:30,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 03:50:34,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:50:41,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 03:50:42,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=239746.66666666666, ans=0.125 2023-09-29 03:50:43,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:50:43,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:50:44,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.27 vs. limit=15.0 2023-09-29 03:50:44,863 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.660e+02 1.977e+02 2.189e+02 2.469e+02 3.390e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 03:50:47,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 03:50:47,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 03:50:47,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:50:51,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:50:52,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:50:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:50:56,009 INFO [train.py:1039] (3/4) Epoch 7, batch 4100, loss[loss=0.2288, simple_loss=0.3032, pruned_loss=0.07726, over 24428.00 frames. ], tot_loss[loss=0.2244, simple_loss=0.2902, pruned_loss=0.07927, over 4739003.38 frames. ], batch size: 69, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:51:01,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 03:51:02,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 03:51:04,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 03:51:07,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 03:51:07,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:07,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=239813.33333333334, ans=0.0 2023-09-29 03:51:09,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:09,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:51:10,565 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 03:51:14,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:15,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:51:15,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:51:16,810 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.23 vs. limit=6.0 2023-09-29 03:51:17,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:51:18,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 03:51:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:51:21,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:51:21,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 03:51:22,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=239880.0, ans=0.125 2023-09-29 03:51:23,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:23,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:51:23,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:23,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:51:23,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 03:51:26,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:28,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 03:51:30,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:51:32,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:51:32,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 03:51:33,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:51:33,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:51:34,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 03:51:37,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 03:51:37,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=239946.66666666666, ans=0.125 2023-09-29 03:51:38,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 03:51:40,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 03:51:47,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.50 vs. limit=15.0 2023-09-29 03:51:48,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 03:51:48,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:51:50,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:51:53,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:51:56,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:01,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:02,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:52:09,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:09,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:52:13,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 03:52:13,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=240080.0, ans=0.125 2023-09-29 03:52:16,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:52:21,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:52:21,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:52:22,667 INFO [train.py:1039] (3/4) Epoch 7, batch 4150, loss[loss=0.2138, simple_loss=0.2881, pruned_loss=0.0698, over 24647.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2895, pruned_loss=0.07997, over 4725513.13 frames. ], batch size: 73, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:52:22,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:52:22,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:26,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 03:52:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:27,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 03:52:29,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 03:52:29,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 03:52:32,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:52:32,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=240146.66666666666, ans=0.1 2023-09-29 03:52:38,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:52:38,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:41,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=240213.33333333334, ans=0.0 2023-09-29 03:52:43,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:52:45,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:52:45,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 03:52:46,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:52:48,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 03:52:49,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 03:52:50,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=240213.33333333334, ans=0.125 2023-09-29 03:52:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:52:56,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:52:58,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 03:53:01,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 03:53:01,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:53:03,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 03:53:03,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:53:03,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:05,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.90 vs. limit=22.5 2023-09-29 03:53:06,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:06,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:10,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 03:53:13,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:16,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:53:16,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=240346.66666666666, ans=0.0 2023-09-29 03:53:18,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 03:53:18,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:53:18,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.82 vs. limit=15.0 2023-09-29 03:53:19,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 03:53:21,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:53:22,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:53:24,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:24,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 03:53:24,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:24,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 03:53:26,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 03:53:27,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=13.07 vs. limit=15.0 2023-09-29 03:53:29,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 03:53:29,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:29,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 03:53:29,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 03:53:31,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 03:53:32,547 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.682e+02 2.142e+02 2.391e+02 2.660e+02 4.088e+02, threshold=4.782e+02, percent-clipped=0.0 2023-09-29 03:53:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:53:32,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 03:53:32,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:53:33,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.09 vs. limit=12.0 2023-09-29 03:53:35,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:53:35,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 03:53:36,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 03:53:38,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=240413.33333333334, ans=0.1 2023-09-29 03:53:42,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:53:42,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=240480.0, ans=0.95 2023-09-29 03:53:44,027 INFO [train.py:1039] (3/4) Epoch 7, batch 4200, loss[loss=0.2229, simple_loss=0.2864, pruned_loss=0.07968, over 24428.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2884, pruned_loss=0.07984, over 4708393.17 frames. ], batch size: 58, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:53:44,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 03:53:45,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:53:48,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:53:49,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=240480.0, ans=0.125 2023-09-29 03:53:51,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 03:53:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:51,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:53:51,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=240480.0, ans=0.125 2023-09-29 03:53:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 03:53:57,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 03:53:57,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:53:59,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:01,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:54:05,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 03:54:09,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:09,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:10,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 03:54:10,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 03:54:11,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:12,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:54:12,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 03:54:15,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 03:54:18,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 03:54:18,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:54:21,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 03:54:22,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=240613.33333333334, ans=0.125 2023-09-29 03:54:23,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:54:27,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:54:30,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:54:32,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:54:32,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 03:54:33,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:33,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:54:38,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 03:54:39,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:54:45,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:54:49,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 03:54:50,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=240746.66666666666, ans=0.0 2023-09-29 03:54:52,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:54:56,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 03:54:56,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:54:59,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 03:55:05,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 03:55:06,373 INFO [train.py:1039] (3/4) Epoch 7, batch 4250, loss[loss=0.2315, simple_loss=0.2988, pruned_loss=0.08214, over 24487.00 frames. ], tot_loss[loss=0.2237, simple_loss=0.2883, pruned_loss=0.07955, over 4719363.47 frames. ], batch size: 63, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:55:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 03:55:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 03:55:12,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:17,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 03:55:19,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 03:55:19,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:55:21,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:24,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:29,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:31,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:33,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 03:55:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:55:34,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:37,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:40,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:55:42,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:55:44,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 03:55:45,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=240946.66666666666, ans=0.0 2023-09-29 03:55:47,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 03:55:47,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:49,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:55:49,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:55:49,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=240946.66666666666, ans=0.0 2023-09-29 03:55:49,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=240946.66666666666, ans=0.125 2023-09-29 03:55:51,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 03:55:51,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:55:52,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.00 vs. limit=15.0 2023-09-29 03:55:52,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:55:56,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 03:55:57,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 03:56:00,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:02,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:03,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 03:56:03,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 03:56:06,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 03:56:07,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 03:56:09,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:56:09,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=15.0 2023-09-29 03:56:12,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:12,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:56:13,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 03:56:15,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 03:56:16,641 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.803e+02 2.211e+02 2.441e+02 2.743e+02 4.963e+02, threshold=4.882e+02, percent-clipped=1.0 2023-09-29 03:56:16,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 03:56:19,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:56:23,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:25,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 03:56:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:56:27,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.61 vs. limit=6.0 2023-09-29 03:56:28,516 INFO [train.py:1039] (3/4) Epoch 7, batch 4300, loss[loss=0.2112, simple_loss=0.281, pruned_loss=0.0707, over 24611.00 frames. ], tot_loss[loss=0.2236, simple_loss=0.288, pruned_loss=0.07963, over 4728599.17 frames. ], batch size: 60, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:56:28,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:30,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:56:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:56:30,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 03:56:31,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:32,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=241146.66666666666, ans=0.05 2023-09-29 03:56:37,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:56:37,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:56:43,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:56:50,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:56:50,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 03:56:51,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 03:56:54,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:56:54,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 03:56:54,683 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 03:57:01,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 03:57:01,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:04,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 03:57:04,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 03:57:05,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 03:57:07,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 03:57:09,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 03:57:11,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 03:57:11,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 03:57:12,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 03:57:14,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:16,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:57:16,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 03:57:17,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 03:57:20,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:57:23,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:23,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 03:57:23,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:24,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 03:57:24,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 03:57:24,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 03:57:25,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 03:57:26,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:57:27,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 03:57:27,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 03:57:29,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:31,593 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.94 vs. limit=15.0 2023-09-29 03:57:32,862 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 03:57:32,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 03:57:36,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:36,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:57:39,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 03:57:41,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 03:57:41,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:41,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:57:41,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:42,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 03:57:42,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:57:44,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:57:45,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=241413.33333333334, ans=0.2 2023-09-29 03:57:46,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:57:46,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 03:57:51,828 INFO [train.py:1039] (3/4) Epoch 7, batch 4350, loss[loss=0.2298, simple_loss=0.2861, pruned_loss=0.08671, over 23654.00 frames. ], tot_loss[loss=0.2238, simple_loss=0.2886, pruned_loss=0.0795, over 4724173.46 frames. ], batch size: 135, lr: 1.44e-02, grad_scale: 8.0 2023-09-29 03:57:53,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 03:57:53,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 03:57:58,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:01,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:04,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 03:58:04,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:58:04,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=241480.0, ans=0.125 2023-09-29 03:58:10,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 03:58:13,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:58:15,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 03:58:16,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:58:20,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 03:58:23,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 03:58:25,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 03:58:27,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=241613.33333333334, ans=0.0 2023-09-29 03:58:30,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 03:58:31,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:32,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:37,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:58:39,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 03:58:44,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:58:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 03:58:49,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=241680.0, ans=0.125 2023-09-29 03:58:50,880 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 03:58:52,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:52,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 03:58:53,871 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 03:58:56,001 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 03:58:56,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:56,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:58:57,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 03:58:57,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:58:59,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 03:58:59,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:02,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 03:59:02,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:02,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:02,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:04,152 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.804e+02 2.126e+02 2.357e+02 2.632e+02 4.633e+02, threshold=4.715e+02, percent-clipped=0.0 2023-09-29 03:59:04,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 03:59:04,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.95 vs. limit=15.0 2023-09-29 03:59:05,826 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 03:59:05,833 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 03:59:05,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 03:59:08,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 03:59:09,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 03:59:09,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:10,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 03:59:12,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 03:59:13,468 INFO [train.py:1039] (3/4) Epoch 7, batch 4400, loss[loss=0.2272, simple_loss=0.3063, pruned_loss=0.07405, over 24635.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2895, pruned_loss=0.07985, over 4724242.57 frames. ], batch size: 68, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 03:59:15,118 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 03:59:15,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:19,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:21,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:22,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 03:59:24,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 03:59:24,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 03:59:25,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 03:59:25,830 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 03:59:25,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 03:59:25,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 03:59:29,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 03:59:31,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:32,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:32,606 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 03:59:37,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:37,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 03:59:39,156 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 03:59:42,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 03:59:43,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 03:59:43,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 03:59:43,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:45,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:45,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 03:59:46,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 03:59:48,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 03:59:48,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 03:59:50,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:51,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 03:59:51,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 03:59:53,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 03:59:53,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 03:59:53,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 03:59:55,813 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 03:59:58,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:05,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:00:08,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 04:00:15,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:00:16,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:18,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:00:19,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 04:00:19,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:00:19,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:00:19,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:00:21,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:00:25,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 04:00:29,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 04:00:31,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 04:00:31,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:00:31,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 04:00:34,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:00:36,380 INFO [train.py:1039] (3/4) Epoch 7, batch 4450, loss[loss=0.1835, simple_loss=0.2457, pruned_loss=0.06059, over 17555.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.29, pruned_loss=0.07956, over 4715281.71 frames. ], batch size: 38, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:00:36,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:00:38,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 04:00:43,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:00:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:44,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:00:52,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:00:52,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:00:53,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=242213.33333333334, ans=0.1 2023-09-29 04:00:56,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:00:56,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:01:01,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:01:01,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 04:01:02,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:02,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:02,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:02,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:01:07,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.41 vs. limit=22.5 2023-09-29 04:01:07,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:01:12,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:12,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:14,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:01:14,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:01:16,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:01:20,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=242280.0, ans=0.2 2023-09-29 04:01:21,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:01:22,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 04:01:22,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 04:01:22,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:01:25,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:27,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 04:01:30,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:01:33,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:35,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 04:01:35,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:35,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:35,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:01:35,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:01:39,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:01:41,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:01:42,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 04:01:44,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:01:45,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:01:47,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:01:49,218 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.705e+02 2.081e+02 2.382e+02 2.836e+02 4.315e+02, threshold=4.764e+02, percent-clipped=0.0 2023-09-29 04:01:49,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:01:50,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:01:52,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:01:56,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 04:01:56,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=242413.33333333334, ans=0.125 2023-09-29 04:01:56,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=242413.33333333334, ans=0.04949747468305833 2023-09-29 04:01:57,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:01:59,219 INFO [train.py:1039] (3/4) Epoch 7, batch 4500, loss[loss=0.2083, simple_loss=0.2763, pruned_loss=0.07012, over 24649.00 frames. ], tot_loss[loss=0.2254, simple_loss=0.2905, pruned_loss=0.08011, over 4713859.25 frames. ], batch size: 65, lr: 1.44e-02, grad_scale: 16.0 2023-09-29 04:02:04,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:05,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 04:02:05,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 04:02:06,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=242480.0, ans=0.125 2023-09-29 04:02:08,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:12,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:02:12,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:02:14,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:02:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:02:15,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:15,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:23,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=242546.66666666666, ans=0.1 2023-09-29 04:02:28,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:02:29,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:02:31,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:32,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:02:34,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:02:40,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:02:44,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:02:49,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:02:50,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:02:52,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 04:02:52,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:02:54,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:56,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:02:57,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:02:59,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:02:59,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 04:02:59,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:02:59,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:04,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:03:04,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:03:07,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:10,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:03:10,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:03:13,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 04:03:13,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 04:03:13,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 04:03:19,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 04:03:22,556 INFO [train.py:1039] (3/4) Epoch 7, batch 4550, loss[loss=0.2193, simple_loss=0.2659, pruned_loss=0.08641, over 22636.00 frames. ], tot_loss[loss=0.225, simple_loss=0.289, pruned_loss=0.08051, over 4693978.41 frames. ], batch size: 322, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:03:22,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 04:03:22,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:26,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=242813.33333333334, ans=0.125 2023-09-29 04:03:28,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:29,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:03:31,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:36,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:03:37,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:03:39,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:03:39,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=242880.0, ans=0.0 2023-09-29 04:03:40,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:03:40,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:03:42,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:03:44,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:03:45,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:03:48,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 04:03:49,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 04:03:51,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:03:52,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 04:03:57,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 04:03:57,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:02,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 04:04:04,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:04:07,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:08,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:08,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:04:09,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 04:04:12,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:15,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:15,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:04:17,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:17,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 04:04:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 04:04:18,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:04:19,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 04:04:20,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 04:04:20,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:04:23,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:23,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:25,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:25,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:04:27,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:04:29,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 04:04:29,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=243080.0, ans=0.1 2023-09-29 04:04:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:04:31,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:04:31,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 04:04:31,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:04:31,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 04:04:34,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:04:34,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:04:36,259 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.221e+02 2.599e+02 3.023e+02 5.403e+02, threshold=5.198e+02, percent-clipped=1.0 2023-09-29 04:04:37,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:04:40,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:04:40,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:04:41,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:04:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:04:46,086 INFO [train.py:1039] (3/4) Epoch 7, batch 4600, loss[loss=0.2156, simple_loss=0.2774, pruned_loss=0.07686, over 23355.00 frames. ], tot_loss[loss=0.2241, simple_loss=0.2881, pruned_loss=0.08002, over 4693749.14 frames. ], batch size: 119, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:04:46,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:04:47,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:04:50,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:04:50,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:04:52,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:04:53,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 04:04:55,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:04:59,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:05:01,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:03,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:07,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=243213.33333333334, ans=0.0 2023-09-29 04:05:10,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 04:05:12,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:14,715 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:05:14,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.46 vs. limit=22.5 2023-09-29 04:05:15,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:17,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:05:17,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 04:05:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:05:25,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:05:32,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:32,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:05:32,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=243346.66666666666, ans=0.0 2023-09-29 04:05:35,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:05:39,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 04:05:41,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:05:45,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:46,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:05:49,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:49,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 04:05:49,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=243346.66666666666, ans=0.125 2023-09-29 04:05:49,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.84 vs. limit=15.0 2023-09-29 04:05:50,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:05:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 04:05:50,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:50,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:52,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:05:53,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:05:55,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:05:56,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 04:05:56,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 04:05:56,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 04:05:56,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:05:58,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:05:59,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:01,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:06:07,203 INFO [train.py:1039] (3/4) Epoch 7, batch 4650, loss[loss=0.2202, simple_loss=0.2775, pruned_loss=0.08145, over 23543.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2882, pruned_loss=0.07991, over 4710033.47 frames. ], batch size: 134, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:06:07,792 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:06:12,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:06:12,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=243480.0, ans=0.2 2023-09-29 04:06:16,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:16,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:18,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:06:18,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:06:18,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:06:18,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=243480.0, ans=0.0 2023-09-29 04:06:19,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:06:22,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 04:06:26,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:06:29,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 04:06:29,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:06:29,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 04:06:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:06:31,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 04:06:32,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 04:06:32,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:06:34,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:06:37,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:37,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 04:06:41,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:06:43,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 04:06:46,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:06:46,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:06:46,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 04:06:48,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:06:51,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:06:52,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=243613.33333333334, ans=0.1 2023-09-29 04:06:56,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:00,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:04,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:04,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:07:06,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:07:06,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 04:07:07,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 04:07:07,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 04:07:07,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 04:07:10,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:18,476 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 1.995e+02 2.331e+02 2.666e+02 3.727e+02, threshold=4.663e+02, percent-clipped=0.0 2023-09-29 04:07:18,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:07:18,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:18,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 04:07:18,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:20,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:20,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:07:22,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:07:24,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:07:24,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:07:26,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:07:27,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:29,939 INFO [train.py:1039] (3/4) Epoch 7, batch 4700, loss[loss=0.2019, simple_loss=0.2707, pruned_loss=0.06657, over 24251.00 frames. ], tot_loss[loss=0.2243, simple_loss=0.2888, pruned_loss=0.07988, over 4717650.72 frames. ], batch size: 56, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:07:30,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:07:30,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:07:31,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:07:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:07:33,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 04:07:40,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:42,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:07:44,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:07:45,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:07:47,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:07:50,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 04:07:51,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 04:07:53,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:07:55,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:07:57,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:08:01,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:01,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=243946.66666666666, ans=0.02 2023-09-29 04:08:01,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=243946.66666666666, ans=0.1 2023-09-29 04:08:06,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:08:08,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=243946.66666666666, ans=0.1 2023-09-29 04:08:09,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 04:08:11,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:08:15,143 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.34 vs. limit=15.0 2023-09-29 04:08:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 04:08:17,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=244013.33333333334, ans=0.125 2023-09-29 04:08:19,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:08:22,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:22,556 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:08:25,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 04:08:26,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:08:32,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:08:32,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 04:08:33,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:33,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:36,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:08:39,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:08:39,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 04:08:39,184 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 04:08:41,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:08:43,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:43,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 04:08:46,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:08:49,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 04:08:50,857 INFO [train.py:1039] (3/4) Epoch 7, batch 4750, loss[loss=0.1972, simple_loss=0.27, pruned_loss=0.06219, over 24600.00 frames. ], tot_loss[loss=0.2238, simple_loss=0.289, pruned_loss=0.07928, over 4730416.20 frames. ], batch size: 60, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:08:52,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:08:52,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:52,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=244146.66666666666, ans=0.2 2023-09-29 04:08:54,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=244146.66666666666, ans=0.0 2023-09-29 04:08:58,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:08:58,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:09:01,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 04:09:01,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:04,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.06 vs. limit=22.5 2023-09-29 04:09:06,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 04:09:09,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:09:09,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:09:09,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:09,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=244213.33333333334, ans=10.0 2023-09-29 04:09:14,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 04:09:19,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:09:19,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=244213.33333333334, ans=0.125 2023-09-29 04:09:20,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 04:09:21,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:09:25,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:09:25,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:25,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=244280.0, ans=0.2 2023-09-29 04:09:26,974 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 04:09:26,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 04:09:34,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 04:09:36,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten.whitening_limit, batch_count=244280.0, ans=15.0 2023-09-29 04:09:37,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:40,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:09:42,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:09:42,836 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 04:09:42,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:09:43,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=244346.66666666666, ans=0.0 2023-09-29 04:09:46,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:09:48,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:09:51,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 04:09:51,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 04:09:51,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:09:53,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:09:53,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:09:54,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:09:54,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 04:09:57,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 04:10:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:01,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=244413.33333333334, ans=0.0 2023-09-29 04:10:02,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.924e+02 2.114e+02 2.511e+02 3.995e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 04:10:02,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:10:02,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 04:10:02,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:04,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:06,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:10:07,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:08,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:10:10,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:10,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_ff3.min_abs, batch_count=244480.0, ans=0.2 2023-09-29 04:10:11,387 INFO [train.py:1039] (3/4) Epoch 7, batch 4800, loss[loss=0.2283, simple_loss=0.2974, pruned_loss=0.07965, over 24387.00 frames. ], tot_loss[loss=0.2245, simple_loss=0.2903, pruned_loss=0.07935, over 4735074.20 frames. ], batch size: 77, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:10:11,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 04:10:11,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 04:10:11,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=244480.0, ans=0.0 2023-09-29 04:10:13,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 04:10:17,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:10:19,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:10:20,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 04:10:24,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=244480.0, ans=0.0 2023-09-29 04:10:26,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:27,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:27,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=244546.66666666666, ans=0.1 2023-09-29 04:10:32,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:10:32,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:32,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:33,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 04:10:35,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:10:35,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:10:35,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:10:38,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=244546.66666666666, ans=0.0 2023-09-29 04:10:41,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:10:42,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:42,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:10:43,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.26 vs. limit=22.5 2023-09-29 04:10:44,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:44,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 04:10:44,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:44,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:10:48,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:10:48,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=244613.33333333334, ans=0.125 2023-09-29 04:10:50,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=244613.33333333334, ans=0.0 2023-09-29 04:10:51,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:10:53,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:10:55,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:10:57,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:10:58,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 04:10:58,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 04:11:00,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:00,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:11:00,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:11:00,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:00,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:11:02,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:11:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:11:07,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:09,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:11,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:15,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 04:11:17,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:17,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:17,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:11:17,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:23,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:11:23,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:11:23,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:11:24,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:11:26,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:11:30,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:30,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:30,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:11:31,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 04:11:34,688 INFO [train.py:1039] (3/4) Epoch 7, batch 4850, loss[loss=0.2916, simple_loss=0.3296, pruned_loss=0.1268, over 19959.00 frames. ], tot_loss[loss=0.2263, simple_loss=0.2914, pruned_loss=0.08063, over 4703733.54 frames. ], batch size: 388, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:11:36,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 04:11:36,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:36,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:11:37,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:11:37,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:11:40,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:11:48,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 04:11:49,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:11:53,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.01 vs. limit=15.0 2023-09-29 04:11:54,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:11:54,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:11:56,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:00,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:12:00,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:12:04,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:12:04,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 04:12:04,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.85 vs. limit=15.0 2023-09-29 04:12:07,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:12:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:12:10,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:12:10,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:12:10,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 04:12:14,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:12:14,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:17,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:18,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 04:12:19,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 04:12:20,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:12:22,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=245013.33333333334, ans=0.0 2023-09-29 04:12:25,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:12:27,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 04:12:29,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:12:29,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:12:31,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:12:33,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 04:12:33,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:35,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 04:12:35,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:12:38,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:12:38,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 04:12:46,411 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.770e+02 2.140e+02 2.410e+02 2.869e+02 4.952e+02, threshold=4.821e+02, percent-clipped=3.0 2023-09-29 04:12:46,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:12:52,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:12:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:12:55,907 INFO [train.py:1039] (3/4) Epoch 7, batch 4900, loss[loss=0.2147, simple_loss=0.2582, pruned_loss=0.08559, over 22630.00 frames. ], tot_loss[loss=0.2266, simple_loss=0.291, pruned_loss=0.0811, over 4695643.13 frames. ], batch size: 322, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:12:57,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.81 vs. limit=15.0 2023-09-29 04:12:57,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 04:12:57,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:13:04,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:05,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:05,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:13:09,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 04:13:13,764 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.39 vs. limit=15.0 2023-09-29 04:13:15,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 04:13:19,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 04:13:20,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 04:13:20,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:22,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:13:22,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:13:22,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:22,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:13:22,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 04:13:25,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 04:13:25,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:13:28,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:13:28,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:13:31,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:13:31,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:33,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:33,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 04:13:33,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:13:35,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:13:35,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 04:13:35,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 04:13:40,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 04:13:42,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:13:44,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:13:44,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:13:46,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:13:46,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:13:46,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:13:47,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 04:13:50,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:13:52,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:13:53,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:13:57,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 04:13:58,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:13:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:13:58,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 04:14:01,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.42 vs. limit=22.5 2023-09-29 04:14:03,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:05,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:06,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 04:14:06,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:06,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:14:10,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:14,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:14,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:14:15,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:14:15,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 04:14:16,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:14:18,185 INFO [train.py:1039] (3/4) Epoch 7, batch 4950, loss[loss=0.2109, simple_loss=0.2852, pruned_loss=0.06827, over 24625.00 frames. ], tot_loss[loss=0.2245, simple_loss=0.289, pruned_loss=0.08007, over 4692816.38 frames. ], batch size: 68, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:14:18,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:20,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:14:23,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 04:14:24,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 04:14:24,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:14:25,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 04:14:25,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:26,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:14:26,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:14:26,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:27,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=245480.0, ans=0.125 2023-09-29 04:14:28,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:14:30,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:14:31,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:14:32,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:14:33,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=245546.66666666666, ans=0.125 2023-09-29 04:14:34,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:34,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:14:34,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=245546.66666666666, ans=0.04949747468305833 2023-09-29 04:14:38,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:14:38,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=245546.66666666666, ans=0.125 2023-09-29 04:14:47,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:48,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:14:50,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:14:50,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:52,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:14:53,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 04:14:53,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 04:14:57,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:14:57,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=245613.33333333334, ans=0.125 2023-09-29 04:14:58,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:15:00,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:15:00,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=245613.33333333334, ans=0.0 2023-09-29 04:15:01,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:01,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:15:02,562 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.37 vs. limit=15.0 2023-09-29 04:15:03,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:15:04,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:07,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:15:08,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:15:09,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:09,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:11,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 04:15:11,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:15:15,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:15:18,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:15:18,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=245680.0, ans=0.125 2023-09-29 04:15:21,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:15:21,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:15:21,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:22,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:15:23,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:15:25,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:15:26,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:15:26,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:15:27,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=245746.66666666666, ans=0.0 2023-09-29 04:15:28,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 04:15:31,893 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.749e+02 2.077e+02 2.324e+02 2.627e+02 6.143e+02, threshold=4.647e+02, percent-clipped=3.0 2023-09-29 04:15:32,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:39,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 04:15:39,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:15:40,899 INFO [train.py:1039] (3/4) Epoch 7, batch 5000, loss[loss=0.2266, simple_loss=0.2878, pruned_loss=0.0827, over 13658.00 frames. ], tot_loss[loss=0.2235, simple_loss=0.2878, pruned_loss=0.07966, over 4681846.32 frames. ], batch size: 30, lr: 1.43e-02, grad_scale: 32.0 2023-09-29 04:15:44,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.82 vs. limit=15.0 2023-09-29 04:15:47,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:15:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:15:49,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 04:15:50,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 04:15:50,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=245813.33333333334, ans=0.125 2023-09-29 04:15:52,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=245813.33333333334, ans=0.125 2023-09-29 04:15:53,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:15:56,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 04:15:56,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:15:56,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:15:56,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 04:15:56,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:15:58,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:15:59,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 04:15:59,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:15:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:01,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 04:16:01,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 04:16:03,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:16:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 04:16:03,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:16:04,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:04,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:16:04,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 04:16:04,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 04:16:07,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 04:16:07,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:08,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 04:16:11,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:16:11,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:13,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:16:14,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=245946.66666666666, ans=0.025 2023-09-29 04:16:16,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:16:19,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 04:16:19,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:16:21,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:16:23,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=245946.66666666666, ans=0.0 2023-09-29 04:16:24,594 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 04:16:28,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:16:28,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:16:28,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:16:33,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 04:16:34,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:16:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:16:34,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:16:36,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 04:16:37,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:40,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:16:42,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:16:44,796 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:16:49,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=246080.0, ans=0.0 2023-09-29 04:16:50,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 04:16:50,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=246080.0, ans=0.1 2023-09-29 04:16:53,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:16:59,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=246080.0, ans=0.2 2023-09-29 04:17:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:03,839 INFO [train.py:1039] (3/4) Epoch 7, batch 5050, loss[loss=0.2399, simple_loss=0.2955, pruned_loss=0.09216, over 23669.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.2887, pruned_loss=0.07984, over 4694022.11 frames. ], batch size: 164, lr: 1.43e-02, grad_scale: 16.0 2023-09-29 04:17:04,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:05,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:17:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:05,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:17:06,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:17:08,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:10,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=246146.66666666666, ans=0.5 2023-09-29 04:17:13,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:13,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 04:17:14,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:17:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:18,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:17:18,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 04:17:20,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:20,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:17:23,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:17:23,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:17:24,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:17:31,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 04:17:31,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:17:33,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:33,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 04:17:33,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:17:35,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:37,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:17:37,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:17:37,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 04:17:38,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 04:17:40,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:42,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:17:45,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:17:46,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 04:17:48,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:17:50,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 04:17:52,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:17:52,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:17:52,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:17:53,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:17:56,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:17:58,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:17:59,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:17:59,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:17:59,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:17:59,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 04:17:59,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:18:03,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:18:06,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:18:06,592 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 04:18:06,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:18:08,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:10,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:10,347 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 04:18:13,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:13,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 04:18:13,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:17,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=246413.33333333334, ans=0.125 2023-09-29 04:18:18,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:18,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:18:18,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 04:18:18,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=246413.33333333334, ans=0.1 2023-09-29 04:18:19,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 2.259e+02 2.586e+02 3.154e+02 5.284e+02, threshold=5.172e+02, percent-clipped=3.0 2023-09-29 04:18:20,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 04:18:24,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:24,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:18:24,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:18:26,482 INFO [train.py:1039] (3/4) Epoch 7, batch 5100, loss[loss=0.2191, simple_loss=0.3037, pruned_loss=0.06725, over 24327.00 frames. ], tot_loss[loss=0.2246, simple_loss=0.2893, pruned_loss=0.07992, over 4704067.47 frames. ], batch size: 74, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:18:28,077 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 04:18:31,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:18:35,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 04:18:35,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 04:18:35,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:37,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:18:42,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:18:42,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 04:18:42,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 04:18:47,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:18:47,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:18:53,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:18:55,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.67 vs. limit=15.0 2023-09-29 04:18:57,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 04:18:57,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:00,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:19:00,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 04:19:02,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:03,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 04:19:07,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 04:19:07,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:07,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 04:19:07,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 04:19:12,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:19:16,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=246680.0, ans=0.0 2023-09-29 04:19:21,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=246680.0, ans=0.125 2023-09-29 04:19:22,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:24,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=246680.0, ans=0.1 2023-09-29 04:19:25,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 04:19:27,567 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 04:19:27,579 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 04:19:29,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 04:19:29,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:19:32,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 04:19:32,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=246746.66666666666, ans=0.125 2023-09-29 04:19:34,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=246746.66666666666, ans=0.0 2023-09-29 04:19:35,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 04:19:37,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 04:19:38,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:19:40,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 04:19:43,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:19:45,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 04:19:48,728 INFO [train.py:1039] (3/4) Epoch 7, batch 5150, loss[loss=0.2302, simple_loss=0.3065, pruned_loss=0.07691, over 24315.00 frames. ], tot_loss[loss=0.2247, simple_loss=0.2902, pruned_loss=0.07965, over 4720481.37 frames. ], batch size: 77, lr: 1.42e-02, grad_scale: 8.0 2023-09-29 04:19:49,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.62 vs. limit=15.0 2023-09-29 04:19:50,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:19:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:19:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:19:51,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:19:51,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:19:53,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:19:55,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 04:19:55,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 04:19:55,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 04:19:55,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:19:55,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 04:19:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:19:58,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:19:59,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:01,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:02,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=246813.33333333334, ans=0.2 2023-09-29 04:20:03,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=246880.0, ans=0.1 2023-09-29 04:20:07,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:20:07,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 04:20:08,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:09,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:20:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:20:11,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:11,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:12,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:20:12,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:20:12,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 04:20:14,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:20:14,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:15,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:20:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 04:20:19,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:20:26,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:20:29,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 04:20:32,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:20:39,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:20:40,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:20:43,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:44,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:47,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 04:20:49,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:20:50,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:20:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:20:55,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:20:55,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:20:55,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 04:21:00,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:03,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:21:05,025 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.008e+02 2.205e+02 2.538e+02 3.618e+02, threshold=4.410e+02, percent-clipped=0.0 2023-09-29 04:21:05,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:21:05,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:21:06,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:21:06,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:21:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:21:06,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:21:10,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:21:11,426 INFO [train.py:1039] (3/4) Epoch 7, batch 5200, loss[loss=0.2764, simple_loss=0.3166, pruned_loss=0.1181, over 19727.00 frames. ], tot_loss[loss=0.2249, simple_loss=0.2903, pruned_loss=0.07976, over 4722487.34 frames. ], batch size: 389, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:21:12,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:21:14,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:19,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 04:21:21,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:21:22,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:25,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:27,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:21:27,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:30,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 04:21:33,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:21:33,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.36 vs. limit=12.0 2023-09-29 04:21:35,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:36,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 04:21:37,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=247213.33333333334, ans=0.0 2023-09-29 04:21:39,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:21:40,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.68 vs. limit=15.0 2023-09-29 04:21:41,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:21:41,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.94 vs. limit=15.0 2023-09-29 04:21:42,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 04:21:42,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 04:21:45,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 04:21:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:21:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 04:21:47,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:21:48,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:21:49,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:21:50,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 04:21:50,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:21:50,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=247280.0, ans=0.04949747468305833 2023-09-29 04:21:53,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:21:53,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=247280.0, ans=0.1 2023-09-29 04:21:55,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=247280.0, ans=0.125 2023-09-29 04:21:57,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 04:21:57,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 04:21:57,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 04:22:01,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 04:22:03,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:22:06,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=247346.66666666666, ans=0.04949747468305833 2023-09-29 04:22:09,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:22:09,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:12,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 04:22:12,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:22:12,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=247346.66666666666, ans=0.05 2023-09-29 04:22:13,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:22:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:13,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:18,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:18,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:22:19,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=247413.33333333334, ans=0.125 2023-09-29 04:22:21,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:22:22,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:22,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:27,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:28,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 04:22:30,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:22:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:22:31,763 INFO [train.py:1039] (3/4) Epoch 7, batch 5250, loss[loss=0.2318, simple_loss=0.2818, pruned_loss=0.09087, over 23573.00 frames. ], tot_loss[loss=0.2245, simple_loss=0.2899, pruned_loss=0.07957, over 4717769.05 frames. ], batch size: 149, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:22:31,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:22:31,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:22:34,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:22:35,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:22:40,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:40,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:22:41,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:22:48,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:22:50,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:22:52,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:22:53,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:22:55,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 04:22:55,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:22:57,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:23:12,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=247613.33333333334, ans=0.035 2023-09-29 04:23:17,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=247680.0, ans=0.125 2023-09-29 04:23:24,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=247680.0, ans=0.0 2023-09-29 04:23:25,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=247680.0, ans=0.04949747468305833 2023-09-29 04:23:25,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=14.65 vs. limit=15.0 2023-09-29 04:23:29,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=247680.0, ans=0.125 2023-09-29 04:23:32,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.63 vs. limit=15.0 2023-09-29 04:23:41,159 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.140e+02 2.318e+02 2.697e+02 3.802e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 04:23:47,147 INFO [train.py:1039] (3/4) Epoch 7, batch 5300, loss[loss=0.1815, simple_loss=0.252, pruned_loss=0.05549, over 24583.00 frames. ], tot_loss[loss=0.2238, simple_loss=0.289, pruned_loss=0.07934, over 4726159.44 frames. ], batch size: 60, lr: 1.42e-02, grad_scale: 16.0 2023-09-29 04:24:01,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:24:01,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 04:24:01,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 04:24:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:02,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:02,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:02,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:02,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:02,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:24:03,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:24:03,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 04:24:03,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 04:24:03,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 04:24:03,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:24:03,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 04:24:03,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 04:24:04,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:04,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:05,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:05,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:05,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:24:05,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:05,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:24:05,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:06,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:24:06,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:24:06,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:24:06,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:06,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:24:07,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 04:24:07,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:24:07,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:24:07,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 04:24:07,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 04:24:07,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:24:07,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 04:24:08,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 04:24:08,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:09,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:24:09,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:24:09,786 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 04:24:09,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 04:24:09,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:24:10,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:24:10,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 04:24:10,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 04:24:10,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 04:24:10,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:24:19,229 INFO [train.py:1039] (3/4) Epoch 8, batch 0, loss[loss=0.2164, simple_loss=0.2908, pruned_loss=0.07095, over 24443.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2908, pruned_loss=0.07095, over 24443.00 frames. ], batch size: 69, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:24:19,230 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 04:24:33,515 INFO [train.py:1071] (3/4) Epoch 8, validation: loss=0.2869, simple_loss=0.2985, pruned_loss=0.1377, over 1125622.00 frames. 2023-09-29 04:24:33,516 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 04:24:33,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 04:24:35,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:24:36,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:24:41,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:41,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:24:41,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:42,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 04:24:44,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 04:24:47,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:49,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:24:53,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:24:55,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:24:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:24:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 04:25:01,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:25:10,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:25:10,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:13,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 04:25:17,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:25:17,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:25:19,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:24,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:25:30,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:25:35,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 04:25:39,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 04:25:39,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:25:39,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:40,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:25:41,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:25:42,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=248160.0, ans=0.0 2023-09-29 04:25:46,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 04:25:47,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:49,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:25:52,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:25:52,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=248160.0, ans=0.0 2023-09-29 04:25:55,320 INFO [train.py:1039] (3/4) Epoch 8, batch 50, loss[loss=0.2339, simple_loss=0.2889, pruned_loss=0.08945, over 23606.00 frames. ], tot_loss[loss=0.2242, simple_loss=0.2905, pruned_loss=0.07891, over 1079898.65 frames. ], batch size: 256, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:25:55,412 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 04:25:55,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:26:00,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:01,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:01,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 04:26:03,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:26:03,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:26:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:07,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:26:09,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:26:12,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 04:26:14,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:23,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:26:24,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 04:26:26,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 04:26:27,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:26:29,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:26:29,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:31,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:26:31,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:26:32,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:26:32,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:26:40,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:42,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:26:42,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:26:42,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 04:26:45,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:26:46,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:26:46,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 04:26:48,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:26:48,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 04:26:48,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=248426.66666666666, ans=0.0 2023-09-29 04:26:50,456 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.737e+02 2.177e+02 2.443e+02 2.821e+02 4.431e+02, threshold=4.886e+02, percent-clipped=0.0 2023-09-29 04:26:56,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:26:57,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:26:58,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:00,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:00,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 04:27:04,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 04:27:05,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:27:07,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:27:07,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:27:07,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 04:27:08,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 04:27:10,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 04:27:12,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:12,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:27:13,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 04:27:13,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 04:27:13,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:15,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:16,870 INFO [train.py:1039] (3/4) Epoch 8, batch 100, loss[loss=0.2258, simple_loss=0.2999, pruned_loss=0.07591, over 24641.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.2912, pruned_loss=0.07829, over 1905106.70 frames. ], batch size: 68, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:27:16,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:27:16,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:27:18,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:27:23,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:27:26,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:30,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 04:27:30,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:27:34,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:27:34,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:34,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:27:34,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:27:34,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:27:37,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 04:27:38,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:27:40,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:40,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:40,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:27:44,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 04:27:44,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:27:46,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:27:48,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:27:49,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:27:51,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=248693.33333333334, ans=0.04949747468305833 2023-09-29 04:27:52,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.10 vs. limit=15.0 2023-09-29 04:27:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 04:27:52,908 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 04:27:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:27:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:27:59,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:28:01,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:01,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:07,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:09,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 04:28:10,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:28:13,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:15,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:28:18,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:20,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:23,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:24,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:28:25,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=248826.66666666666, ans=0.125 2023-09-29 04:28:29,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:29,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:31,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:31,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:28:31,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:28:33,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 04:28:33,045 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 04:28:33,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:33,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:28:34,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:34,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:34,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 04:28:34,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:28:34,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:28:34,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:35,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=248826.66666666666, ans=0.125 2023-09-29 04:28:36,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:28:39,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:40,431 INFO [train.py:1039] (3/4) Epoch 8, batch 150, loss[loss=0.3045, simple_loss=0.3464, pruned_loss=0.1313, over 19619.00 frames. ], tot_loss[loss=0.2252, simple_loss=0.2924, pruned_loss=0.079, over 2528441.88 frames. ], batch size: 388, lr: 1.34e-02, grad_scale: 32.0 2023-09-29 04:28:40,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:28:40,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:28:43,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:28:46,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:28:46,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:28:46,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:48,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:28:50,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:51,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:28:51,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:28:52,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=248893.33333333334, ans=0.1 2023-09-29 04:28:55,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 04:28:56,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 04:28:56,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 04:28:59,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:28:59,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:29:01,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:29:03,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=248960.0, ans=0.0 2023-09-29 04:29:04,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:29:04,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:05,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:06,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:29:08,213 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 04:29:11,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:29:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:17,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=249026.66666666666, ans=0.125 2023-09-29 04:29:17,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=249026.66666666666, ans=0.125 2023-09-29 04:29:18,182 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.70 vs. limit=15.0 2023-09-29 04:29:20,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:29:20,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 04:29:24,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:29:24,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:29:24,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:26,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:29:28,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:29:30,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:29:31,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:31,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 04:29:36,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.113e+02 2.401e+02 2.733e+02 5.079e+02, threshold=4.803e+02, percent-clipped=1.0 2023-09-29 04:29:38,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:40,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:29:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:29:40,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:29:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:29:45,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 04:29:48,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:29:50,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:29:52,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:29:55,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:29:55,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 04:29:55,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:29:55,495 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 04:29:58,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:03,307 INFO [train.py:1039] (3/4) Epoch 8, batch 200, loss[loss=0.2074, simple_loss=0.2801, pruned_loss=0.06731, over 24661.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2911, pruned_loss=0.07845, over 3008981.07 frames. ], batch size: 65, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:30:03,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:30:03,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:30:05,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 04:30:07,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:07,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:10,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 04:30:11,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 04:30:13,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:15,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:18,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:30:19,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:30:19,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:20,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=249293.33333333334, ans=0.0 2023-09-29 04:30:27,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=249293.33333333334, ans=0.0 2023-09-29 04:30:28,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=249293.33333333334, ans=0.125 2023-09-29 04:30:43,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:30:43,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:30:44,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:30:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:30:46,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:30:46,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:30:47,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:30:49,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:30:49,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:30:49,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:30:51,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 04:30:53,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 04:30:53,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:30:55,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=249426.66666666666, ans=0.0 2023-09-29 04:30:55,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=249426.66666666666, ans=0.125 2023-09-29 04:30:58,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:31:03,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:31:09,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:09,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:31:16,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:19,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 04:31:21,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:21,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:31:21,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:22,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:31:24,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 04:31:25,899 INFO [train.py:1039] (3/4) Epoch 8, batch 250, loss[loss=0.2215, simple_loss=0.2689, pruned_loss=0.08702, over 23353.00 frames. ], tot_loss[loss=0.224, simple_loss=0.2907, pruned_loss=0.07869, over 3384089.68 frames. ], batch size: 285, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:31:25,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:31:26,006 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 04:31:28,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:29,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:31:32,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:32,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:31:34,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:31:36,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:31:36,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:31:39,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:31:44,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=249626.66666666666, ans=0.125 2023-09-29 04:31:50,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:31:55,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:31:55,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:32:03,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:32:03,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:32:04,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:32:05,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:05,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:32:05,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:32:05,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:32:10,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:32:13,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 04:32:13,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:32:16,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:32:16,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:32:16,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:32:17,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:17,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:32:17,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:32:20,837 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.722e+02 2.201e+02 2.590e+02 2.939e+02 4.400e+02, threshold=5.181e+02, percent-clipped=0.0 2023-09-29 04:32:20,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:32:22,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:24,726 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.43 vs. limit=22.5 2023-09-29 04:32:27,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:32:33,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:36,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:32:38,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=249826.66666666666, ans=0.0 2023-09-29 04:32:41,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:32:43,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:32:46,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 04:32:47,800 INFO [train.py:1039] (3/4) Epoch 8, batch 300, loss[loss=0.2242, simple_loss=0.2561, pruned_loss=0.09618, over 19489.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.2879, pruned_loss=0.07835, over 3673373.63 frames. ], batch size: 389, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:32:47,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:32:47,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:32:49,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 04:32:50,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:32:51,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:32:51,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 04:32:55,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:32:56,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:00,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:33:00,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 04:33:02,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:33:04,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:33:04,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 04:33:04,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:08,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:33:14,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:33:16,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 04:33:19,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 04:33:19,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:33:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 04:33:22,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:33:25,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:33:27,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:33:27,941 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.44 vs. limit=6.0 2023-09-29 04:33:28,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:33:33,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 04:33:33,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 04:33:33,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:33:36,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:38,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 04:33:38,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=250093.33333333334, ans=0.035 2023-09-29 04:33:40,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:33:41,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=250093.33333333334, ans=0.1 2023-09-29 04:33:45,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:33:47,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=250093.33333333334, ans=0.1 2023-09-29 04:33:49,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:33:49,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 04:33:54,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:54,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:33:56,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:33:57,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:33:57,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 04:33:57,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:33:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:01,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 04:34:01,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:34:02,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:04,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:04,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:10,634 INFO [train.py:1039] (3/4) Epoch 8, batch 350, loss[loss=0.2309, simple_loss=0.2952, pruned_loss=0.08328, over 23333.00 frames. ], tot_loss[loss=0.2209, simple_loss=0.2867, pruned_loss=0.07757, over 3899688.42 frames. ], batch size: 93, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:34:12,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:12,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 04:34:15,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:34:25,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:26,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:29,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 04:34:30,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:34:30,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 04:34:33,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:33,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 04:34:35,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:36,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 04:34:38,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:34:40,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:34:41,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:34:43,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:43,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:34:45,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:34:45,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:45,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:34:46,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:34:46,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:34:48,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=250360.0, ans=0.2 2023-09-29 04:34:55,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:34:55,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:34:55,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:34:55,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:34:58,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=250360.0, ans=0.0 2023-09-29 04:34:59,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 04:34:59,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:35:01,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=250426.66666666666, ans=0.09899494936611666 2023-09-29 04:35:05,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:05,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:35:07,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 04:35:08,766 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.305e+02 2.882e+02 6.292e+02, threshold=4.610e+02, percent-clipped=1.0 2023-09-29 04:35:09,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:10,487 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 04:35:12,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 04:35:12,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:15,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:35:15,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 04:35:18,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:35:22,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:24,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:24,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:28,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:35:31,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:35:34,519 INFO [train.py:1039] (3/4) Epoch 8, batch 400, loss[loss=0.2311, simple_loss=0.3109, pruned_loss=0.07567, over 24307.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2859, pruned_loss=0.07719, over 4082651.99 frames. ], batch size: 74, lr: 1.33e-02, grad_scale: 32.0 2023-09-29 04:35:34,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:35:34,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 04:35:36,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:36,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:36,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:35:37,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:38,409 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:35:39,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:35:39,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:41,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 04:35:43,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 04:35:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:35:44,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.55 vs. limit=6.0 2023-09-29 04:35:44,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 04:35:46,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:47,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=250560.0, ans=0.125 2023-09-29 04:35:49,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:35:49,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:49,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 04:35:49,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:35:49,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:35:51,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:35:51,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:35:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 04:35:56,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 04:36:00,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=250626.66666666666, ans=0.125 2023-09-29 04:36:03,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:04,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=250626.66666666666, ans=0.1 2023-09-29 04:36:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 04:36:06,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=250626.66666666666, ans=0.0 2023-09-29 04:36:07,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 04:36:10,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:36:12,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:19,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 04:36:20,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=250693.33333333334, ans=0.125 2023-09-29 04:36:22,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:36:23,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=250760.0, ans=0.0 2023-09-29 04:36:24,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 04:36:26,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=250760.0, ans=0.0 2023-09-29 04:36:27,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:36:27,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:36:27,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=250760.0, ans=0.0 2023-09-29 04:36:29,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 04:36:33,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:36:36,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:36:39,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:36:43,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:36:44,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 04:36:46,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:36:47,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 04:36:50,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:36:50,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:36:52,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 04:36:55,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:36:55,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:36:55,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:36:56,773 INFO [train.py:1039] (3/4) Epoch 8, batch 450, loss[loss=0.2521, simple_loss=0.3014, pruned_loss=0.1014, over 23637.00 frames. ], tot_loss[loss=0.2213, simple_loss=0.2872, pruned_loss=0.07768, over 4224745.92 frames. ], batch size: 256, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:36:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 04:36:57,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:36:58,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:36:59,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:36:59,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 04:36:59,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:37:01,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:37:03,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:37:16,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:16,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:18,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 04:37:18,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 04:37:22,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=250960.0, ans=10.0 2023-09-29 04:37:23,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:37:23,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=250960.0, ans=0.125 2023-09-29 04:37:25,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=250960.0, ans=0.125 2023-09-29 04:37:26,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:28,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:32,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:37:34,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=251026.66666666666, ans=0.0 2023-09-29 04:37:35,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 04:37:35,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 04:37:38,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 04:37:38,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:37:38,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:37:40,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:37:43,857 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 04:37:43,871 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 04:37:44,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=251093.33333333334, ans=0.0 2023-09-29 04:37:45,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:37:47,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:37:49,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:37:52,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:37:52,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:37:54,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:37:54,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 04:37:56,015 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.144e+02 2.402e+02 2.848e+02 5.479e+02, threshold=4.804e+02, percent-clipped=2.0 2023-09-29 04:37:57,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:37:59,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:37:59,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:37:59,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 04:38:02,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=251160.0, ans=0.0 2023-09-29 04:38:04,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:38:04,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 04:38:05,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 04:38:07,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 04:38:07,979 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.11 vs. limit=15.0 2023-09-29 04:38:13,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:38:13,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=251160.0, ans=0.125 2023-09-29 04:38:14,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:17,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:38:18,557 INFO [train.py:1039] (3/4) Epoch 8, batch 500, loss[loss=0.1919, simple_loss=0.2708, pruned_loss=0.05646, over 24466.00 frames. ], tot_loss[loss=0.2226, simple_loss=0.2883, pruned_loss=0.07847, over 4331845.57 frames. ], batch size: 66, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:38:18,636 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 04:38:22,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:22,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=251226.66666666666, ans=0.125 2023-09-29 04:38:24,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:38:24,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:25,492 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 04:38:25,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 04:38:25,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:38:29,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:38:32,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 04:38:35,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:38:37,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:38:38,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:38:39,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:38:49,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:49,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:38:50,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 04:38:50,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 04:38:50,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 04:38:53,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=251360.0, ans=0.125 2023-09-29 04:38:54,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:38:55,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.27 vs. limit=15.0 2023-09-29 04:38:56,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:38:56,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:38:56,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:38:58,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 04:39:00,734 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 04:39:03,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:05,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:06,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:07,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:39:09,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 04:39:13,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:39:14,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:17,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:19,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:39:25,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:26,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=251493.33333333334, ans=0.0 2023-09-29 04:39:28,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 04:39:28,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:28,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:39:32,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 04:39:34,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:39:35,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:41,370 INFO [train.py:1039] (3/4) Epoch 8, batch 550, loss[loss=0.2376, simple_loss=0.3089, pruned_loss=0.08312, over 23977.00 frames. ], tot_loss[loss=0.2243, simple_loss=0.2897, pruned_loss=0.07946, over 4427930.74 frames. ], batch size: 86, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:39:41,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 04:39:43,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 04:39:43,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:43,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 04:39:44,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:39:44,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:39:44,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=251560.0, ans=0.0 2023-09-29 04:39:46,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:39:47,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:39:49,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:39:50,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:39:52,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 04:39:52,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:39:56,992 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:39:58,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:39:58,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:00,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:02,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:08,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 04:40:08,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=251626.66666666666, ans=0.125 2023-09-29 04:40:10,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 04:40:11,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:40:16,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:40:16,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:16,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:40:20,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:20,930 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 04:40:22,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:40:23,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:40:25,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:40:25,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 04:40:27,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:40:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:28,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 04:40:30,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 04:40:30,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:32,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:40:32,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:40:32,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:40:37,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:40:37,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:40:38,972 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.624e+02 2.030e+02 2.358e+02 2.809e+02 4.445e+02, threshold=4.716e+02, percent-clipped=0.0 2023-09-29 04:40:40,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:40:41,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:41,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=251760.0, ans=0.04949747468305833 2023-09-29 04:40:42,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 04:40:43,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:40:44,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:40:46,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:40:46,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:40:48,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 04:40:49,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 04:40:55,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 04:40:58,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 04:40:59,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:40:59,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:41:01,266 INFO [train.py:1039] (3/4) Epoch 8, batch 600, loss[loss=0.2277, simple_loss=0.2996, pruned_loss=0.07795, over 24044.00 frames. ], tot_loss[loss=0.2239, simple_loss=0.29, pruned_loss=0.07895, over 4511092.60 frames. ], batch size: 86, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:41:01,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:08,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:41:12,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:41:14,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 04:41:15,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:41:18,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:21,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:24,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 04:41:24,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:41:30,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 04:41:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:41:33,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:33,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:41:40,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:41:41,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:41:43,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:51,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:41:56,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:41:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:41:56,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:41:58,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=252093.33333333334, ans=0.125 2023-09-29 04:42:02,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 04:42:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:42:08,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:12,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 04:42:12,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:42:15,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 04:42:15,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:42:16,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:42:22,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 04:42:25,724 INFO [train.py:1039] (3/4) Epoch 8, batch 650, loss[loss=0.2369, simple_loss=0.2991, pruned_loss=0.08733, over 23191.00 frames. ], tot_loss[loss=0.2233, simple_loss=0.2888, pruned_loss=0.0789, over 4542611.12 frames. ], batch size: 119, lr: 1.33e-02, grad_scale: 16.0 2023-09-29 04:42:25,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 04:42:28,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:42:29,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:42:31,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:34,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 04:42:35,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:42:40,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:42:40,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:43,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:46,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 04:42:48,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:42:49,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:42:52,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:42:54,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:42:55,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:42:57,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:42:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:42:59,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:02,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:43:05,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:43:05,721 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 04:43:05,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:09,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=252360.0, ans=0.0 2023-09-29 04:43:10,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:11,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:11,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:11,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:43:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 04:43:14,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:43:14,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:43:16,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:43:16,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:43:16,608 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:43:18,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 04:43:19,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 04:43:19,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 04:43:19,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:19,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:43:21,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:43:21,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:43:23,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:43:26,233 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 2.104e+02 2.347e+02 2.945e+02 4.272e+02, threshold=4.693e+02, percent-clipped=0.0 2023-09-29 04:43:31,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:32,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:43:34,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:43:37,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:37,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:43:38,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:43:42,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.71 vs. limit=15.0 2023-09-29 04:43:46,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:43:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:47,889 INFO [train.py:1039] (3/4) Epoch 8, batch 700, loss[loss=0.224, simple_loss=0.2994, pruned_loss=0.07435, over 24341.00 frames. ], tot_loss[loss=0.2211, simple_loss=0.2868, pruned_loss=0.07768, over 4590466.40 frames. ], batch size: 77, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:43:47,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:43:48,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:43:52,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 04:43:52,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 04:43:54,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 04:43:56,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:43:59,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:44:01,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 04:44:06,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:09,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:44:11,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:13,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:44:13,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:44:16,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:44:16,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=252626.66666666666, ans=0.125 2023-09-29 04:44:19,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 04:44:19,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:44:21,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 04:44:22,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 04:44:26,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:44:26,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:44:27,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:44:32,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:44:34,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 04:44:38,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:38,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:44:38,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 04:44:40,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=252760.0, ans=0.125 2023-09-29 04:44:43,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:44:45,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:44:48,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:44:53,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:44:53,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 04:44:56,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 04:44:56,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 04:44:59,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:03,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:04,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:05,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:06,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 04:45:11,568 INFO [train.py:1039] (3/4) Epoch 8, batch 750, loss[loss=0.2212, simple_loss=0.2848, pruned_loss=0.07878, over 23600.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2862, pruned_loss=0.07736, over 4613264.51 frames. ], batch size: 149, lr: 1.33e-02, grad_scale: 8.0 2023-09-29 04:45:11,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 04:45:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 04:45:11,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 04:45:13,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 04:45:13,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 04:45:14,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:45:16,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 04:45:18,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:45:18,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:19,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:21,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:21,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:45:21,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:24,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:45:26,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:45:28,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:45:31,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:33,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:45:34,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 04:45:35,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:45:36,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:37,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:45:39,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 04:45:40,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 04:45:41,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:45:41,109 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=252960.0, ans=0.125 2023-09-29 04:45:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 04:45:44,044 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 04:45:45,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 04:45:45,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:45:45,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 04:45:47,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:45:50,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.16 vs. limit=22.5 2023-09-29 04:45:55,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:45:55,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:45:55,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:45:58,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:45:58,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:45:58,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=253093.33333333334, ans=0.0 2023-09-29 04:46:00,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 04:46:00,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:46:01,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 04:46:03,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:46:04,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=253093.33333333334, ans=0.125 2023-09-29 04:46:08,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:46:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 04:46:08,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:11,299 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.684e+02 2.056e+02 2.287e+02 2.694e+02 4.439e+02, threshold=4.575e+02, percent-clipped=0.0 2023-09-29 04:46:13,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:13,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:46:15,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:17,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=253160.0, ans=0.125 2023-09-29 04:46:18,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:46:23,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 04:46:24,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:24,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:46:28,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:31,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:32,944 INFO [train.py:1039] (3/4) Epoch 8, batch 800, loss[loss=0.1968, simple_loss=0.2697, pruned_loss=0.06193, over 24548.00 frames. ], tot_loss[loss=0.2214, simple_loss=0.2872, pruned_loss=0.07779, over 4635739.12 frames. ], batch size: 60, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:46:33,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 04:46:42,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:46:42,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:44,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:46:44,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:46,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:46,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:47,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:46:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:46:54,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:46:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 04:46:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:46:59,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:46:59,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:47:00,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:47:00,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 04:47:00,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:01,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 04:47:04,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:47:08,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:47:08,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:47:13,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:13,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:16,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:47:18,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:47:18,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 04:47:20,195 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 04:47:21,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 04:47:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 04:47:21,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:47:23,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:23,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:47:28,447 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 04:47:29,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 04:47:30,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=253426.66666666666, ans=0.125 2023-09-29 04:47:31,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:47:33,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 04:47:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:47:40,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:47:41,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 04:47:42,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:47:43,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=253493.33333333334, ans=0.125 2023-09-29 04:47:44,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 04:47:52,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:47:56,245 INFO [train.py:1039] (3/4) Epoch 8, batch 850, loss[loss=0.211, simple_loss=0.2928, pruned_loss=0.06453, over 24264.00 frames. ], tot_loss[loss=0.2213, simple_loss=0.2876, pruned_loss=0.07747, over 4658672.63 frames. ], batch size: 74, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:47:56,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:47:56,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 04:47:57,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:47:59,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:47:59,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 04:48:00,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:02,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:48:03,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:05,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:48:06,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:48:07,833 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-09-29 04:48:08,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 04:48:08,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 04:48:08,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 04:48:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:48:11,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:48:12,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:14,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:48:14,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:48:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:20,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:20,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 04:48:24,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 04:48:26,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:48:27,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=253693.33333333334, ans=0.2 2023-09-29 04:48:28,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 04:48:30,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=253693.33333333334, ans=0.1 2023-09-29 04:48:32,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 04:48:34,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 04:48:37,122 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 04:48:37,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:37,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:48:37,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 04:48:40,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:48:41,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 04:48:44,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 04:48:45,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:48:47,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:48:47,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:48:49,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=253760.0, ans=0.125 2023-09-29 04:48:50,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:48:52,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 04:48:52,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 04:48:56,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.738e+02 2.108e+02 2.249e+02 2.560e+02 3.769e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 04:48:57,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:48:57,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:48:58,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:48:58,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:48:59,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.88 vs. limit=15.0 2023-09-29 04:49:00,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:01,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:49:05,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 04:49:05,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=253826.66666666666, ans=0.0 2023-09-29 04:49:06,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:49:08,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:08,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:49:08,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=253826.66666666666, ans=0.125 2023-09-29 04:49:09,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=253826.66666666666, ans=0.1 2023-09-29 04:49:15,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=253826.66666666666, ans=0.125 2023-09-29 04:49:17,885 INFO [train.py:1039] (3/4) Epoch 8, batch 900, loss[loss=0.2306, simple_loss=0.2944, pruned_loss=0.08344, over 23987.00 frames. ], tot_loss[loss=0.2223, simple_loss=0.2885, pruned_loss=0.07802, over 4678080.17 frames. ], batch size: 86, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:49:18,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 04:49:19,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=253893.33333333334, ans=0.125 2023-09-29 04:49:20,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:49:20,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 04:49:20,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:20,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:49:21,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 04:49:28,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:49:33,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:33,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 04:49:36,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:49:36,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 04:49:38,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 04:49:38,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=253960.0, ans=0.07 2023-09-29 04:49:40,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:49:40,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:40,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 04:49:40,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:49:44,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=253960.0, ans=0.125 2023-09-29 04:49:50,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:49:50,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:49:50,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:49:52,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:49:56,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.61 vs. limit=15.0 2023-09-29 04:49:58,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 04:50:00,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:50:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:50:04,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 04:50:05,819 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 04:50:05,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 04:50:08,434 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.48 vs. limit=15.0 2023-09-29 04:50:15,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 04:50:15,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:50:15,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:50:18,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=254093.33333333334, ans=0.125 2023-09-29 04:50:21,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:21,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:50:21,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=254093.33333333334, ans=0.0 2023-09-29 04:50:25,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 04:50:25,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:50:26,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=254160.0, ans=0.1 2023-09-29 04:50:28,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 04:50:29,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:50:31,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:31,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:50:31,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:50:36,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 04:50:36,852 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 04:50:37,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=254160.0, ans=0.0 2023-09-29 04:50:38,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 04:50:38,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 04:50:41,190 INFO [train.py:1039] (3/4) Epoch 8, batch 950, loss[loss=0.211, simple_loss=0.2859, pruned_loss=0.06798, over 24674.00 frames. ], tot_loss[loss=0.2226, simple_loss=0.2885, pruned_loss=0.07831, over 4686264.39 frames. ], batch size: 68, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:50:41,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:50:46,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 04:50:51,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:50:53,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:53,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:50:55,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 04:50:55,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.78 vs. limit=15.0 2023-09-29 04:50:56,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.67 vs. limit=15.0 2023-09-29 04:50:56,858 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 04:51:01,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:02,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:03,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:04,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:51:04,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 04:51:04,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:51:05,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=254293.33333333334, ans=0.125 2023-09-29 04:51:07,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:08,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 04:51:08,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:12,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:12,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:51:13,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:51:13,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 04:51:15,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 04:51:18,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:51:20,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:51:24,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:51:24,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:51:26,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=254360.0, ans=0.0 2023-09-29 04:51:27,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 04:51:30,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 04:51:30,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 04:51:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:31,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:31,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:51:37,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 04:51:39,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:51:41,996 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.975e+02 2.273e+02 2.583e+02 4.078e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 04:51:42,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:51:42,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:51:42,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 04:51:42,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:42,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:51:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 04:51:48,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:51:50,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=254493.33333333334, ans=0.125 2023-09-29 04:51:52,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:51:52,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=254493.33333333334, ans=0.1 2023-09-29 04:51:59,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:51:59,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 04:52:01,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 04:52:04,006 INFO [train.py:1039] (3/4) Epoch 8, batch 1000, loss[loss=0.2022, simple_loss=0.2506, pruned_loss=0.07695, over 22697.00 frames. ], tot_loss[loss=0.2217, simple_loss=0.2874, pruned_loss=0.07801, over 4682911.96 frames. ], batch size: 322, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:52:04,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:52:08,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 04:52:09,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:15,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:52:16,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 04:52:16,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 04:52:22,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:22,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:52:23,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:24,676 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.86 vs. limit=22.5 2023-09-29 04:52:27,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 04:52:31,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 04:52:32,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 04:52:32,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:34,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 04:52:37,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 04:52:37,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 04:52:39,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:40,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:49,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:50,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:52:50,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:52:51,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:52:51,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 04:52:52,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:52:53,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 04:52:53,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:52:53,728 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 04:52:58,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 04:52:58,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 04:53:00,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 04:53:02,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:53:04,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=254760.0, ans=0.125 2023-09-29 04:53:09,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:53:10,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:10,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:53:13,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 04:53:14,266 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 04:53:15,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:53:15,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 04:53:16,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=254826.66666666666, ans=0.125 2023-09-29 04:53:17,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 04:53:19,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:19,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:53:22,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:53:22,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:53:25,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:53:27,357 INFO [train.py:1039] (3/4) Epoch 8, batch 1050, loss[loss=0.1904, simple_loss=0.2607, pruned_loss=0.06003, over 24455.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.286, pruned_loss=0.07747, over 4693091.12 frames. ], batch size: 58, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:53:30,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:53:30,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:53:30,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=254893.33333333334, ans=0.125 2023-09-29 04:53:32,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 04:53:33,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:38,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:53:39,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 04:53:40,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.96 vs. limit=22.5 2023-09-29 04:53:41,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 04:53:42,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:53:44,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:53:44,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 04:53:44,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 04:53:46,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 04:53:46,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=254960.0, ans=0.0 2023-09-29 04:53:47,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:53:47,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 04:53:51,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:53:51,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 04:53:51,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:53:57,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:53:59,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:54:00,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:54:03,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 04:54:03,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 04:54:03,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 04:54:07,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 04:54:10,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 04:54:11,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=255026.66666666666, ans=0.2 2023-09-29 04:54:12,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:15,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 04:54:19,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 04:54:19,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:54:19,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:54:22,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 04:54:27,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 04:54:28,814 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.059e+02 2.267e+02 2.771e+02 5.438e+02, threshold=4.534e+02, percent-clipped=2.0 2023-09-29 04:54:28,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 04:54:29,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 04:54:29,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=255093.33333333334, ans=0.125 2023-09-29 04:54:30,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:30,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 04:54:32,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 04:54:37,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:54:38,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 04:54:38,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:54:38,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:38,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:54:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 04:54:46,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 04:54:46,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 04:54:47,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 04:54:47,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:54:50,434 INFO [train.py:1039] (3/4) Epoch 8, batch 1100, loss[loss=0.1917, simple_loss=0.259, pruned_loss=0.0622, over 24298.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.2849, pruned_loss=0.07712, over 4693840.59 frames. ], batch size: 56, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:54:50,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=255226.66666666666, ans=0.125 2023-09-29 04:54:52,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:54:56,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:54:59,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=255226.66666666666, ans=0.0 2023-09-29 04:55:00,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 04:55:03,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 04:55:03,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:03,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 04:55:05,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:55:08,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 04:55:10,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:55:13,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 04:55:15,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 04:55:15,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 04:55:17,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:55:17,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:55:20,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:55:22,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=255360.0, ans=0.0 2023-09-29 04:55:23,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 04:55:28,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=255360.0, ans=0.0 2023-09-29 04:55:29,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:55:32,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 04:55:34,487 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 04:55:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:37,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:39,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:55:40,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:55:42,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 04:55:43,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:55:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 04:55:43,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:55:44,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:55:45,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 04:55:50,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 04:55:50,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 04:55:54,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 04:55:59,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 04:56:01,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 04:56:01,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 04:56:02,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:05,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:05,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:07,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 04:56:08,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:56:08,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:56:10,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 04:56:10,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 04:56:11,508 INFO [train.py:1039] (3/4) Epoch 8, batch 1150, loss[loss=0.2429, simple_loss=0.3156, pruned_loss=0.08509, over 24464.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.2854, pruned_loss=0.07676, over 4699837.29 frames. ], batch size: 66, lr: 1.32e-02, grad_scale: 16.0 2023-09-29 04:56:11,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 04:56:13,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:56:13,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:56:15,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:56:19,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:21,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 04:56:25,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:56:25,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:56:25,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 04:56:26,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:29,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 04:56:32,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:32,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:56:36,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 04:56:38,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:41,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:56:43,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:56:43,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 04:56:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 04:56:44,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:56:45,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.01 vs. limit=15.0 2023-09-29 04:56:48,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 04:56:49,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:56:51,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:57:03,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:08,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=255760.0, ans=0.0 2023-09-29 04:57:09,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:57:09,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 04:57:11,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:11,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:12,648 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 2.096e+02 2.373e+02 2.802e+02 4.520e+02, threshold=4.746e+02, percent-clipped=0.0 2023-09-29 04:57:17,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 04:57:19,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:27,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 04:57:30,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:32,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:57:32,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 04:57:32,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=255893.33333333334, ans=0.0 2023-09-29 04:57:33,622 INFO [train.py:1039] (3/4) Epoch 8, batch 1200, loss[loss=0.224, simple_loss=0.2834, pruned_loss=0.08228, over 23727.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.2859, pruned_loss=0.07658, over 4718542.52 frames. ], batch size: 212, lr: 1.32e-02, grad_scale: 32.0 2023-09-29 04:57:33,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 04:57:37,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:42,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 04:57:43,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 04:57:45,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:57:45,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:57:45,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 04:57:46,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:57:48,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 04:57:49,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:57:49,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:57:51,425 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 04:57:56,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 04:58:00,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 04:58:00,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.65 vs. limit=22.5 2023-09-29 04:58:03,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 04:58:04,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:05,577 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=6.64 vs. limit=15.0 2023-09-29 04:58:06,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:06,967 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 04:58:09,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:18,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 04:58:18,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:58:18,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 04:58:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 04:58:23,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 04:58:26,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 04:58:26,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:58:28,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:58:29,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:30,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 04:58:32,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 04:58:32,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 04:58:33,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 04:58:33,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 04:58:33,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 04:58:35,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:35,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 04:58:38,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:58:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:58:43,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 04:58:46,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 04:58:48,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 04:58:50,440 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 04:58:53,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:58:56,272 INFO [train.py:1039] (3/4) Epoch 8, batch 1250, loss[loss=0.3027, simple_loss=0.3441, pruned_loss=0.1307, over 19355.00 frames. ], tot_loss[loss=0.2208, simple_loss=0.2871, pruned_loss=0.07723, over 4710950.96 frames. ], batch size: 388, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 04:58:56,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 04:58:56,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 04:58:59,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 04:59:01,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 04:59:06,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 04:59:08,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:08,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 04:59:11,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 04:59:11,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 04:59:15,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 04:59:17,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:18,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 04:59:18,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:20,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 04:59:22,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2.whitening_limit, batch_count=256293.33333333334, ans=15.0 2023-09-29 04:59:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 04:59:25,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 04:59:25,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 04:59:25,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 04:59:27,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:30,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:31,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 04:59:36,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 04:59:36,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 04:59:40,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 04:59:41,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 04:59:43,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 04:59:43,060 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 04:59:43,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:43,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 04:59:46,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:50,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 04:59:51,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 04:59:52,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 04:59:52,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 04:59:52,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 04:59:56,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 04:59:57,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=256426.66666666666, ans=0.2 2023-09-29 04:59:58,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 04:59:58,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:00,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=256426.66666666666, ans=0.125 2023-09-29 05:00:01,795 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.911e+02 2.146e+02 2.412e+02 3.765e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 05:00:03,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:00:03,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:00:05,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 05:00:05,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:00:05,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:00:05,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:00:06,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:08,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 05:00:10,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:12,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:00:13,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:00:13,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=256493.33333333334, ans=0.125 2023-09-29 05:00:16,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:00:19,984 INFO [train.py:1039] (3/4) Epoch 8, batch 1300, loss[loss=0.2163, simple_loss=0.2799, pruned_loss=0.07629, over 23167.00 frames. ], tot_loss[loss=0.2218, simple_loss=0.2882, pruned_loss=0.07773, over 4715951.29 frames. ], batch size: 93, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:00:21,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:00:22,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 05:00:25,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:26,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:00:26,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:00:28,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:00:31,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:00:33,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 05:00:39,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:00:39,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:00:42,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 05:00:44,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:00:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:50,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:00:51,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:00:53,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:00:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:00:54,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:00:56,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 05:00:57,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=256693.33333333334, ans=0.125 2023-09-29 05:01:00,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.40 vs. limit=15.0 2023-09-29 05:01:02,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:01:02,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:01:04,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 05:01:06,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:01:09,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:01:09,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:01:10,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 05:01:10,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:12,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 05:01:13,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:01:17,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:01:17,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:01:22,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 05:01:22,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 05:01:23,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 05:01:27,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:01:27,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=256826.66666666666, ans=0.2 2023-09-29 05:01:30,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 05:01:32,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=256826.66666666666, ans=0.1 2023-09-29 05:01:34,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:42,314 INFO [train.py:1039] (3/4) Epoch 8, batch 1350, loss[loss=0.2279, simple_loss=0.2993, pruned_loss=0.07824, over 24447.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.2867, pruned_loss=0.07717, over 4722122.15 frames. ], batch size: 63, lr: 1.32e-02, grad_scale: 8.0 2023-09-29 05:01:42,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 05:01:46,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:48,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:01:51,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:01:51,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:01:54,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:01:54,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:01:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:02:00,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 05:02:02,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:03,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:02:05,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 05:02:05,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:02:07,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:02:07,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 05:02:10,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 05:02:12,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 05:02:14,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:14,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 05:02:21,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=257026.66666666666, ans=0.2 2023-09-29 05:02:27,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:02:37,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:37,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 05:02:40,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:02:41,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=257093.33333333334, ans=0.2 2023-09-29 05:02:41,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=257093.33333333334, ans=0.0 2023-09-29 05:02:44,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 05:02:44,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:02:45,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:02:47,107 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.144e+02 2.487e+02 2.898e+02 4.537e+02, threshold=4.974e+02, percent-clipped=1.0 2023-09-29 05:02:48,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.87 vs. limit=15.0 2023-09-29 05:02:48,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:02:50,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 05:02:53,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:02:58,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 05:03:00,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 05:03:05,410 INFO [train.py:1039] (3/4) Epoch 8, batch 1400, loss[loss=0.2083, simple_loss=0.2361, pruned_loss=0.09025, over 19046.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2848, pruned_loss=0.07648, over 4729576.94 frames. ], batch size: 389, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:03:05,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=257226.66666666666, ans=0.0 2023-09-29 05:03:08,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 05:03:10,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:03:12,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:03:13,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:03:18,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 05:03:19,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=257226.66666666666, ans=0.1 2023-09-29 05:03:22,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 05:03:22,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.63 vs. limit=12.0 2023-09-29 05:03:30,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:03:32,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:34,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:03:34,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:03:38,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:03:38,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 05:03:48,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:50,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:03:54,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 05:03:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:03:56,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:03:56,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:03:57,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:03:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:03:59,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:04:01,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:04:01,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 05:04:02,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:04:07,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:08,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=257426.66666666666, ans=0.125 2023-09-29 05:04:11,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:04:19,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 05:04:21,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:04:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:04:24,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 05:04:24,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:26,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=257493.33333333334, ans=0.125 2023-09-29 05:04:28,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:04:29,509 INFO [train.py:1039] (3/4) Epoch 8, batch 1450, loss[loss=0.2265, simple_loss=0.2928, pruned_loss=0.08005, over 23528.00 frames. ], tot_loss[loss=0.218, simple_loss=0.284, pruned_loss=0.07604, over 4716304.07 frames. ], batch size: 120, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:04:31,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:04:34,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:04:34,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:34,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 05:04:39,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:04:40,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=257560.0, ans=0.2 2023-09-29 05:04:41,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:04:42,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:04:42,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 05:04:44,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:04:46,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 05:04:46,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:48,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:48,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 05:04:50,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:04:50,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:04:51,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 05:04:51,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=257626.66666666666, ans=0.125 2023-09-29 05:04:52,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:04:54,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:04:56,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:04:59,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:01,064 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:05:02,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:05:02,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:05:05,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:05:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:05:10,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:05:10,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:05:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:13,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 05:05:15,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:05:17,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=257760.0, ans=0.125 2023-09-29 05:05:20,634 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 05:05:22,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:22,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:05:25,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:27,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 05:05:30,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:31,418 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.89 vs. limit=15.0 2023-09-29 05:05:32,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 05:05:33,284 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 2.099e+02 2.346e+02 2.740e+02 3.754e+02, threshold=4.692e+02, percent-clipped=0.0 2023-09-29 05:05:33,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 05:05:35,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:05:36,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:05:38,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:05:40,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 05:05:44,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 05:05:44,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 05:05:46,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:05:48,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:05:48,669 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:05:51,404 INFO [train.py:1039] (3/4) Epoch 8, batch 1500, loss[loss=0.2252, simple_loss=0.2974, pruned_loss=0.07656, over 24459.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2849, pruned_loss=0.07606, over 4720250.60 frames. ], batch size: 69, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:05:58,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 05:06:00,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:06:00,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:06:01,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:02,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:03,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:06:05,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 05:06:05,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:06:06,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:06:06,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:06:08,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:06:09,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:06:11,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:06:15,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 05:06:16,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:06:16,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:06:16,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:21,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 05:06:23,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=258026.66666666666, ans=0.1 2023-09-29 05:06:27,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 05:06:28,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:06:28,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 05:06:31,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:06:33,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:06:33,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:06:33,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:06:36,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 05:06:36,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:06:37,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.37 vs. limit=10.0 2023-09-29 05:06:38,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:38,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 05:06:39,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:06:44,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:06:44,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 05:06:52,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:06:53,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:06:59,018 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 05:07:00,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:00,513 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 05:07:02,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:02,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:03,042 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 05:07:05,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:07:07,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 05:07:11,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:12,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:12,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:14,175 INFO [train.py:1039] (3/4) Epoch 8, batch 1550, loss[loss=0.2932, simple_loss=0.3361, pruned_loss=0.1251, over 19433.00 frames. ], tot_loss[loss=0.219, simple_loss=0.2856, pruned_loss=0.07616, over 4714174.21 frames. ], batch size: 388, lr: 1.31e-02, grad_scale: 4.0 2023-09-29 05:07:14,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:07:14,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:07:15,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:07:17,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 05:07:18,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 05:07:18,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:07:20,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 05:07:20,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 05:07:22,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:23,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:23,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:25,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:07:27,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:27,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:07:29,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 05:07:29,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:30,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:07:30,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:07:32,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:07:32,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 05:07:34,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:07:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 05:07:36,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 05:07:36,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 05:07:38,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:07:40,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:42,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.11 vs. limit=15.0 2023-09-29 05:07:43,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:07:43,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 05:07:43,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 05:07:52,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=258360.0, ans=0.2 2023-09-29 05:07:53,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:07:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:07:57,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:07:57,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=258360.0, ans=0.0 2023-09-29 05:07:58,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:07:58,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 05:08:04,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:08:06,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:10,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:08:10,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=258426.66666666666, ans=0.0 2023-09-29 05:08:13,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:08:15,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:08:15,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 05:08:15,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:18,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:08:18,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:19,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:08:19,671 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 05:08:20,936 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.986e+02 2.177e+02 2.778e+02 5.075e+02, threshold=4.355e+02, percent-clipped=1.0 2023-09-29 05:08:21,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:27,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 05:08:31,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:31,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:08:33,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 05:08:36,172 INFO [train.py:1039] (3/4) Epoch 8, batch 1600, loss[loss=0.2114, simple_loss=0.2917, pruned_loss=0.06555, over 24659.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.286, pruned_loss=0.0759, over 4715824.58 frames. ], batch size: 73, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:08:37,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:08:37,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:08:37,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:08:37,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:08:39,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:08:42,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=258560.0, ans=0.1 2023-09-29 05:08:43,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:08:43,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 05:08:45,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 05:08:46,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 05:08:49,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.17 vs. limit=10.0 2023-09-29 05:08:50,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:08:50,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=258560.0, ans=0.1 2023-09-29 05:08:52,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 05:08:53,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:08:55,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:08:59,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:09:02,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 05:09:03,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=258626.66666666666, ans=0.035 2023-09-29 05:09:06,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:09:07,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 05:09:07,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:09,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 05:09:15,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 05:09:25,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:25,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 05:09:25,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=258760.0, ans=0.125 2023-09-29 05:09:25,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=258760.0, ans=0.1 2023-09-29 05:09:26,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:09:26,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:09:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:09:29,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:09:33,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:09:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:09:36,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:36,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:09:36,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=258760.0, ans=0.125 2023-09-29 05:09:39,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:09:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:09:41,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:09:49,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:09:49,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:09:52,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 05:09:52,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:09:53,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 05:09:57,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=258893.33333333334, ans=0.0 2023-09-29 05:09:58,307 INFO [train.py:1039] (3/4) Epoch 8, batch 1650, loss[loss=0.2288, simple_loss=0.296, pruned_loss=0.08079, over 23463.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2876, pruned_loss=0.07716, over 4719106.94 frames. ], batch size: 93, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:09:59,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:01,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:01,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:10:01,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 05:10:01,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 05:10:01,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 05:10:02,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 05:10:07,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:10:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:07,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:07,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:10:10,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:13,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 05:10:14,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=258960.0, ans=0.2 2023-09-29 05:10:16,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:10:16,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:10:16,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:10:16,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:10:16,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 05:10:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 05:10:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:10:25,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:10:35,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 05:10:35,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 05:10:39,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=259026.66666666666, ans=0.125 2023-09-29 05:10:41,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:10:43,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:10:43,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:10:44,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:10:46,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:10:47,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:10:50,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:10:50,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:51,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:10:53,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:10:54,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:10:58,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:10:58,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=259093.33333333334, ans=0.0 2023-09-29 05:10:59,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 05:11:01,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:11:02,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.647e+02 1.991e+02 2.189e+02 2.754e+02 4.240e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 05:11:02,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 05:11:02,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 05:11:03,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 05:11:03,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:05,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:11:05,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:06,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:11:06,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 05:11:09,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:11:11,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:11:11,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:14,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 05:11:14,601 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:11:18,719 INFO [train.py:1039] (3/4) Epoch 8, batch 1700, loss[loss=0.2219, simple_loss=0.2762, pruned_loss=0.08376, over 23644.00 frames. ], tot_loss[loss=0.2207, simple_loss=0.2874, pruned_loss=0.07704, over 4718725.32 frames. ], batch size: 134, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:11:18,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:11:18,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:11:18,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 05:11:19,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:19,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:11:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:25,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:11:25,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:11:25,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 05:11:25,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=259226.66666666666, ans=0.125 2023-09-29 05:11:27,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=259226.66666666666, ans=0.0 2023-09-29 05:11:28,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:11:31,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.13 vs. limit=15.0 2023-09-29 05:11:37,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:11:40,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:11:47,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:11:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:11:48,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:11:48,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:11:51,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 05:11:52,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=259360.0, ans=0.05 2023-09-29 05:11:53,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:11:53,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:11:54,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:11:56,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:11:57,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 05:11:58,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 05:12:00,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:00,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=259360.0, ans=0.2 2023-09-29 05:12:02,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 05:12:03,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:12:07,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=259426.66666666666, ans=0.125 2023-09-29 05:12:12,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:14,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:15,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:12:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:12:15,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 05:12:17,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:12:18,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:18,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 05:12:18,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:12:19,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:20,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:20,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:22,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:12:22,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:12:24,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:25,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:12:25,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:29,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:29,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 05:12:33,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:12:35,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:12:38,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 05:12:41,285 INFO [train.py:1039] (3/4) Epoch 8, batch 1750, loss[loss=0.2022, simple_loss=0.2651, pruned_loss=0.06963, over 18175.00 frames. ], tot_loss[loss=0.2207, simple_loss=0.2862, pruned_loss=0.07762, over 4694926.13 frames. ], batch size: 39, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:12:43,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:45,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:12:45,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:12:46,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.00 vs. limit=15.0 2023-09-29 05:12:47,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 05:12:47,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:12:47,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=259560.0, ans=0.2 2023-09-29 05:12:51,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:12:51,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:12:54,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 05:12:57,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:00,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 05:13:00,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:02,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:13:05,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:13:05,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 05:13:08,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:13:08,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 05:13:19,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:13:22,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:13:22,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:27,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:27,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:13:29,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:13:30,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:33,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:33,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:13:35,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 05:13:36,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:13:38,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=259760.0, ans=0.0 2023-09-29 05:13:39,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 05:13:39,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:42,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:42,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:13:47,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.659e+02 1.975e+02 2.294e+02 2.712e+02 4.778e+02, threshold=4.588e+02, percent-clipped=2.0 2023-09-29 05:13:47,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:13:47,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 05:13:49,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:13:49,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:13:56,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:13:58,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:00,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:14:02,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 05:14:02,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:03,442 INFO [train.py:1039] (3/4) Epoch 8, batch 1800, loss[loss=0.2168, simple_loss=0.2735, pruned_loss=0.08011, over 23731.00 frames. ], tot_loss[loss=0.2193, simple_loss=0.2843, pruned_loss=0.0772, over 4686857.65 frames. ], batch size: 232, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:14:03,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:14:03,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:03,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:14:03,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:14:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:14:08,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:14:08,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=259893.33333333334, ans=0.0 2023-09-29 05:14:09,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:14:11,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:14:14,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:14:17,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:14:20,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:14:23,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:25,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:25,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=259960.0, ans=0.125 2023-09-29 05:14:26,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:28,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:14:30,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:14:30,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 05:14:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:32,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=259960.0, ans=0.0 2023-09-29 05:14:34,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:38,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 05:14:39,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=260026.66666666666, ans=0.0 2023-09-29 05:14:41,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 05:14:41,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 05:14:41,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:14:41,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:14:41,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:14:42,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:14:50,443 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 05:14:50,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:14:52,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:14:55,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 05:14:55,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 05:14:57,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:14:59,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:15:00,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:15:06,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 05:15:12,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:15:12,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 05:15:14,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:15:14,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:14,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:15:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 05:15:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:15:17,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:20,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 05:15:20,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:15:21,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:21,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:15:21,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:15:24,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:15:25,493 INFO [train.py:1039] (3/4) Epoch 8, batch 1850, loss[loss=0.2396, simple_loss=0.3012, pruned_loss=0.08896, over 22878.00 frames. ], tot_loss[loss=0.2194, simple_loss=0.2848, pruned_loss=0.07697, over 4704065.97 frames. ], batch size: 322, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:15:27,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:15:27,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:15:29,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=260226.66666666666, ans=0.125 2023-09-29 05:15:30,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:15:32,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:15:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:15:42,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 05:15:45,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 05:15:48,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 05:15:52,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:15:52,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 05:15:52,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 05:16:02,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:16:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 05:16:07,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:07,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:12,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 05:16:14,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:14,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:16:14,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=260426.66666666666, ans=0.035 2023-09-29 05:16:15,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:16:18,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:16:21,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:16:24,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:16:24,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:24,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:16:26,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:16:27,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:29,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:16:30,632 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.959e+02 2.142e+02 2.407e+02 4.178e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 05:16:33,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 05:16:35,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:16:38,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:16:39,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:16:39,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 05:16:39,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 05:16:42,832 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 05:16:45,027 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 05:16:45,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:16:46,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:16:46,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:16:46,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:47,993 INFO [train.py:1039] (3/4) Epoch 8, batch 1900, loss[loss=0.2229, simple_loss=0.2858, pruned_loss=0.07997, over 23368.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2861, pruned_loss=0.07732, over 4708223.79 frames. ], batch size: 119, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:16:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 05:16:48,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:16:48,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:48,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=260560.0, ans=0.0 2023-09-29 05:16:49,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:16:51,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:16:52,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:16:52,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 05:16:54,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:16:54,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 05:16:54,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:16:55,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:00,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:17:03,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:17:03,827 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 05:17:05,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 05:17:05,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:17:06,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:17:06,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 05:17:08,340 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 05:17:11,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 05:17:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:17:19,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 05:17:20,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=260693.33333333334, ans=0.0 2023-09-29 05:17:22,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 05:17:31,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 05:17:33,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 05:17:33,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:17:33,488 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 05:17:33,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 05:17:33,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 05:17:33,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 05:17:33,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:17:35,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=260760.0, ans=0.5 2023-09-29 05:17:38,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 05:17:41,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:17:44,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:17:44,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 05:17:45,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.33 vs. limit=22.5 2023-09-29 05:17:46,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:17:51,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 05:17:51,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:17:57,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:17:57,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:17:57,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:17:59,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:18:00,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:18:00,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:18:01,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=260826.66666666666, ans=0.0 2023-09-29 05:18:02,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:18:05,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:05,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:08,566 INFO [train.py:1039] (3/4) Epoch 8, batch 1950, loss[loss=0.3198, simple_loss=0.3533, pruned_loss=0.1431, over 19735.00 frames. ], tot_loss[loss=0.2221, simple_loss=0.2873, pruned_loss=0.07846, over 4705338.20 frames. ], batch size: 388, lr: 1.31e-02, grad_scale: 8.0 2023-09-29 05:18:08,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:18:08,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:08,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:18:11,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:18:13,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:13,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=260893.33333333334, ans=0.1 2023-09-29 05:18:16,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:18:16,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:16,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:18:19,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 05:18:19,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:18:21,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:23,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:23,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=260960.0, ans=0.125 2023-09-29 05:18:27,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:18:27,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:27,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:27,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=260960.0, ans=0.125 2023-09-29 05:18:28,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:18:30,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:18:30,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:18:30,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:18:31,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:34,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=260960.0, ans=0.1 2023-09-29 05:18:35,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:38,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:18:38,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:18:38,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:18:38,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 05:18:38,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:18:39,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:18:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:18:44,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:18:47,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:18:50,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=261026.66666666666, ans=0.125 2023-09-29 05:18:51,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:18:55,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:18:55,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:18:55,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 05:18:56,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:00,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:19:01,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:19:02,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:08,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=261093.33333333334, ans=0.1 2023-09-29 05:19:12,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:14,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:15,356 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.025e+02 2.337e+02 2.726e+02 4.544e+02, threshold=4.674e+02, percent-clipped=3.0 2023-09-29 05:19:16,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:17,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=261160.0, ans=0.2 2023-09-29 05:19:19,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:21,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:19:23,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:19:25,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 05:19:25,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:19:26,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:19:26,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 05:19:30,278 INFO [train.py:1039] (3/4) Epoch 8, batch 2000, loss[loss=0.1961, simple_loss=0.2647, pruned_loss=0.0638, over 20395.00 frames. ], tot_loss[loss=0.2227, simple_loss=0.2877, pruned_loss=0.07884, over 4708416.51 frames. ], batch size: 44, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:19:30,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:35,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:19:35,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:19:37,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:19:39,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:19:40,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:19:42,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=261226.66666666666, ans=0.1 2023-09-29 05:19:43,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 05:19:43,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:19:45,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=261293.33333333334, ans=0.125 2023-09-29 05:19:46,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:19:49,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 05:19:49,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:19:49,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:19:51,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=261293.33333333334, ans=0.0 2023-09-29 05:19:53,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=261293.33333333334, ans=0.125 2023-09-29 05:19:54,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:19:56,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 05:19:57,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:19:59,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:01,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 05:20:01,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:20:03,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 05:20:03,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:06,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:08,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:20:08,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:08,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:10,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:12,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 05:20:15,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 05:20:15,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:20:15,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:20,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:21,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:20:21,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:23,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:20:26,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:20:26,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:26,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:20:27,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:20:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:29,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=261426.66666666666, ans=0.125 2023-09-29 05:20:31,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:20:33,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 05:20:36,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=261493.33333333334, ans=0.125 2023-09-29 05:20:39,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:20:39,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:43,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:20:48,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:48,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=261493.33333333334, ans=0.1 2023-09-29 05:20:48,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=261493.33333333334, ans=0.0 2023-09-29 05:20:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:49,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:51,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:20:52,725 INFO [train.py:1039] (3/4) Epoch 8, batch 2050, loss[loss=0.1861, simple_loss=0.2548, pruned_loss=0.05871, over 24340.00 frames. ], tot_loss[loss=0.2222, simple_loss=0.2868, pruned_loss=0.07885, over 4687653.05 frames. ], batch size: 56, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:20:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:20:54,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:20:54,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:20:57,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:20:57,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:05,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:21:07,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:21:08,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:21:08,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:21:09,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.90 vs. limit=15.0 2023-09-29 05:21:11,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 05:21:11,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:21:11,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:11,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:21:24,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:24,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:24,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=261693.33333333334, ans=0.0 2023-09-29 05:21:27,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 05:21:28,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:21:29,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 05:21:30,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:21:33,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:35,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:37,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:21:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:21:38,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:21:38,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:21:40,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:21:40,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=261760.0, ans=0.125 2023-09-29 05:21:43,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:21:46,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:21:48,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:21:50,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:21:50,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=261760.0, ans=0.1 2023-09-29 05:21:54,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:21:59,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:22:01,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.781e+02 2.119e+02 2.372e+02 3.018e+02 5.017e+02, threshold=4.745e+02, percent-clipped=2.0 2023-09-29 05:22:01,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 05:22:06,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:22:09,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:22:09,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 05:22:14,329 INFO [train.py:1039] (3/4) Epoch 8, batch 2100, loss[loss=0.2179, simple_loss=0.2752, pruned_loss=0.0803, over 23657.00 frames. ], tot_loss[loss=0.2204, simple_loss=0.2854, pruned_loss=0.07768, over 4703201.92 frames. ], batch size: 256, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:22:14,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 05:22:14,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:14,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:16,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:18,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:22:18,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 05:22:18,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 05:22:20,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:22:23,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:22:23,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:22:26,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:28,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:22:28,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 05:22:28,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=261893.33333333334, ans=0.05 2023-09-29 05:22:30,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:22:30,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 05:22:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 05:22:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:22:33,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:22:33,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 05:22:33,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 05:22:39,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 05:22:39,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:22:41,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:22:41,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:22:46,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:22:46,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=262026.66666666666, ans=0.0 2023-09-29 05:22:47,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 05:22:47,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:22:47,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 05:22:51,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 05:22:53,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:22:53,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 05:22:53,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 05:22:53,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=262026.66666666666, ans=0.125 2023-09-29 05:22:54,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 05:22:56,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:22:57,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:22:58,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=262026.66666666666, ans=0.0 2023-09-29 05:23:01,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:01,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=262026.66666666666, ans=0.125 2023-09-29 05:23:02,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:23:04,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:06,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:06,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 05:23:06,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:06,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:07,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:07,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 05:23:08,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=262093.33333333334, ans=0.0 2023-09-29 05:23:09,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 05:23:10,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 05:23:14,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:23:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:23:17,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 05:23:24,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:26,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=262160.0, ans=0.125 2023-09-29 05:23:29,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:23:29,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:23:29,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:23:29,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 05:23:29,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=262160.0, ans=0.0 2023-09-29 05:23:31,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:23:32,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:23:32,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:23:32,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:23:34,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:34,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=262160.0, ans=0.1 2023-09-29 05:23:35,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 05:23:37,096 INFO [train.py:1039] (3/4) Epoch 8, batch 2150, loss[loss=0.1924, simple_loss=0.2296, pruned_loss=0.07764, over 19079.00 frames. ], tot_loss[loss=0.2195, simple_loss=0.285, pruned_loss=0.077, over 4708368.63 frames. ], batch size: 388, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:23:37,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 05:23:37,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:40,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:23:40,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:23:40,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:23:40,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:23:47,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 05:23:50,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:23:51,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:54,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:23:54,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:23:54,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:23:59,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:23:59,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:23:59,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:24:02,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.92 vs. limit=22.5 2023-09-29 05:24:03,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:04,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 05:24:05,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=262293.3333333333, ans=0.125 2023-09-29 05:24:09,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:09,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:24:11,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:11,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:13,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:13,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:24:13,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:13,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:24:14,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:24:16,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 05:24:17,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:24:18,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=262360.0, ans=0.0 2023-09-29 05:24:19,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:19,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:20,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:24:21,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:24:21,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=262360.0, ans=0.035 2023-09-29 05:24:23,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=262360.0, ans=0.0 2023-09-29 05:24:24,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:24:24,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:24:26,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:24:26,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 05:24:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 05:24:31,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:32,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:34,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:24:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:24:35,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:37,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:37,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 05:24:40,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 05:24:40,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:24:41,437 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 05:24:41,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:41,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:24:42,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 05:24:43,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:24:43,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 05:24:43,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 05:24:43,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 05:24:44,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 05:24:44,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=262493.3333333333, ans=0.0 2023-09-29 05:24:45,903 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.644e+02 2.226e+02 2.643e+02 3.151e+02 6.561e+02, threshold=5.285e+02, percent-clipped=6.0 2023-09-29 05:24:46,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:46,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:24:46,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:24:47,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:47,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:24:47,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:24:47,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:24:57,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:24:58,892 INFO [train.py:1039] (3/4) Epoch 8, batch 2200, loss[loss=0.2022, simple_loss=0.2764, pruned_loss=0.06401, over 24462.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2846, pruned_loss=0.07694, over 4710270.00 frames. ], batch size: 66, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:24:59,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 05:25:04,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:25:04,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=262560.0, ans=0.015 2023-09-29 05:25:08,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:10,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:25:11,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:11,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:25:14,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:25:14,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=262626.6666666667, ans=0.125 2023-09-29 05:25:16,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:25:16,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 05:25:22,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 05:25:23,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:25:28,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 05:25:32,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:33,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:25:33,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:25:34,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=262693.3333333333, ans=0.125 2023-09-29 05:25:37,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:25:37,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 05:25:42,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:25:44,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:25:44,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 05:25:45,082 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:25:47,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:25:49,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:25:51,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:25:52,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:25:55,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 05:25:57,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:25:58,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 05:26:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:01,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:26:01,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:04,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:26:04,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:04,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:04,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:26:06,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:26:06,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:26:09,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:26:13,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 05:26:14,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:16,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:26:16,678 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 05:26:20,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:26:20,928 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 05:26:22,275 INFO [train.py:1039] (3/4) Epoch 8, batch 2250, loss[loss=0.2906, simple_loss=0.3332, pruned_loss=0.124, over 19533.00 frames. ], tot_loss[loss=0.2199, simple_loss=0.2854, pruned_loss=0.07716, over 4716684.17 frames. ], batch size: 388, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:26:22,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:26:23,985 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 05:26:25,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:25,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:26:27,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:26:28,713 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 05:26:30,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:26:32,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:38,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:26:38,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:26:40,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:42,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:42,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:26:45,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 05:26:45,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:26:47,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:26:48,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 05:26:51,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:26:51,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:26:52,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:26:57,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:26:58,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:26:58,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:27:00,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 05:27:01,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:27:03,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:27:06,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:08,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:27:08,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=263026.6666666667, ans=0.125 2023-09-29 05:27:09,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:09,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:27:11,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:27:13,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=263093.3333333333, ans=0.0 2023-09-29 05:27:14,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:27:18,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:27:21,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 05:27:27,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:27:27,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:27:28,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:27:31,596 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.954e+02 2.186e+02 2.448e+02 4.409e+02, threshold=4.373e+02, percent-clipped=0.0 2023-09-29 05:27:35,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=263160.0, ans=0.0 2023-09-29 05:27:36,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:27:39,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:27:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 05:27:39,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:39,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:27:42,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=263226.6666666667, ans=0.0 2023-09-29 05:27:43,928 INFO [train.py:1039] (3/4) Epoch 8, batch 2300, loss[loss=0.2264, simple_loss=0.2853, pruned_loss=0.08373, over 23318.00 frames. ], tot_loss[loss=0.2211, simple_loss=0.286, pruned_loss=0.0781, over 4709405.33 frames. ], batch size: 119, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:27:43,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 05:27:45,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:27:45,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:46,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=263226.6666666667, ans=0.07 2023-09-29 05:27:52,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:27:52,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:27:54,199 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 05:27:55,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:27:59,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=15.02 vs. limit=15.0 2023-09-29 05:28:03,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:28:03,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:28:04,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:04,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:04,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 05:28:06,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:28:07,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:07,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:28:11,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:28:14,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:28:17,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:22,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:28:22,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:28:25,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:28:27,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=263360.0, ans=0.2 2023-09-29 05:28:29,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:28:32,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:28:34,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:28:34,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:28:34,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 05:28:36,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=263426.6666666667, ans=0.125 2023-09-29 05:28:37,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:28:37,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:37,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:28:37,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:28:39,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:40,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:28:40,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:28:42,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 05:28:42,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:28:42,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:28:42,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 05:28:51,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:28:53,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=263493.3333333333, ans=0.125 2023-09-29 05:28:55,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:28:58,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:28:58,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:28:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:29:01,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:29:01,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:03,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:29:03,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 05:29:07,062 INFO [train.py:1039] (3/4) Epoch 8, batch 2350, loss[loss=0.2263, simple_loss=0.2956, pruned_loss=0.07853, over 23394.00 frames. ], tot_loss[loss=0.2225, simple_loss=0.2872, pruned_loss=0.07894, over 4682835.21 frames. ], batch size: 93, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:29:07,946 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.44 vs. limit=10.0 2023-09-29 05:29:10,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:29:10,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 05:29:16,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 05:29:18,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:29:22,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:29:22,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:23,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:25,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 05:29:29,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:29:35,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 05:29:37,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:29:39,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:29:39,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:29:43,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:29:46,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 05:29:47,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:29:49,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:29:50,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:29:50,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:29:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:29:58,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 05:29:58,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:30:01,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:30:01,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:30:03,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 05:30:03,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:30:03,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=263760.0, ans=0.125 2023-09-29 05:30:06,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 05:30:08,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:30:08,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.84 vs. limit=22.5 2023-09-29 05:30:11,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 05:30:12,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.90 vs. limit=12.0 2023-09-29 05:30:12,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.65 vs. limit=15.0 2023-09-29 05:30:14,764 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 2.176e+02 2.529e+02 2.945e+02 4.428e+02, threshold=5.058e+02, percent-clipped=1.0 2023-09-29 05:30:17,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 05:30:17,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:30:17,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 05:30:17,175 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 05:30:19,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 05:30:20,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 05:30:24,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:30:28,201 INFO [train.py:1039] (3/4) Epoch 8, batch 2400, loss[loss=0.2145, simple_loss=0.2884, pruned_loss=0.07028, over 24502.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2861, pruned_loss=0.07793, over 4697521.70 frames. ], batch size: 66, lr: 1.30e-02, grad_scale: 16.0 2023-09-29 05:30:28,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:30:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:30:34,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:30:35,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 05:30:35,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 05:30:39,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=263893.3333333333, ans=0.07 2023-09-29 05:30:42,041 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-09-29 05:30:43,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:30:43,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:30:46,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 05:30:46,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:30:47,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:49,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 05:30:54,294 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.41 vs. limit=6.0 2023-09-29 05:30:56,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:30:57,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.91 vs. limit=10.0 2023-09-29 05:30:58,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 05:31:02,249 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.02 vs. limit=22.5 2023-09-29 05:31:03,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:31:07,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 05:31:09,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:10,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:16,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:18,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 05:31:18,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:31:24,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:31:30,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=264093.3333333333, ans=0.2 2023-09-29 05:31:31,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:31,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=264093.3333333333, ans=0.04949747468305833 2023-09-29 05:31:33,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:31:33,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 05:31:33,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:31:33,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:33,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:31:33,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:31:37,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:31:37,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:31:38,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=264160.0, ans=0.125 2023-09-29 05:31:39,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 05:31:39,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 05:31:40,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:31:40,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:31:41,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 05:31:41,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 05:31:41,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 05:31:41,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 05:31:44,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 05:31:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:31:45,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:45,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:47,854 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 05:31:49,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:31:49,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:31:51,827 INFO [train.py:1039] (3/4) Epoch 8, batch 2450, loss[loss=0.2042, simple_loss=0.2683, pruned_loss=0.07004, over 23325.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.2853, pruned_loss=0.07696, over 4707502.59 frames. ], batch size: 119, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:31:55,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:31:55,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:31:59,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:31:59,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:01,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 05:32:06,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:09,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:32:09,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=264293.3333333333, ans=0.09899494936611666 2023-09-29 05:32:10,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:32:10,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:32:10,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 05:32:13,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:16,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:32:17,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:32:21,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:32:23,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:25,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:32:27,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 05:32:27,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:32:35,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:37,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:32:37,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:38,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:32:39,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:32:39,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:32:40,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 05:32:43,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:32:43,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:32:48,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:32:48,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:32:53,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:32:53,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 05:32:54,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:32:55,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:32:55,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 05:32:57,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:32:59,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:33:01,899 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.058e+02 2.397e+02 2.730e+02 4.175e+02, threshold=4.793e+02, percent-clipped=0.0 2023-09-29 05:33:05,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:33:08,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:08,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:33:08,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=264493.3333333333, ans=0.0 2023-09-29 05:33:11,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 05:33:11,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:33:13,061 INFO [train.py:1039] (3/4) Epoch 8, batch 2500, loss[loss=0.2162, simple_loss=0.2781, pruned_loss=0.07719, over 23225.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2845, pruned_loss=0.07617, over 4714488.22 frames. ], batch size: 105, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:33:19,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:28,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:33:28,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:33:31,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:33:31,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 05:33:37,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:33:39,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:33:40,143 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.50 vs. limit=22.5 2023-09-29 05:33:40,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.91 vs. limit=22.5 2023-09-29 05:33:40,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:33:40,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:33:42,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 05:33:43,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:43,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:45,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 05:33:45,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:33:45,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 05:33:45,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:33:47,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=264693.3333333333, ans=10.0 2023-09-29 05:33:50,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:33:51,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:33:54,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:33:54,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 05:33:54,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:33:57,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:00,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=264760.0, ans=0.125 2023-09-29 05:34:02,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:05,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:11,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:16,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:34:19,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=264826.6666666667, ans=0.125 2023-09-29 05:34:21,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 05:34:21,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:21,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:34:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:34:22,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:34:24,198 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 05:34:24,199 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 05:34:24,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 05:34:27,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:34:30,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 05:34:30,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 05:34:31,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:34:33,171 INFO [train.py:1039] (3/4) Epoch 8, batch 2550, loss[loss=0.2398, simple_loss=0.3041, pruned_loss=0.08778, over 23751.00 frames. ], tot_loss[loss=0.219, simple_loss=0.2851, pruned_loss=0.07639, over 4726354.28 frames. ], batch size: 164, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:34:33,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 05:34:33,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=264893.3333333333, ans=0.125 2023-09-29 05:34:36,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 05:34:39,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:41,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:34:42,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:34:44,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:34:47,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 05:34:47,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:34:51,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 05:34:52,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.99 vs. limit=15.0 2023-09-29 05:34:53,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:34:54,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:34:57,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:34:57,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 05:34:57,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:34:57,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:34:59,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:02,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:35:02,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 05:35:02,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 05:35:02,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 05:35:06,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=265026.6666666667, ans=0.0 2023-09-29 05:35:11,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=265026.6666666667, ans=0.1 2023-09-29 05:35:16,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:35:20,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:20,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:20,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:35:22,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:35:29,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:35:32,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:35:32,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:35:32,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:35:33,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:35:33,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:35:35,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:35:36,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:40,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:35:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 05:35:40,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:35:41,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:35:42,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=265160.0, ans=0.125 2023-09-29 05:35:43,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:35:45,165 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.988e+02 2.217e+02 2.595e+02 3.453e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 05:35:45,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:35:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:35:54,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:35:55,491 INFO [train.py:1039] (3/4) Epoch 8, batch 2600, loss[loss=0.2504, simple_loss=0.3035, pruned_loss=0.09869, over 22778.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2861, pruned_loss=0.07704, over 4719084.61 frames. ], batch size: 322, lr: 1.30e-02, grad_scale: 8.0 2023-09-29 05:35:57,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:00,684 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 05:36:02,322 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 05:36:02,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:36:03,782 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 05:36:03,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 05:36:03,955 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 05:36:06,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:36:07,024 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 05:36:08,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 05:36:09,968 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 05:36:12,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:36:13,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=265293.3333333333, ans=0.0 2023-09-29 05:36:14,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 05:36:16,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 05:36:17,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:36:17,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 05:36:21,535 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 05:36:21,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 05:36:23,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=265293.3333333333, ans=0.125 2023-09-29 05:36:31,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:31,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:31,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:31,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 05:36:33,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:36:38,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=265360.0, ans=0.125 2023-09-29 05:36:39,644 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 05:36:44,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:36:44,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:36:45,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 05:36:45,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:36:45,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:36:47,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 05:36:50,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:36:50,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:36:54,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:55,989 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 05:36:56,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:36:56,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:37:03,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:37:04,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:37:04,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 05:37:04,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=265493.3333333333, ans=0.0 2023-09-29 05:37:06,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:37:08,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:09,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:11,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=265493.3333333333, ans=0.125 2023-09-29 05:37:15,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 05:37:16,980 INFO [train.py:1039] (3/4) Epoch 8, batch 2650, loss[loss=0.3247, simple_loss=0.3561, pruned_loss=0.1466, over 19276.00 frames. ], tot_loss[loss=0.2212, simple_loss=0.287, pruned_loss=0.07774, over 4706029.16 frames. ], batch size: 388, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:37:17,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:18,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:37:22,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 05:37:23,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:25,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:37:26,503 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 05:37:26,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:37:28,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:37:31,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:37:33,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:37:35,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:37:36,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 05:37:36,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:37:36,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:37:41,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 05:37:41,734 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 05:37:45,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:37:48,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 05:37:48,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:37:48,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 05:37:49,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.00 vs. limit=15.0 2023-09-29 05:37:51,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:53,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:37:53,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:37:54,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:37:57,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 05:37:57,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 05:38:01,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:05,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 05:38:05,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:06,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:06,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:06,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:08,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:10,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:38:10,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:11,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:38:13,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:38:14,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:38:17,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:18,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=265760.0, ans=0.125 2023-09-29 05:38:19,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:38:19,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:19,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=265760.0, ans=0.125 2023-09-29 05:38:21,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:38:21,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:38:26,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:28,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:38:28,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:29,351 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.037e+02 2.259e+02 2.622e+02 3.986e+02, threshold=4.518e+02, percent-clipped=0.0 2023-09-29 05:38:29,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 05:38:33,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:38:34,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:35,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:36,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:37,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 05:38:39,337 INFO [train.py:1039] (3/4) Epoch 8, batch 2700, loss[loss=0.2209, simple_loss=0.2851, pruned_loss=0.07835, over 23352.00 frames. ], tot_loss[loss=0.2226, simple_loss=0.2886, pruned_loss=0.07827, over 4698205.11 frames. ], batch size: 105, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:38:39,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:41,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:38:41,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 05:38:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:38:46,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 05:38:48,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:38:48,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:48,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:38:50,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:38:50,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:38:51,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:38:51,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 05:38:51,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 05:38:51,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:38:54,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:38:54,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:38:56,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:38:58,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=265960.0, ans=0.0 2023-09-29 05:38:59,155 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.84 vs. limit=15.0 2023-09-29 05:39:01,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:39:01,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 05:39:03,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:09,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:39:09,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:15,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:39:15,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:39:15,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:39:15,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:39:18,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=266026.6666666667, ans=0.0 2023-09-29 05:39:20,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:23,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:23,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:39:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:39:29,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:29,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:39:36,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:39:38,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:39:41,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:39:41,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:44,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:45,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=266160.0, ans=0.125 2023-09-29 05:39:46,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:39:46,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:39:48,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:39:48,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:39:50,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:39:52,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:39:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:53,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:39:56,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 05:39:56,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:39:59,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:39:59,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 05:40:01,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 05:40:02,946 INFO [train.py:1039] (3/4) Epoch 8, batch 2750, loss[loss=0.2032, simple_loss=0.2588, pruned_loss=0.07379, over 23443.00 frames. ], tot_loss[loss=0.2224, simple_loss=0.2879, pruned_loss=0.07845, over 4689660.29 frames. ], batch size: 285, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:40:03,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:04,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:04,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:08,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:08,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:40:08,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:12,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:13,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:40:13,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:40:13,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:13,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 05:40:13,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:40:13,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:40:22,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 05:40:24,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:40:25,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:25,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:40:25,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:40:26,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=266293.3333333333, ans=0.125 2023-09-29 05:40:26,551 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.34 vs. limit=15.0 2023-09-29 05:40:27,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:40:28,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:40:29,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:30,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:32,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=266293.3333333333, ans=0.1 2023-09-29 05:40:33,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 05:40:35,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:40:36,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:40:36,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:38,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:40:45,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:40:48,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:40:48,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=266360.0, ans=0.07 2023-09-29 05:40:49,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:40:51,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:40:51,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:40:51,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=266426.6666666667, ans=0.09899494936611666 2023-09-29 05:40:52,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=266426.6666666667, ans=0.125 2023-09-29 05:40:52,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.90 vs. limit=22.5 2023-09-29 05:40:53,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:40:59,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:41:00,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:41:00,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 05:41:04,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:06,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 05:41:11,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:41:13,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:41:14,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 05:41:14,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:41:15,922 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 2.077e+02 2.428e+02 2.871e+02 4.392e+02, threshold=4.857e+02, percent-clipped=0.0 2023-09-29 05:41:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:41:17,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 05:41:17,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:41:21,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:41:21,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:21,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:41:23,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 05:41:23,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:23,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:25,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:41:25,226 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 05:41:25,227 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 05:41:26,547 INFO [train.py:1039] (3/4) Epoch 8, batch 2800, loss[loss=0.2282, simple_loss=0.2851, pruned_loss=0.08565, over 23883.00 frames. ], tot_loss[loss=0.2211, simple_loss=0.2866, pruned_loss=0.07775, over 4697012.70 frames. ], batch size: 212, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:41:30,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=266560.0, ans=0.125 2023-09-29 05:41:31,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:41:33,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:41:33,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:41:37,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:41:39,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 05:41:39,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=266560.0, ans=0.1 2023-09-29 05:41:42,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:41:43,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 05:41:45,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:45,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=266626.6666666667, ans=0.125 2023-09-29 05:41:46,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:41:46,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:41:50,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:41:50,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:41:50,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:41:55,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:01,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=266693.3333333333, ans=0.125 2023-09-29 05:42:04,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:42:07,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:09,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:11,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:42:11,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:17,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:17,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 05:42:17,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:18,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=266760.0, ans=0.2 2023-09-29 05:42:19,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:19,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:42:23,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:25,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:27,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:42:30,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:42:30,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:42:30,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:42:32,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 05:42:33,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:42:34,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:42:34,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 05:42:34,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:36,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:42:36,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:42:37,039 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 05:42:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 05:42:38,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:38,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:42:39,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:42:41,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 05:42:48,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:42:48,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:42:48,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=266826.6666666667, ans=0.1 2023-09-29 05:42:49,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:42:51,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:42:51,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=266893.3333333333, ans=0.0 2023-09-29 05:42:52,724 INFO [train.py:1039] (3/4) Epoch 8, batch 2850, loss[loss=0.2381, simple_loss=0.295, pruned_loss=0.09062, over 23405.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.2854, pruned_loss=0.07692, over 4692049.50 frames. ], batch size: 120, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:42:57,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:42:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:42:57,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:42:58,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:42:59,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=266893.3333333333, ans=0.125 2023-09-29 05:43:00,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:43:02,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:43:04,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 05:43:11,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 05:43:11,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:13,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 05:43:13,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:16,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 05:43:16,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 05:43:20,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:26,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=267026.6666666667, ans=0.125 2023-09-29 05:43:32,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:33,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:34,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=267026.6666666667, ans=0.125 2023-09-29 05:43:35,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:43:35,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 05:43:35,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=267026.6666666667, ans=0.125 2023-09-29 05:43:36,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 05:43:37,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:43:40,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:43:40,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 05:43:42,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:43:42,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:43:44,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:43:44,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:44,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=267093.3333333333, ans=0.125 2023-09-29 05:43:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:46,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:43:48,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:49,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:43:51,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=267093.3333333333, ans=0.125 2023-09-29 05:43:52,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:43:52,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:43:54,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:43:57,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:44:02,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:44:02,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=267160.0, ans=0.125 2023-09-29 05:44:04,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 05:44:04,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 05:44:05,437 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 2.097e+02 2.299e+02 2.689e+02 7.485e+02, threshold=4.599e+02, percent-clipped=2.0 2023-09-29 05:44:07,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 05:44:07,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:07,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 05:44:07,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=267160.0, ans=0.125 2023-09-29 05:44:08,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:44:10,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:10,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:10,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:44:10,232 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 05:44:11,628 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 05:44:11,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:13,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:15,345 INFO [train.py:1039] (3/4) Epoch 8, batch 2900, loss[loss=0.1962, simple_loss=0.28, pruned_loss=0.0562, over 24635.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2858, pruned_loss=0.07681, over 4703852.50 frames. ], batch size: 73, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:44:16,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:17,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:44:17,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:44:19,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 05:44:22,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=267226.6666666667, ans=0.125 2023-09-29 05:44:23,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:23,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 05:44:24,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 05:44:25,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 05:44:25,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:44:27,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:27,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=267226.6666666667, ans=0.0 2023-09-29 05:44:29,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:44:30,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.24 vs. limit=15.0 2023-09-29 05:44:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 05:44:33,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:44:37,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=12.0 2023-09-29 05:44:37,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:44:38,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 05:44:39,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:44:41,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:42,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 05:44:43,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=267293.3333333333, ans=10.0 2023-09-29 05:44:44,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 05:44:45,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:44:45,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 05:44:45,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:44:50,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:44:50,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 05:44:52,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:44:56,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:44:59,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:45:03,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:04,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 05:45:04,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 05:45:04,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:45:09,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:45:11,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=267426.6666666667, ans=0.125 2023-09-29 05:45:12,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 05:45:14,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:45:14,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=267426.6666666667, ans=0.2 2023-09-29 05:45:19,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:45:28,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:45:28,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 05:45:30,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 05:45:30,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=267493.3333333333, ans=0.125 2023-09-29 05:45:33,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:33,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 05:45:34,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:34,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:45:35,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=267560.0, ans=0.1 2023-09-29 05:45:36,177 INFO [train.py:1039] (3/4) Epoch 8, batch 2950, loss[loss=0.2118, simple_loss=0.2859, pruned_loss=0.06892, over 24654.00 frames. ], tot_loss[loss=0.2199, simple_loss=0.2864, pruned_loss=0.07667, over 4706349.74 frames. ], batch size: 65, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:45:41,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:45:43,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 05:45:43,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:44,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:45:46,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:45:46,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=267560.0, ans=0.0 2023-09-29 05:45:48,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:45:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 05:45:51,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 05:45:51,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:45:51,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:45:56,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:45:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:01,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:01,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:04,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:05,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:46:06,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:46:08,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:46:09,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 05:46:16,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 05:46:18,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 05:46:19,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:46:21,173 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 05:46:22,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 05:46:22,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:46:24,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:46:24,243 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 05:46:24,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:46:27,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 05:46:28,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:46:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 05:46:33,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:35,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:46:35,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:35,127 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 05:46:36,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:46:36,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 05:46:43,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:45,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:46:45,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 05:46:45,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:46:47,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 05:46:50,560 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.933e+02 2.152e+02 2.474e+02 4.181e+02, threshold=4.303e+02, percent-clipped=1.0 2023-09-29 05:46:50,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:52,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:46:52,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:46:52,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.30 vs. limit=10.0 2023-09-29 05:46:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:46:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:46:55,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:46:57,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:46:57,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:46:57,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:46:57,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:46:58,574 INFO [train.py:1039] (3/4) Epoch 8, batch 3000, loss[loss=0.2441, simple_loss=0.2978, pruned_loss=0.09521, over 23782.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2868, pruned_loss=0.07668, over 4715472.57 frames. ], batch size: 212, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:46:58,575 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 05:47:08,094 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.4.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([2.0811, 3.2626, 4.6447, 3.7620], device='cuda:3') 2023-09-29 05:47:12,763 INFO [train.py:1071] (3/4) Epoch 8, validation: loss=0.3012, simple_loss=0.2865, pruned_loss=0.1579, over 1125622.00 frames. 2023-09-29 05:47:12,764 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 05:47:14,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:47:15,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:15,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 05:47:16,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:47:18,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:47:18,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=267893.3333333333, ans=0.0 2023-09-29 05:47:20,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:47:23,418 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 05:47:23,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 05:47:24,438 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-09-29 05:47:25,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:47:26,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:47:26,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 05:47:26,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:32,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:47:45,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:47:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 05:47:52,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:47:56,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:47:56,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=268026.6666666667, ans=0.0 2023-09-29 05:47:57,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:47:57,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:47:59,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:47:59,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 05:48:01,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 05:48:04,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:48:04,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:48:05,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:48:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:07,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:07,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:48:11,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 05:48:12,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:48:12,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:48:13,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:48:16,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 05:48:16,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 05:48:16,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:19,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:48:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:22,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 05:48:25,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 05:48:25,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:48:26,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 05:48:26,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:48:31,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 05:48:34,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:48:34,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:48:34,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 05:48:35,790 INFO [train.py:1039] (3/4) Epoch 8, batch 3050, loss[loss=0.1921, simple_loss=0.263, pruned_loss=0.06058, over 20064.00 frames. ], tot_loss[loss=0.221, simple_loss=0.2879, pruned_loss=0.07702, over 4722754.80 frames. ], batch size: 43, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:48:37,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 05:48:37,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 05:48:37,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:48:38,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:48:38,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 05:48:38,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:39,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:48:42,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 05:48:44,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:48:45,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=268226.6666666667, ans=0.0 2023-09-29 05:48:47,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:48:47,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:48:51,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:48:51,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=268293.3333333333, ans=0.1 2023-09-29 05:48:55,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 05:49:03,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 05:49:03,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 05:49:03,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:06,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 05:49:10,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:10,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:10,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:11,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:13,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:49:13,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:13,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=268360.0, ans=0.125 2023-09-29 05:49:14,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:49:14,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:14,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:17,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:20,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:20,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 05:49:22,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:49:22,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 05:49:27,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:49:27,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 05:49:27,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:49:28,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.40 vs. limit=15.0 2023-09-29 05:49:29,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:34,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:49:36,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:42,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:42,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:49:42,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:49:44,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:44,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 05:49:44,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 05:49:45,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 05:49:47,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:49:48,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:49:50,196 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.057e+02 2.311e+02 2.662e+02 5.288e+02, threshold=4.621e+02, percent-clipped=1.0 2023-09-29 05:49:50,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 05:49:51,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:55,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=15.0 2023-09-29 05:49:56,334 INFO [train.py:1039] (3/4) Epoch 8, batch 3100, loss[loss=0.2244, simple_loss=0.2815, pruned_loss=0.08369, over 23555.00 frames. ], tot_loss[loss=0.2211, simple_loss=0.2878, pruned_loss=0.0772, over 4723191.70 frames. ], batch size: 149, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:49:56,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=268560.0, ans=0.0 2023-09-29 05:49:58,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:49:58,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 05:50:00,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 05:50:01,394 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.16 vs. limit=15.0 2023-09-29 05:50:02,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 05:50:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 05:50:06,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 05:50:08,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 05:50:13,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:50:13,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:15,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:50:18,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:24,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 05:50:30,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:50:30,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:31,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:31,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:50:33,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:50:35,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:50:35,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 05:50:35,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:50:37,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:39,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 05:50:41,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:50:44,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:50:44,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 05:50:46,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 05:50:47,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:47,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:50:49,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:50:49,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:49,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:50:51,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:50:51,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:50:54,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:50:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:50:54,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:50:54,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 05:50:57,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:50:59,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 05:51:00,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:51:02,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 05:51:02,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:02,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:03,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 05:51:16,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 05:51:19,611 INFO [train.py:1039] (3/4) Epoch 8, batch 3150, loss[loss=0.2231, simple_loss=0.2769, pruned_loss=0.08469, over 23768.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.2866, pruned_loss=0.07683, over 4727479.04 frames. ], batch size: 179, lr: 1.29e-02, grad_scale: 8.0 2023-09-29 05:51:21,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:21,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:51:24,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:51:24,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:51:25,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 05:51:27,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:27,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 05:51:28,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 05:51:30,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:32,113 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 05:51:34,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=268960.0, ans=0.2 2023-09-29 05:51:35,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 05:51:35,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:51:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 05:51:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 05:51:38,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 05:51:38,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=268960.0, ans=0.0 2023-09-29 05:51:40,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 05:51:40,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 05:51:40,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:40,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:51:41,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:51:42,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=268960.0, ans=0.1 2023-09-29 05:51:42,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=268960.0, ans=0.2 2023-09-29 05:51:44,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 05:51:45,932 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.20 vs. limit=12.0 2023-09-29 05:51:47,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:51:47,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:51,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 05:51:54,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 05:51:54,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:51:55,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 05:51:57,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:51:57,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 05:51:59,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 05:52:00,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:52:02,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 05:52:02,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 05:52:02,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:02,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 05:52:03,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 05:52:03,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 05:52:05,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 05:52:05,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 05:52:05,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:07,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:52:07,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:52:09,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 05:52:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:12,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 05:52:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:13,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 05:52:15,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 05:52:15,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:52:17,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:17,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 05:52:19,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 05:52:21,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:52:23,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:52:25,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:26,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:52:30,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=269160.0, ans=10.0 2023-09-29 05:52:31,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:52:31,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:33,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=269160.0, ans=0.125 2023-09-29 05:52:34,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 05:52:35,510 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.692e+02 2.026e+02 2.271e+02 2.794e+02 4.211e+02, threshold=4.543e+02, percent-clipped=0.0 2023-09-29 05:52:38,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:52:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 05:52:42,196 INFO [train.py:1039] (3/4) Epoch 8, batch 3200, loss[loss=0.1841, simple_loss=0.2586, pruned_loss=0.0548, over 24622.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2845, pruned_loss=0.07603, over 4718547.69 frames. ], batch size: 60, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:52:43,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:52:46,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:52:46,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 05:52:50,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:52:57,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:53:02,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:53:04,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=269293.3333333333, ans=0.125 2023-09-29 05:53:11,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:53:22,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 05:53:25,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:53:28,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 05:53:29,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 05:53:31,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:53:31,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:53:32,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:53:38,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 05:53:39,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 05:53:43,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 05:53:46,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 05:53:47,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:53:51,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=16.19 vs. limit=15.0 2023-09-29 05:53:52,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:53,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 05:53:53,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:53:54,003 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 05:53:54,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 05:53:59,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 05:54:00,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 05:54:02,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 05:54:03,747 INFO [train.py:1039] (3/4) Epoch 8, batch 3250, loss[loss=0.1985, simple_loss=0.275, pruned_loss=0.06094, over 24456.00 frames. ], tot_loss[loss=0.2185, simple_loss=0.2849, pruned_loss=0.07609, over 4725243.88 frames. ], batch size: 63, lr: 1.29e-02, grad_scale: 16.0 2023-09-29 05:54:03,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 05:54:05,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 05:54:09,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:54:11,080 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 05:54:11,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:11,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:12,749 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 05:54:17,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 05:54:19,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:20,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=269626.6666666667, ans=0.09899494936611666 2023-09-29 05:54:22,900 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 05:54:26,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:54:26,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 05:54:28,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:28,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:54:28,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:29,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:30,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 05:54:34,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 05:54:34,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:34,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:34,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:35,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:54:38,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:54:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 05:54:42,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:42,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:54:44,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:54:45,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:54:45,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:54:49,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 05:54:49,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:54:49,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 05:54:52,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:54:52,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 05:55:00,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:55:08,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:09,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:09,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 05:55:09,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:55:09,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 05:55:09,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:13,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 05:55:13,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 05:55:13,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:55:14,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:16,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 05:55:16,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.min_positive, batch_count=269826.6666666667, ans=0.025 2023-09-29 05:55:18,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:55:19,807 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.998e+02 2.232e+02 2.545e+02 3.931e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 05:55:21,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:23,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:23,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=269826.6666666667, ans=0.125 2023-09-29 05:55:24,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 05:55:24,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:25,839 INFO [train.py:1039] (3/4) Epoch 8, batch 3300, loss[loss=0.2303, simple_loss=0.2865, pruned_loss=0.08706, over 23588.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2853, pruned_loss=0.07621, over 4726844.29 frames. ], batch size: 256, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:55:27,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 05:55:27,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 05:55:30,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:55:30,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 05:55:32,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 05:55:33,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 05:55:33,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:55:34,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=269893.3333333333, ans=0.125 2023-09-29 05:55:38,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:55:40,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:55:40,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:43,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 05:55:43,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 05:55:45,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:46,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:55:50,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 05:55:50,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:55:51,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:55:51,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:55:53,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 05:55:55,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:55:55,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:55:57,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 05:55:57,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:55:57,145 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 05:55:57,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=270026.6666666667, ans=0.125 2023-09-29 05:56:00,895 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-29 05:56:01,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:01,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 05:56:04,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:04,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 05:56:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 05:56:06,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:07,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 05:56:09,576 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 05:56:09,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=270026.6666666667, ans=0.125 2023-09-29 05:56:11,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 05:56:11,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:15,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 05:56:18,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:19,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 05:56:21,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:23,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:23,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:23,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:56:23,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:56:26,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 05:56:26,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:26,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:56:28,421 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 05:56:29,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 05:56:31,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 05:56:32,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:56:32,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:34,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:56:34,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:36,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:56:37,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:37,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 05:56:37,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:56:39,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 05:56:43,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 05:56:44,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:44,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:56:46,922 INFO [train.py:1039] (3/4) Epoch 8, batch 3350, loss[loss=0.203, simple_loss=0.2746, pruned_loss=0.0657, over 23763.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2854, pruned_loss=0.07583, over 4737182.97 frames. ], batch size: 85, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:56:47,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 05:56:47,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:56:49,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:56:52,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 05:56:52,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:52,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=270226.6666666667, ans=0.0 2023-09-29 05:56:55,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:56:57,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:56:58,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 05:56:59,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:02,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:57:04,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:04,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 05:57:05,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 05:57:08,667 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 05:57:08,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:57:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 05:57:11,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 05:57:14,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 05:57:14,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 05:57:16,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 05:57:16,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:16,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 05:57:16,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=270293.3333333333, ans=0.0 2023-09-29 05:57:18,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:22,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:22,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:23,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:57:25,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:28,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:28,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:34,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 05:57:36,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:57:37,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:37,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:39,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:40,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=270426.6666666667, ans=0.0 2023-09-29 05:57:40,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=270426.6666666667, ans=15.0 2023-09-29 05:57:43,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 05:57:43,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 05:57:43,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 05:57:43,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 05:57:46,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 05:57:46,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:57:46,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:57:51,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.28 vs. limit=15.0 2023-09-29 05:57:53,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.28 vs. limit=15.0 2023-09-29 05:57:53,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.36 vs. limit=15.0 2023-09-29 05:57:55,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:57:56,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 05:57:56,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:57:58,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 05:57:59,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:58:02,776 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 2.038e+02 2.381e+02 2.842e+02 4.419e+02, threshold=4.763e+02, percent-clipped=0.0 2023-09-29 05:58:04,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:04,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=270493.3333333333, ans=0.125 2023-09-29 05:58:06,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 05:58:06,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 05:58:08,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 05:58:09,757 INFO [train.py:1039] (3/4) Epoch 8, batch 3400, loss[loss=0.2304, simple_loss=0.2874, pruned_loss=0.08675, over 23566.00 frames. ], tot_loss[loss=0.2205, simple_loss=0.287, pruned_loss=0.07705, over 4721489.69 frames. ], batch size: 256, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:58:09,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:09,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 05:58:11,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:58:12,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 05:58:13,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:58:14,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 05:58:16,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 05:58:16,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 05:58:21,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 05:58:22,528 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 05:58:22,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:58:25,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=270626.6666666667, ans=0.125 2023-09-29 05:58:27,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 05:58:27,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 05:58:29,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:29,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 05:58:37,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:58:39,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 05:58:44,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 05:58:46,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:58:46,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:58:48,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 05:58:54,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 05:58:56,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 05:59:03,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 05:59:05,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 05:59:07,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:07,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:09,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 05:59:09,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 05:59:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 05:59:14,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 05:59:14,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 05:59:20,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:21,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 05:59:26,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 05:59:31,894 INFO [train.py:1039] (3/4) Epoch 8, batch 3450, loss[loss=0.2067, simple_loss=0.268, pruned_loss=0.07272, over 22007.00 frames. ], tot_loss[loss=0.2196, simple_loss=0.2862, pruned_loss=0.07653, over 4723512.95 frames. ], batch size: 48, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 05:59:32,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 05:59:38,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 05:59:38,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 05:59:40,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 05:59:40,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 05:59:42,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 05:59:45,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 05:59:50,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 05:59:52,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 05:59:53,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 05:59:53,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 05:59:55,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:02,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 06:00:06,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 06:00:06,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:00:08,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:00:08,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:15,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 06:00:15,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:00:19,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=271026.6666666667, ans=0.1 2023-09-29 06:00:20,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:20,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:00:23,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:00:25,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:00:25,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=271093.3333333333, ans=0.1 2023-09-29 06:00:26,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 06:00:26,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:28,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:00:30,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=271093.3333333333, ans=0.0 2023-09-29 06:00:31,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:00:33,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 06:00:36,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:00:41,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:00:43,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:46,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:47,988 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.293e+02 2.662e+02 4.290e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 06:00:51,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:00:51,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:00:52,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:00:52,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:00:54,251 INFO [train.py:1039] (3/4) Epoch 8, batch 3500, loss[loss=0.2439, simple_loss=0.3014, pruned_loss=0.09314, over 23603.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2848, pruned_loss=0.07647, over 4709272.58 frames. ], batch size: 149, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:00:56,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:00:59,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:00:59,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=271226.6666666667, ans=0.125 2023-09-29 06:01:01,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 06:01:03,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:01:06,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:01:09,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:01:09,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 06:01:14,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:01:15,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:01:16,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:01:17,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:17,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:01:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:19,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:19,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 06:01:22,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:22,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:01:24,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:28,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:28,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 06:01:29,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:01:32,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:01:34,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:01:34,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=271360.0, ans=0.125 2023-09-29 06:01:36,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:39,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:01:39,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:39,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.11 vs. limit=12.0 2023-09-29 06:01:40,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 06:01:42,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 06:01:42,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=271426.6666666667, ans=0.125 2023-09-29 06:01:43,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 06:01:43,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:01:45,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=271426.6666666667, ans=0.0 2023-09-29 06:01:46,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:01:46,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:01:46,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:01:47,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=271426.6666666667, ans=0.2 2023-09-29 06:01:49,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:01:52,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:01:57,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:01:57,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=271426.6666666667, ans=0.125 2023-09-29 06:01:58,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 06:01:58,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 06:01:58,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:02,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.24 vs. limit=15.0 2023-09-29 06:02:03,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:05,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:06,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:08,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 06:02:08,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:02:10,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:02:11,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 06:02:13,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 06:02:16,409 INFO [train.py:1039] (3/4) Epoch 8, batch 3550, loss[loss=0.1857, simple_loss=0.25, pruned_loss=0.0607, over 24371.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.284, pruned_loss=0.07588, over 4713878.13 frames. ], batch size: 56, lr: 1.28e-02, grad_scale: 16.0 2023-09-29 06:02:16,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:02:18,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:02:18,256 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=271560.0, ans=0.125 2023-09-29 06:02:19,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:19,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:22,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:02:31,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:31,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=271626.6666666667, ans=0.125 2023-09-29 06:02:32,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:02:36,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:02:37,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:02:39,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:40,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:02:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:02:44,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:45,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:02:45,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:47,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:02:47,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:02:52,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:02:52,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:02:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:02:54,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:02:55,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:02:55,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 06:02:55,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:57,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:02:58,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 06:03:05,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:07,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:03:08,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:08,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 06:03:10,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:03:12,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 06:03:12,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:03:15,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:03:15,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:03:19,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 06:03:20,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:24,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:03:25,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 06:03:25,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:30,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:03:31,727 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 2.035e+02 2.264e+02 2.526e+02 3.446e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 06:03:31,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 06:03:39,312 INFO [train.py:1039] (3/4) Epoch 8, batch 3600, loss[loss=0.2335, simple_loss=0.2927, pruned_loss=0.0872, over 23677.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2837, pruned_loss=0.07576, over 4723949.15 frames. ], batch size: 232, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:03:40,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 06:03:41,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:03:41,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:03:41,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=271893.3333333333, ans=0.2 2023-09-29 06:03:44,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:44,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:03:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:03:48,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.91 vs. limit=22.5 2023-09-29 06:03:50,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:03:52,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:54,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:03:54,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:03:55,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:03:55,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 06:04:00,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:04:02,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:04,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:06,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:08,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:04:08,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:04:08,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 06:04:09,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:04:11,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:04:12,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=272026.6666666667, ans=0.0 2023-09-29 06:04:13,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:04:16,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:17,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:04:17,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=272026.6666666667, ans=0.125 2023-09-29 06:04:19,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:20,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 06:04:27,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:04:29,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 06:04:34,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:04:41,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:42,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:04:47,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:04:48,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:04:49,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 06:04:49,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 06:04:52,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 06:04:54,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:04:54,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:04:55,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 06:04:57,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:04:57,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:04:58,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:04:59,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 06:05:00,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 06:05:02,025 INFO [train.py:1039] (3/4) Epoch 8, batch 3650, loss[loss=0.2109, simple_loss=0.2892, pruned_loss=0.06632, over 24649.00 frames. ], tot_loss[loss=0.2186, simple_loss=0.2844, pruned_loss=0.07642, over 4707598.79 frames. ], batch size: 68, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:05:03,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:05:03,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 06:05:07,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 06:05:09,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:05:12,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 06:05:14,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 06:05:19,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:19,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:05:19,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:05:24,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 06:05:25,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:05:26,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 06:05:28,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:05:28,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:30,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 06:05:31,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:05:32,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:05:32,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:35,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:05:36,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 06:05:38,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 06:05:38,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=272360.0, ans=0.1 2023-09-29 06:05:39,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:05:41,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 06:05:43,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:05:43,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:05:48,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:05:49,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=272360.0, ans=0.2 2023-09-29 06:05:49,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.74 vs. limit=15.0 2023-09-29 06:05:50,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:05:50,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:05:51,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:05:53,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:05:55,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:05:57,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:05:58,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:05:58,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:06:00,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:06:01,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.92 vs. limit=15.0 2023-09-29 06:06:03,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:06:03,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:08,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=272493.3333333333, ans=0.125 2023-09-29 06:06:11,369 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 06:06:14,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:16,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:06:16,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:06:16,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:17,951 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.672e+02 2.110e+02 2.350e+02 2.595e+02 3.564e+02, threshold=4.700e+02, percent-clipped=0.0 2023-09-29 06:06:18,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:06:19,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:20,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=272493.3333333333, ans=0.125 2023-09-29 06:06:21,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 06:06:22,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:23,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:06:25,118 INFO [train.py:1039] (3/4) Epoch 8, batch 3700, loss[loss=0.2095, simple_loss=0.2923, pruned_loss=0.0633, over 24649.00 frames. ], tot_loss[loss=0.2206, simple_loss=0.2861, pruned_loss=0.07759, over 4697274.72 frames. ], batch size: 73, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:06:26,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:06:28,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:06:30,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:30,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 06:06:30,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:06:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:06:32,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:06:36,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:06:40,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:06:40,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=272626.6666666667, ans=0.2 2023-09-29 06:06:41,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:41,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:06:41,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:06:43,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:06:46,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:06:48,219 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 06:06:57,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:06:57,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:06:58,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:06:58,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 06:06:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:06:59,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=272693.3333333333, ans=0.0 2023-09-29 06:07:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:03,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 06:07:05,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:07,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:07:10,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:10,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:07:13,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:07:18,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:07:18,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 06:07:18,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:07:19,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 06:07:24,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:07:25,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:07:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:30,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 06:07:32,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:07:32,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:07:32,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:32,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:07:37,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:07:39,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 06:07:39,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 06:07:41,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:07:41,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:07:41,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=272826.6666666667, ans=0.125 2023-09-29 06:07:42,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:07:43,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:07:44,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=272826.6666666667, ans=0.125 2023-09-29 06:07:46,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:07:47,550 INFO [train.py:1039] (3/4) Epoch 8, batch 3750, loss[loss=0.2333, simple_loss=0.3001, pruned_loss=0.08327, over 23381.00 frames. ], tot_loss[loss=0.2201, simple_loss=0.286, pruned_loss=0.0771, over 4718298.48 frames. ], batch size: 93, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:07:47,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:07:47,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=272893.3333333333, ans=0.0 2023-09-29 06:07:49,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:07:52,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 06:07:54,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:07:56,820 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=13.07 vs. limit=22.5 2023-09-29 06:07:57,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:07:57,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 06:07:58,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:07:59,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=272893.3333333333, ans=0.125 2023-09-29 06:07:59,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.13 vs. limit=15.0 2023-09-29 06:08:00,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:08:03,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=272960.0, ans=0.125 2023-09-29 06:08:05,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:07,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:10,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:08:12,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:08:16,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:08:18,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:20,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 06:08:20,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:22,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:22,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:08:24,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=273026.6666666667, ans=0.125 2023-09-29 06:08:25,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 06:08:30,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 06:08:31,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:08:31,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:08:32,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=273026.6666666667, ans=0.125 2023-09-29 06:08:33,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:08:36,186 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=2.83 vs. limit=12.0 2023-09-29 06:08:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:39,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=273093.3333333333, ans=0.0 2023-09-29 06:08:41,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:08:44,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 06:08:45,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:08:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:08:51,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:08:54,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:08:57,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 06:08:59,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:09:01,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:09:02,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:09:04,243 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.271e+02 2.610e+02 3.277e+02 5.264e+02, threshold=5.220e+02, percent-clipped=1.0 2023-09-29 06:09:05,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:09:11,161 INFO [train.py:1039] (3/4) Epoch 8, batch 3800, loss[loss=0.2108, simple_loss=0.2852, pruned_loss=0.0682, over 24485.00 frames. ], tot_loss[loss=0.2197, simple_loss=0.2856, pruned_loss=0.07692, over 4719665.64 frames. ], batch size: 66, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:09:16,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:09:19,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=273226.6666666667, ans=0.2 2023-09-29 06:09:20,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:22,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:09:22,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 06:09:24,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:27,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:27,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:09:30,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:09:30,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:30,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:09:31,297 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:09:32,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:09:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:09:32,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:34,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 06:09:39,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:09:39,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:09:41,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:09:45,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:09:47,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:09:49,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:09:49,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:51,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:09:51,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=273360.0, ans=0.2 2023-09-29 06:09:52,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:09:52,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=273360.0, ans=0.125 2023-09-29 06:09:57,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:09:57,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 06:10:00,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:01,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=14.31 vs. limit=15.0 2023-09-29 06:10:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:11,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:10:14,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 06:10:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 06:10:15,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:17,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=273493.3333333333, ans=0.0 2023-09-29 06:10:19,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:10:20,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:21,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 06:10:23,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=273493.3333333333, ans=0.1 2023-09-29 06:10:24,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 06:10:24,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 06:10:26,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:27,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:10:30,817 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=9.83 vs. limit=15.0 2023-09-29 06:10:33,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:10:34,259 INFO [train.py:1039] (3/4) Epoch 8, batch 3850, loss[loss=0.2188, simple_loss=0.2955, pruned_loss=0.07108, over 24068.00 frames. ], tot_loss[loss=0.2184, simple_loss=0.2844, pruned_loss=0.07623, over 4714288.61 frames. ], batch size: 80, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:10:34,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:10:37,839 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:10:40,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:10:42,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 06:10:42,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:10:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:10:47,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:10:48,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:10:50,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:10:50,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=273626.6666666667, ans=0.1 2023-09-29 06:10:53,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 06:11:00,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:01,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:11:05,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:05,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:11:10,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:10,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:11:11,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:11,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:11:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:14,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:15,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:15,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:11:16,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 06:11:16,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 06:11:16,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=273693.3333333333, ans=0.2 2023-09-29 06:11:18,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:18,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:21,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:21,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 06:11:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 06:11:26,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:28,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 06:11:30,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 06:11:35,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:37,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:11:43,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:11:43,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 06:11:44,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=273826.6666666667, ans=0.07 2023-09-29 06:11:45,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 06:11:47,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=273826.6666666667, ans=0.0 2023-09-29 06:11:48,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:48,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:50,377 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.736e+02 2.044e+02 2.316e+02 2.829e+02 5.158e+02, threshold=4.631e+02, percent-clipped=0.0 2023-09-29 06:11:52,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:11:52,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:11:52,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:53,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:11:53,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 06:11:55,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:11:56,716 INFO [train.py:1039] (3/4) Epoch 8, batch 3900, loss[loss=0.2258, simple_loss=0.2966, pruned_loss=0.07748, over 24361.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.2838, pruned_loss=0.076, over 4719684.24 frames. ], batch size: 77, lr: 1.28e-02, grad_scale: 32.0 2023-09-29 06:11:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 06:11:56,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:11:56,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:11:58,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:11:59,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:02,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:12:02,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:12:02,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:12:04,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:04,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 06:12:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:08,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:10,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:10,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:12:11,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:12:13,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:12:15,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:16,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:12:18,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 06:12:18,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:19,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 06:12:20,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:12:21,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 06:12:23,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 06:12:28,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:28,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=274026.6666666667, ans=0.125 2023-09-29 06:12:28,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=274026.6666666667, ans=0.1 2023-09-29 06:12:29,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:12:30,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:12:31,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.40 vs. limit=22.5 2023-09-29 06:12:31,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:12:34,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:12:37,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:12:39,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:12:39,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:12:39,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:12:45,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:12:45,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:12:53,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:12:54,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.07 vs. limit=10.0 2023-09-29 06:12:55,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:13:05,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 06:13:08,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 06:13:08,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:13:08,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=274160.0, ans=0.1 2023-09-29 06:13:10,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 06:13:12,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:13:13,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 06:13:20,400 INFO [train.py:1039] (3/4) Epoch 8, batch 3950, loss[loss=0.2235, simple_loss=0.2875, pruned_loss=0.07975, over 23642.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2834, pruned_loss=0.07605, over 4711264.73 frames. ], batch size: 135, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:13:20,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=274226.6666666667, ans=0.125 2023-09-29 06:13:22,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:13:23,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 06:13:23,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:13:25,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.88 vs. limit=15.0 2023-09-29 06:13:26,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:13:29,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:13:35,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.38 vs. limit=15.0 2023-09-29 06:13:36,121 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 06:13:36,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:36,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 06:13:37,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 06:13:37,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:13:40,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:40,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:13:40,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:13:45,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 06:13:47,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:13:49,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:13:49,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:13:50,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:13:50,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=274360.0, ans=0.125 2023-09-29 06:13:52,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:13:53,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=274360.0, ans=0.125 2023-09-29 06:14:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:14:03,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:14:07,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 06:14:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 06:14:12,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 06:14:14,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:14:14,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=274426.6666666667, ans=0.0 2023-09-29 06:14:15,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:14:24,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:14:25,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:14:26,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:14:26,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:14:26,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 06:14:31,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:14:33,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:14:34,844 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=274493.3333333333, ans=0.125 2023-09-29 06:14:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 06:14:37,529 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.764e+02 2.191e+02 2.483e+02 2.950e+02 4.567e+02, threshold=4.966e+02, percent-clipped=0.0 2023-09-29 06:14:42,757 INFO [train.py:1039] (3/4) Epoch 8, batch 4000, loss[loss=0.2246, simple_loss=0.285, pruned_loss=0.0821, over 23805.00 frames. ], tot_loss[loss=0.2183, simple_loss=0.2839, pruned_loss=0.07634, over 4719158.19 frames. ], batch size: 212, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:14:46,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:14:55,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:14:55,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=274560.0, ans=0.125 2023-09-29 06:15:01,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.83 vs. limit=15.0 2023-09-29 06:15:02,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:02,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:15:02,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:15:03,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 06:15:03,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:15:04,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 06:15:06,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:15:06,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 06:15:07,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:11,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:15:11,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:15:11,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:15:12,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:12,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:15:14,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:15:15,941 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 06:15:17,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:15:17,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:21,121 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 06:15:22,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:15:22,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:27,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=274693.3333333333, ans=0.125 2023-09-29 06:15:28,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 06:15:29,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=274693.3333333333, ans=0.0 2023-09-29 06:15:31,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:15:32,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:15:34,657 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 06:15:36,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:15:36,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 06:15:36,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:15:36,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:37,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:15:39,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:15:41,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:15:41,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:15:43,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 06:15:43,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:15:43,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=274760.0, ans=0.1 2023-09-29 06:15:46,069 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 06:15:52,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:15:54,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 06:15:54,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=274826.6666666667, ans=0.125 2023-09-29 06:15:57,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:15:57,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:15:58,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:16:01,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:04,760 INFO [train.py:1039] (3/4) Epoch 8, batch 4050, loss[loss=0.1833, simple_loss=0.259, pruned_loss=0.05374, over 24329.00 frames. ], tot_loss[loss=0.2174, simple_loss=0.2836, pruned_loss=0.07561, over 4729758.82 frames. ], batch size: 61, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:16:08,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:16:11,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:16:11,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 06:16:13,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:16:15,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:17,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:16:18,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:18,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:22,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:16:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:16:26,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 06:16:30,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:16:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:16:32,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:16:35,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:16:37,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=275026.6666666667, ans=0.125 2023-09-29 06:16:38,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 06:16:38,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 06:16:39,946 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 06:16:41,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:16:44,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=275026.6666666667, ans=0.125 2023-09-29 06:16:48,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 06:16:50,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:16:55,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:16:58,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:17:00,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:00,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:17:02,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=275093.3333333333, ans=0.125 2023-09-29 06:17:03,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:17:05,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=275093.3333333333, ans=0.125 2023-09-29 06:17:06,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 06:17:06,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:17:08,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:08,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 06:17:13,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:17:21,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 06:17:21,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=275160.0, ans=0.035 2023-09-29 06:17:23,490 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.014e+02 2.168e+02 2.447e+02 3.458e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 06:17:23,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:17:23,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:17:27,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 06:17:27,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 06:17:27,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:28,553 INFO [train.py:1039] (3/4) Epoch 8, batch 4100, loss[loss=0.2381, simple_loss=0.2961, pruned_loss=0.09001, over 22686.00 frames. ], tot_loss[loss=0.2187, simple_loss=0.285, pruned_loss=0.0762, over 4727747.74 frames. ], batch size: 322, lr: 1.27e-02, grad_scale: 32.0 2023-09-29 06:17:28,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:17:30,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:30,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:17:36,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 06:17:36,883 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:17:38,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 06:17:38,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 06:17:40,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 06:17:40,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:40,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:40,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:17:42,189 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 06:17:44,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:46,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:17:46,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:17:47,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:17:51,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:17:53,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:17:53,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:17:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 06:17:55,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=275293.3333333333, ans=0.07 2023-09-29 06:17:57,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:17:57,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:17:57,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:17:57,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:17:59,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 06:18:00,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:02,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 06:18:03,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:18:06,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:18:06,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 06:18:08,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:18:08,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:18:08,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:18:11,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 06:18:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:18:14,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:18:16,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 06:18:18,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:18:18,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:23,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:23,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=275426.6666666667, ans=0.1 2023-09-29 06:18:27,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:31,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:31,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:18:32,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.00 vs. limit=10.0 2023-09-29 06:18:42,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:18:42,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:18:45,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:18:47,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:18:50,471 INFO [train.py:1039] (3/4) Epoch 8, batch 4150, loss[loss=0.2059, simple_loss=0.2867, pruned_loss=0.06256, over 24651.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2852, pruned_loss=0.07664, over 4717640.07 frames. ], batch size: 73, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:18:50,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:18:53,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:18:54,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:18:54,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:18:56,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=275560.0, ans=0.0 2023-09-29 06:18:57,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 06:18:58,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:18:58,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 06:18:59,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=275560.0, ans=0.125 2023-09-29 06:19:00,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 06:19:00,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 06:19:02,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:19:06,672 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.63 vs. limit=15.0 2023-09-29 06:19:08,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:19:08,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:12,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=275626.6666666667, ans=0.125 2023-09-29 06:19:13,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:14,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:14,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:19:16,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:19:16,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:19:16,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=275626.6666666667, ans=0.0 2023-09-29 06:19:17,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:19:21,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:19:21,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=275693.3333333333, ans=0.125 2023-09-29 06:19:25,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:28,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 06:19:30,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 06:19:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:19:30,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 06:19:30,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:19:30,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:33,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:34,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:19:39,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 06:19:42,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:19:43,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:19:45,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 06:19:45,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:19:47,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 06:19:48,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:19:50,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:19:51,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:51,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 06:19:51,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:19:51,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 06:19:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:19:56,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 06:19:56,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:19:56,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:19:57,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:19:57,635 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.02 vs. limit=22.5 2023-09-29 06:19:58,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 06:20:00,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:20:00,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 06:20:01,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:20:03,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 06:20:03,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 06:20:03,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=275826.6666666667, ans=0.02 2023-09-29 06:20:08,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=275826.6666666667, ans=0.125 2023-09-29 06:20:09,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:20:11,682 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 2.232e+02 3.048e+02 3.830e+02 6.363e+02, threshold=6.096e+02, percent-clipped=13.0 2023-09-29 06:20:13,768 INFO [train.py:1039] (3/4) Epoch 8, batch 4200, loss[loss=0.2186, simple_loss=0.2991, pruned_loss=0.069, over 24093.00 frames. ], tot_loss[loss=0.2178, simple_loss=0.2839, pruned_loss=0.07589, over 4710505.22 frames. ], batch size: 80, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:20:13,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 06:20:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:20:15,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=275893.3333333333, ans=0.0 2023-09-29 06:20:17,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:19,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:20:20,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:20,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:20:23,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 06:20:26,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 06:20:26,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:29,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:31,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:20:35,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:20:38,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:20:38,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:38,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 06:20:38,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:20:38,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:40,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:20:40,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:20:42,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:20:43,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 06:20:43,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:20:48,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:20:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:20:53,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:20:53,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:20:55,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:20:55,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 06:20:55,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:20:58,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:21:03,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:21:05,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:13,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:21:16,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 06:21:18,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:21:23,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:21:25,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:27,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.94 vs. limit=15.0 2023-09-29 06:21:28,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 06:21:32,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:21:35,631 INFO [train.py:1039] (3/4) Epoch 8, batch 4250, loss[loss=0.2455, simple_loss=0.3114, pruned_loss=0.08983, over 23800.00 frames. ], tot_loss[loss=0.2173, simple_loss=0.2831, pruned_loss=0.0758, over 4711956.29 frames. ], batch size: 85, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:21:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:21:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:21:41,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:41,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=276226.6666666667, ans=0.1 2023-09-29 06:21:45,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:21:47,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 06:21:47,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:21:52,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:21:55,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:21:59,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:21:59,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:01,630 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.71 vs. limit=15.0 2023-09-29 06:22:02,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:22:02,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:05,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:07,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:08,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:22:10,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:12,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 06:22:17,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 06:22:17,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:17,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:17,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:22:18,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:22:20,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:20,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:22:24,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:22:24,706 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.50 vs. limit=12.0 2023-09-29 06:22:25,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:22:28,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:29,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=276426.6666666667, ans=0.09899494936611666 2023-09-29 06:22:30,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:32,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 06:22:32,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:22:33,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 06:22:35,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:22:36,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:22:38,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:38,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:22:38,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=276426.6666666667, ans=0.2 2023-09-29 06:22:40,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=276493.3333333333, ans=0.0 2023-09-29 06:22:41,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 06:22:43,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:22:43,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:22:46,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:22:50,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:22:51,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:22:53,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:22:53,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:22:55,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:22:56,719 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.004e+02 2.225e+02 2.717e+02 4.251e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 06:22:56,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:22:56,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 06:22:57,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:22:58,439 INFO [train.py:1039] (3/4) Epoch 8, batch 4300, loss[loss=0.2189, simple_loss=0.2736, pruned_loss=0.08207, over 23663.00 frames. ], tot_loss[loss=0.2164, simple_loss=0.2824, pruned_loss=0.07524, over 4718760.74 frames. ], batch size: 232, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:23:05,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:23:05,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:08,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:23:16,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:23:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 06:23:16,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:23:20,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:23:21,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:23:21,412 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 06:23:25,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:23:27,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:23:30,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.03 vs. limit=15.0 2023-09-29 06:23:30,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 06:23:31,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:23:31,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 06:23:31,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=276693.3333333333, ans=0.125 2023-09-29 06:23:32,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:23:35,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:23:38,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:23:38,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:23:40,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:23:43,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:45,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:23:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 06:23:45,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 06:23:47,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:23:51,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:23:51,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:23:51,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:23:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 06:23:51,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 06:23:52,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 06:23:53,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:23:53,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 06:23:53,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 06:23:57,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:23:59,278 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 06:24:00,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:24:04,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:04,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:24:05,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 06:24:07,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:24:07,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:08,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:08,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:08,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:24:11,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:24:13,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=276826.6666666667, ans=0.07 2023-09-29 06:24:15,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:15,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=276826.6666666667, ans=0.125 2023-09-29 06:24:16,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:24:16,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:24:20,328 INFO [train.py:1039] (3/4) Epoch 8, batch 4350, loss[loss=0.2026, simple_loss=0.2744, pruned_loss=0.06539, over 24444.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.283, pruned_loss=0.07502, over 4726076.06 frames. ], batch size: 63, lr: 1.27e-02, grad_scale: 8.0 2023-09-29 06:24:22,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 06:24:22,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:24:26,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:31,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:33,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:24:33,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:24:39,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:24:41,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=276960.0, ans=0.125 2023-09-29 06:24:44,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:24:47,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:24:47,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:24:49,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:24:52,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:24:53,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:24:57,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 06:24:57,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:24:58,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=277026.6666666667, ans=0.0 2023-09-29 06:24:59,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 06:25:11,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:11,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:25:13,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=277093.3333333333, ans=0.125 2023-09-29 06:25:15,968 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 06:25:17,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:17,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:25:18,962 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 06:25:20,348 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 06:25:20,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:20,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:21,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.52 vs. limit=22.5 2023-09-29 06:25:21,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:25:23,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:25:23,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:25:23,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:25:27,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 06:25:27,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:27,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:28,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:28,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 06:25:30,087 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 06:25:30,094 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 06:25:30,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 06:25:33,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:25:33,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:25:35,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:25:35,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:25:38,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 06:25:39,666 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.760e+02 2.095e+02 2.321e+02 2.736e+02 4.922e+02, threshold=4.641e+02, percent-clipped=1.0 2023-09-29 06:25:41,152 INFO [train.py:1039] (3/4) Epoch 8, batch 4400, loss[loss=0.2192, simple_loss=0.2713, pruned_loss=0.08351, over 23666.00 frames. ], tot_loss[loss=0.2188, simple_loss=0.2844, pruned_loss=0.07658, over 4721731.52 frames. ], batch size: 149, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:25:41,268 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 06:25:41,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:46,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:46,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:48,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:25:51,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 06:25:51,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 06:25:51,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 06:25:51,602 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 06:25:53,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:25:53,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:25:54,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 06:25:56,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:25:57,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:25:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 06:26:00,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:00,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 06:26:02,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 06:26:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 06:26:06,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 06:26:06,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 06:26:06,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:08,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:26:08,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:10,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 06:26:11,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 06:26:12,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:15,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:26:15,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:18,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:18,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:26:18,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 06:26:18,787 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 06:26:23,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:26:29,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:26:33,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 06:26:38,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:26:40,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:26:42,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:26:42,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=277426.6666666667, ans=0.125 2023-09-29 06:26:44,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 06:26:44,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:26:44,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:26:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:26:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:26:46,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=277493.3333333333, ans=0.035 2023-09-29 06:26:50,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 06:26:54,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 06:26:55,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 06:26:56,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:26:56,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 06:26:56,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:26:59,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:27:00,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 06:27:03,788 INFO [train.py:1039] (3/4) Epoch 8, batch 4450, loss[loss=0.2411, simple_loss=0.3165, pruned_loss=0.08283, over 24069.00 frames. ], tot_loss[loss=0.2192, simple_loss=0.2855, pruned_loss=0.0764, over 4727461.38 frames. ], batch size: 80, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:27:03,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:27:06,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:08,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:27:15,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:16,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:27:19,405 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:27:20,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:22,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:27:24,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=277626.6666666667, ans=0.125 2023-09-29 06:27:27,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:27:27,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:27,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 06:27:27,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:27,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:27:27,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:27:27,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:27:29,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=277626.6666666667, ans=0.5 2023-09-29 06:27:30,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:27:36,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:36,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:27:38,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:27:38,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:27:40,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:27:45,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:27:45,606 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:27:46,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 06:27:46,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 06:27:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:27:48,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=277693.3333333333, ans=0.125 2023-09-29 06:27:51,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:27:53,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 06:27:54,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.53 vs. limit=15.0 2023-09-29 06:27:56,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:28:00,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 06:28:02,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:02,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:02,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:28:02,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:28:05,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:28:08,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:28:08,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 06:28:10,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:28:11,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:13,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:28:14,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:16,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:28:19,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:28:21,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 06:28:24,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=277826.6666666667, ans=0.125 2023-09-29 06:28:24,956 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 2.112e+02 2.512e+02 3.151e+02 6.272e+02, threshold=5.024e+02, percent-clipped=2.0 2023-09-29 06:28:25,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:28:26,472 INFO [train.py:1039] (3/4) Epoch 8, batch 4500, loss[loss=0.2942, simple_loss=0.3336, pruned_loss=0.1274, over 19540.00 frames. ], tot_loss[loss=0.2203, simple_loss=0.2862, pruned_loss=0.07722, over 4717327.10 frames. ], batch size: 388, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:28:28,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:29,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 06:28:29,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 06:28:32,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:39,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:28:39,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:28:41,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:28:41,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:28:41,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:42,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:28:49,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=277960.0, ans=0.125 2023-09-29 06:28:53,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:28:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:28:57,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:28:57,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:28:59,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:29:05,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:29:11,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:29:14,458 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:29:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:29:18,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:29:20,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 06:29:20,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:21,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:29:23,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:29:26,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:29:26,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 06:29:26,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:29:26,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:31,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:29:31,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:29:34,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:29:37,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:29:37,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:29:39,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 06:29:39,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 06:29:39,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 06:29:45,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 06:29:48,376 INFO [train.py:1039] (3/4) Epoch 8, batch 4550, loss[loss=0.2031, simple_loss=0.2836, pruned_loss=0.06131, over 24315.00 frames. ], tot_loss[loss=0.2189, simple_loss=0.2849, pruned_loss=0.07646, over 4712562.33 frames. ], batch size: 74, lr: 1.27e-02, grad_scale: 16.0 2023-09-29 06:29:48,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 06:29:49,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:29:50,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.12 vs. limit=15.0 2023-09-29 06:29:52,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:53,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:29:56,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:00,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:30:04,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:30:06,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:06,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:30:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:09,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:09,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:30:12,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:16,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 06:30:18,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 06:30:18,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:30:21,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 06:30:22,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 06:30:24,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:27,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 06:30:29,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:30:32,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:32,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:30:36,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 06:30:39,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:40,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:40,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:30:44,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:44,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 06:30:44,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 06:30:45,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:30:45,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 06:30:49,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 06:30:49,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:30:51,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:30:51,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:30:53,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:30:53,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:30:55,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:30:55,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 06:30:56,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:30:56,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:30:58,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 06:30:58,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:30:58,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 06:31:00,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=278493.3333333333, ans=0.0 2023-09-29 06:31:01,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:31:01,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:31:04,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:31:04,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:31:04,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:31:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:31:08,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=278493.3333333333, ans=0.0 2023-09-29 06:31:09,474 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.931e+02 2.102e+02 2.382e+02 3.783e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 06:31:09,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:31:11,125 INFO [train.py:1039] (3/4) Epoch 8, batch 4600, loss[loss=0.2107, simple_loss=0.2827, pruned_loss=0.06938, over 23413.00 frames. ], tot_loss[loss=0.217, simple_loss=0.2828, pruned_loss=0.0756, over 4705004.96 frames. ], batch size: 93, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:31:11,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:12,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:31:16,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:31:16,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:31:16,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:16,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 06:31:19,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:31:24,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:31:24,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:27,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:31,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=278626.6666666667, ans=0.0 2023-09-29 06:31:34,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 06:31:35,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:38,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:39,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=278626.6666666667, ans=0.125 2023-09-29 06:31:41,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:31:42,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:31:48,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 06:31:48,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:31:50,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:31:51,409 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.12 vs. limit=15.0 2023-09-29 06:31:55,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:31:55,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:31:58,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:32:01,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 06:32:02,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:32:07,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:09,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:32:09,353 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:32:11,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:11,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 06:32:12,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:12,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 06:32:12,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=278760.0, ans=0.2 2023-09-29 06:32:13,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:15,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:15,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=278826.6666666667, ans=0.125 2023-09-29 06:32:17,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=278826.6666666667, ans=0.07 2023-09-29 06:32:18,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:18,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:32:18,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:20,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 06:32:20,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 06:32:20,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 06:32:20,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:21,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:21,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:23,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:32:34,340 INFO [train.py:1039] (3/4) Epoch 8, batch 4650, loss[loss=0.241, simple_loss=0.2765, pruned_loss=0.1027, over 19219.00 frames. ], tot_loss[loss=0.2175, simple_loss=0.2835, pruned_loss=0.0758, over 4703257.71 frames. ], batch size: 388, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:32:35,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:32:38,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:40,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:40,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:32:41,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:32:41,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:32:43,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:32:46,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 06:32:50,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:32:52,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=278960.0, ans=0.125 2023-09-29 06:32:53,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 06:32:53,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:32:53,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 06:32:54,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:32:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 06:32:54,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 06:32:54,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:32:56,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:32:57,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:33:01,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:01,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 06:33:04,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:06,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 06:33:08,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:08,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:33:11,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 06:33:13,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:33:16,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:33:18,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:24,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:28,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:28,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:33:28,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:33:31,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 06:33:31,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 06:33:31,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 06:33:31,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 06:33:33,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:39,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:33:39,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:33:39,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 06:33:39,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:33:43,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:33:43,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:33:45,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=279160.0, ans=0.125 2023-09-29 06:33:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:33:46,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:33:47,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:33:52,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:33:52,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:33:52,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:33:53,553 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 2.046e+02 2.215e+02 2.491e+02 3.733e+02, threshold=4.429e+02, percent-clipped=0.0 2023-09-29 06:33:53,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 06:33:55,133 INFO [train.py:1039] (3/4) Epoch 8, batch 4700, loss[loss=0.2577, simple_loss=0.3063, pruned_loss=0.1046, over 19591.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.2837, pruned_loss=0.07601, over 4692568.13 frames. ], batch size: 388, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:33:55,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:33:57,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 06:33:59,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=279226.6666666667, ans=0.125 2023-09-29 06:34:01,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.99 vs. limit=22.5 2023-09-29 06:34:05,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:07,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:34:07,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:34:09,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:09,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 06:34:16,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 06:34:17,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 06:34:19,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:19,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=279293.3333333333, ans=0.125 2023-09-29 06:34:22,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:34:22,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:34:23,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:34:28,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:34:30,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 06:34:33,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:34:43,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 06:34:43,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=279426.6666666667, ans=0.0 2023-09-29 06:34:44,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:34:46,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 06:34:53,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:34:56,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:34:57,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 06:34:58,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=279426.6666666667, ans=0.0 2023-09-29 06:34:59,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:34:59,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:02,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:35:02,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:35:02,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 06:35:02,462 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 06:35:04,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=279493.3333333333, ans=0.125 2023-09-29 06:35:05,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:07,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:07,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 06:35:08,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:35:12,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 06:35:15,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:35:15,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:17,385 INFO [train.py:1039] (3/4) Epoch 8, batch 4750, loss[loss=0.2176, simple_loss=0.295, pruned_loss=0.07008, over 24353.00 frames. ], tot_loss[loss=0.2179, simple_loss=0.2841, pruned_loss=0.07587, over 4702661.47 frames. ], batch size: 77, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:35:21,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:21,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:35:24,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 06:35:24,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:35:25,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.99 vs. limit=15.0 2023-09-29 06:35:26,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=279560.0, ans=0.04949747468305833 2023-09-29 06:35:27,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 06:35:29,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:35:30,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:35:31,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:38,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 06:35:42,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:35:45,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 06:35:46,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:35:50,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:35:50,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:35:51,638 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 06:35:51,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 06:36:00,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 06:36:04,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:06,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:09,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:36:09,723 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 06:36:09,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:12,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:36:15,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:36:16,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 06:36:17,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 06:36:17,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:36:18,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:36:18,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:21,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:36:22,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 06:36:22,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 06:36:25,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:26,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.40 vs. limit=22.5 2023-09-29 06:36:27,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:36:27,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 06:36:27,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:36:27,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=279826.6666666667, ans=0.125 2023-09-29 06:36:27,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=279826.6666666667, ans=0.125 2023-09-29 06:36:29,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:36:30,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:36:30,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=279826.6666666667, ans=0.0 2023-09-29 06:36:32,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:34,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 06:36:38,016 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.176e+02 2.410e+02 2.744e+02 3.912e+02, threshold=4.820e+02, percent-clipped=0.0 2023-09-29 06:36:38,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:39,645 INFO [train.py:1039] (3/4) Epoch 8, batch 4800, loss[loss=0.2403, simple_loss=0.3102, pruned_loss=0.08521, over 23996.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.284, pruned_loss=0.0752, over 4721855.77 frames. ], batch size: 80, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:36:39,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 06:36:41,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 06:36:42,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 06:36:44,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:36:44,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:36:45,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 06:36:51,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:36:52,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.37 vs. limit=6.0 2023-09-29 06:36:53,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:36:59,040 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.39 vs. limit=12.0 2023-09-29 06:36:59,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:36:59,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:36:59,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:01,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 06:37:01,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:37:03,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:37:04,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:37:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:10,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:10,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:37:12,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:12,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 06:37:12,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:12,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:16,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:37:17,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.00 vs. limit=15.0 2023-09-29 06:37:18,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:37:21,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:37:21,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 06:37:23,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:24,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 06:37:24,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 06:37:26,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:26,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:37:26,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:37:26,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:26,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:37:29,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:37:29,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:37:33,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:36,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:39,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:43,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 06:37:43,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:43,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=280160.0, ans=0.125 2023-09-29 06:37:45,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:45,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:37:46,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:37:49,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:37:50,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:37:50,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:51,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:37:51,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:37:53,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:37:54,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:37:54,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=280160.0, ans=0.0 2023-09-29 06:37:56,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:37:56,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:37:57,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 06:38:00,674 INFO [train.py:1039] (3/4) Epoch 8, batch 4850, loss[loss=0.2145, simple_loss=0.2974, pruned_loss=0.06581, over 24308.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2845, pruned_loss=0.07535, over 4710776.66 frames. ], batch size: 74, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:38:00,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 06:38:00,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:00,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:01,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:01,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:05,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:38:13,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 06:38:16,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:17,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=280293.3333333333, ans=0.0 2023-09-29 06:38:19,042 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.39 vs. limit=15.0 2023-09-29 06:38:21,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:22,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:38:22,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:38:26,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:38:28,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:38:29,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:38:29,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 06:38:29,428 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:38:32,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:38:34,530 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.04 vs. limit=15.0 2023-09-29 06:38:35,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:38:37,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 06:38:37,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:38:37,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 06:38:40,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:38:40,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:38:45,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 06:38:45,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 06:38:46,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:38:47,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=280360.0, ans=0.1 2023-09-29 06:38:55,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:38:55,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 06:38:57,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:38:57,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:38:58,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:39:01,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 06:39:01,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:03,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 06:39:03,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:03,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:04,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 06:39:14,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:39:19,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:39:19,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:23,660 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 2.212e+02 2.561e+02 3.191e+02 4.940e+02, threshold=5.123e+02, percent-clipped=1.0 2023-09-29 06:39:23,715 INFO [train.py:1039] (3/4) Epoch 8, batch 4900, loss[loss=0.2129, simple_loss=0.267, pruned_loss=0.07939, over 23702.00 frames. ], tot_loss[loss=0.2168, simple_loss=0.283, pruned_loss=0.07527, over 4706685.01 frames. ], batch size: 232, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:39:25,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 06:39:25,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:39:30,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:39:32,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:33,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:39:33,650 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:39:37,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 06:39:42,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 06:39:45,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 06:39:46,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 06:39:46,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:39:48,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:39:48,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:39:48,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:39:49,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:39:49,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 06:39:52,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 06:39:52,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:39:54,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:39:54,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:40:00,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:40:00,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:01,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:01,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 06:40:02,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=280693.3333333333, ans=0.0 2023-09-29 06:40:03,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:40:04,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:40:04,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 06:40:06,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 06:40:09,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 06:40:11,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:40:12,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:12,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:40:12,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:14,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 06:40:14,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:40:14,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 06:40:18,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:18,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=280760.0, ans=0.1 2023-09-29 06:40:19,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:40:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:40:24,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 06:40:24,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:40:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 06:40:24,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 06:40:35,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:36,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:40:38,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 06:40:38,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:38,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:40:38,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=280826.6666666667, ans=0.07 2023-09-29 06:40:41,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:44,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:40:44,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:40:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:40:44,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 06:40:45,618 INFO [train.py:1039] (3/4) Epoch 8, batch 4950, loss[loss=0.2139, simple_loss=0.2918, pruned_loss=0.06802, over 24525.00 frames. ], tot_loss[loss=0.215, simple_loss=0.2812, pruned_loss=0.07437, over 4711835.77 frames. ], batch size: 66, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:40:45,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:40:49,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:40:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 06:40:53,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 06:40:54,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 06:40:54,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:40:55,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 06:40:55,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:40:55,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:40:55,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:40:57,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:40:58,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:40:58,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:41:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:41:02,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:41:04,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:04,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:41:07,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 06:41:12,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:14,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:41:17,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:17,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:19,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:41:20,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 06:41:22,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 06:41:24,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:27,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:41:27,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:41:28,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:41:28,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:41:28,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 06:41:30,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:32,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:41:35,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:41:39,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:41:39,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:39,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 06:41:40,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:41:41,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:41:45,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:41:47,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:41:47,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:41:47,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=281093.3333333333, ans=0.0 2023-09-29 06:41:48,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:41:48,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:41:50,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:41:51,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:41:51,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:41:52,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:41:55,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 06:41:59,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:05,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 06:42:05,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 06:42:07,495 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.702e+02 2.069e+02 2.336e+02 2.676e+02 4.238e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 06:42:07,559 INFO [train.py:1039] (3/4) Epoch 8, batch 5000, loss[loss=0.2198, simple_loss=0.2939, pruned_loss=0.07282, over 24577.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2808, pruned_loss=0.07437, over 4708004.08 frames. ], batch size: 71, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:42:11,627 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.58 vs. limit=15.0 2023-09-29 06:42:12,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=281226.6666666667, ans=0.125 2023-09-29 06:42:13,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:42:13,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:15,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 06:42:16,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 06:42:17,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:42:19,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=281226.6666666667, ans=0.95 2023-09-29 06:42:20,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 06:42:20,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:42:20,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:42:23,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 06:42:23,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:25,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:25,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 06:42:25,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:25,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:42:27,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=281293.3333333333, ans=10.0 2023-09-29 06:42:28,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 06:42:29,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 06:42:29,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:42:31,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 06:42:31,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:42:31,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:32,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:42:32,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 06:42:33,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 06:42:34,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 06:42:34,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:42:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:36,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 06:42:36,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:42:38,335 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:42:39,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:40,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:42:41,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:42:43,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 06:42:44,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:42:47,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=281360.0, ans=0.2 2023-09-29 06:42:48,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:42:48,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=281360.0, ans=0.1 2023-09-29 06:42:48,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=281360.0, ans=0.0 2023-09-29 06:42:51,435 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 06:42:55,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:42:56,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:42:56,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:42:56,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=281426.6666666667, ans=0.125 2023-09-29 06:42:59,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 06:42:59,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:43:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:01,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:04,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 06:43:04,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:07,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:43:09,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:13,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 06:43:17,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:26,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:43:28,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:28,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:43:29,898 INFO [train.py:1039] (3/4) Epoch 8, batch 5050, loss[loss=0.2158, simple_loss=0.275, pruned_loss=0.07834, over 23786.00 frames. ], tot_loss[loss=0.2158, simple_loss=0.282, pruned_loss=0.07475, over 4707982.39 frames. ], batch size: 179, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:43:29,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:30,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:43:30,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:43:30,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:33,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:43:34,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 06:43:36,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:43:37,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:43:40,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:43:40,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 06:43:40,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=281560.0, ans=0.0 2023-09-29 06:43:41,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:43:41,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:43:44,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 06:43:46,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:43:46,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:43:55,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 06:43:55,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 06:43:57,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:43:58,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 06:43:58,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:02,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:03,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:03,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 06:44:03,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 06:44:05,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:08,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:11,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:44:11,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 06:44:14,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:17,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 06:44:18,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:44:19,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:44:21,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:21,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:44:22,058 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.20 vs. limit=10.0 2023-09-29 06:44:22,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:44:24,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:44:26,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:26,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:44:26,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:44:26,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 06:44:28,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:44:30,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:44:34,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:44:35,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 06:44:35,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:44:37,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:44:38,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:38,615 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 06:44:41,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:41,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 06:44:41,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:44,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:44:45,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=281826.6666666667, ans=0.125 2023-09-29 06:44:46,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:44:46,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 06:44:48,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 06:44:50,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:44:51,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:44:51,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 06:44:51,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=281893.3333333333, ans=0.0 2023-09-29 06:44:52,972 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.731e+02 2.260e+02 2.527e+02 2.886e+02 4.203e+02, threshold=5.054e+02, percent-clipped=0.0 2023-09-29 06:44:53,025 INFO [train.py:1039] (3/4) Epoch 8, batch 5100, loss[loss=0.192, simple_loss=0.2647, pruned_loss=0.05964, over 24451.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2827, pruned_loss=0.07474, over 4712514.55 frames. ], batch size: 63, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:44:53,348 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 06:44:56,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:44:59,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 06:44:59,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 06:44:59,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:02,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:45:06,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:45:06,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 06:45:06,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 06:45:06,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=281893.3333333333, ans=0.1 2023-09-29 06:45:13,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:45:13,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:45:15,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=281960.0, ans=0.125 2023-09-29 06:45:18,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:45:21,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 06:45:22,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:24,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:45:24,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 06:45:26,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:27,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 06:45:31,497 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 06:45:32,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:32,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 06:45:33,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 06:45:34,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=282026.6666666667, ans=0.015 2023-09-29 06:45:36,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:45:45,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:45:49,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 06:45:49,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 06:45:49,491 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 06:45:51,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 06:45:52,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:45:55,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 06:46:00,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 06:46:02,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 06:46:03,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 06:46:05,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 06:46:07,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:46:08,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 06:46:13,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:46:13,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:46:13,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:46:15,134 INFO [train.py:1039] (3/4) Epoch 8, batch 5150, loss[loss=0.2079, simple_loss=0.2879, pruned_loss=0.06393, over 24531.00 frames. ], tot_loss[loss=0.2161, simple_loss=0.2833, pruned_loss=0.07449, over 4733646.39 frames. ], batch size: 71, lr: 1.26e-02, grad_scale: 16.0 2023-09-29 06:46:15,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:46:15,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 06:46:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:46:18,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 06:46:18,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 06:46:18,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 06:46:18,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:46:18,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 06:46:19,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:21,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 06:46:22,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:24,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:46:28,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:46:28,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 06:46:30,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:46:30,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:46:32,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 06:46:32,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:32,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:46:33,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:46:33,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:46:33,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 06:46:37,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:46:37,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:46:39,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 06:46:41,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 06:46:42,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:46:49,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:46:52,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 06:46:57,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:46:57,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=282360.0, ans=0.125 2023-09-29 06:47:03,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:05,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=282426.6666666667, ans=0.2 2023-09-29 06:47:06,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:08,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:08,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=282426.6666666667, ans=0.0 2023-09-29 06:47:09,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.51 vs. limit=10.0 2023-09-29 06:47:11,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 06:47:16,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:47:19,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 06:47:19,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 06:47:21,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:47:23,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:47:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 06:47:29,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:47:30,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:47:31,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:47:31,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:47:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:47:33,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:47:33,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:47:35,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:47:38,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.727e+02 2.091e+02 2.433e+02 2.751e+02 4.119e+02, threshold=4.867e+02, percent-clipped=0.0 2023-09-29 06:47:38,062 INFO [train.py:1039] (3/4) Epoch 8, batch 5200, loss[loss=0.2013, simple_loss=0.2865, pruned_loss=0.05804, over 24315.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.284, pruned_loss=0.07454, over 4741941.45 frames. ], batch size: 74, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:47:39,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:47:41,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:47:44,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:50,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 06:47:50,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:47:51,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:54,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:47:56,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:47:56,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:47:58,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=282626.6666666667, ans=0.125 2023-09-29 06:47:59,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 06:48:01,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 06:48:02,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=282626.6666666667, ans=0.1 2023-09-29 06:48:03,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:05,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 06:48:05,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=282626.6666666667, ans=0.0 2023-09-29 06:48:08,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:48:08,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=282626.6666666667, ans=0.125 2023-09-29 06:48:08,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=282626.6666666667, ans=0.125 2023-09-29 06:48:09,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.63 vs. limit=15.0 2023-09-29 06:48:09,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:48:11,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 06:48:11,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 06:48:13,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 06:48:14,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:14,784 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 06:48:14,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:48:16,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:16,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:48:17,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 06:48:19,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:48:20,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=282693.3333333333, ans=0.125 2023-09-29 06:48:21,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:25,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 06:48:25,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 06:48:26,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 06:48:29,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 06:48:29,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:48:35,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=282760.0, ans=0.125 2023-09-29 06:48:36,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=282760.0, ans=0.0 2023-09-29 06:48:37,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:48:38,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:39,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 06:48:41,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:48:41,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 06:48:41,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:41,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:48:43,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=282826.6666666667, ans=0.2 2023-09-29 06:48:44,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:46,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:48:46,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=282826.6666666667, ans=0.125 2023-09-29 06:48:51,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:48:51,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:48:51,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:48:55,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:48:58,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 06:48:58,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:48:58,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:49:00,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:01,686 INFO [train.py:1039] (3/4) Epoch 8, batch 5250, loss[loss=0.222, simple_loss=0.2947, pruned_loss=0.0746, over 23582.00 frames. ], tot_loss[loss=0.2165, simple_loss=0.2835, pruned_loss=0.07478, over 4732423.42 frames. ], batch size: 93, lr: 1.26e-02, grad_scale: 32.0 2023-09-29 06:49:01,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 06:49:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:49:05,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:49:08,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:08,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:49:10,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:49:16,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:49:18,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:49:21,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:49:21,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:49:24,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 06:49:24,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:49:26,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:49:31,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=282960.0, ans=0.0 2023-09-29 06:50:04,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.50 vs. limit=15.0 2023-09-29 06:50:16,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.699e+02 2.079e+02 2.357e+02 2.633e+02 5.213e+02, threshold=4.714e+02, percent-clipped=2.0 2023-09-29 06:50:16,339 INFO [train.py:1039] (3/4) Epoch 8, batch 5300, loss[loss=0.2412, simple_loss=0.3176, pruned_loss=0.08235, over 24363.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2814, pruned_loss=0.07473, over 4709736.62 frames. ], batch size: 77, lr: 1.25e-02, grad_scale: 32.0 2023-09-29 06:50:31,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:50:31,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 06:50:31,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 06:50:31,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:32,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:32,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:32,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:50:32,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:32,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 06:50:33,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:50:33,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 06:50:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 06:50:33,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 06:50:33,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 06:50:33,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 06:50:33,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 06:50:34,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:35,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:35,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:35,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:35,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:50:35,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:35,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:50:35,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:36,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:50:36,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:50:36,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:50:36,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:36,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:50:37,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 06:50:37,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:50:37,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:50:37,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 06:50:37,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 06:50:37,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:50:38,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:50:38,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 06:50:38,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 06:50:38,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:39,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:50:39,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:50:39,949 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 06:50:40,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 06:50:40,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:50:40,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:50:40,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 06:50:40,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 06:50:40,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 06:50:40,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 06:50:50,905 INFO [train.py:1039] (3/4) Epoch 9, batch 0, loss[loss=0.2302, simple_loss=0.2896, pruned_loss=0.08536, over 23720.00 frames. ], tot_loss[loss=0.2302, simple_loss=0.2896, pruned_loss=0.08536, over 23720.00 frames. ], batch size: 232, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:50:50,905 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 06:51:04,825 INFO [train.py:1071] (3/4) Epoch 9, validation: loss=0.2824, simple_loss=0.2767, pruned_loss=0.144, over 1125622.00 frames. 2023-09-29 06:51:04,826 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 06:51:06,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 06:51:06,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:51:08,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:51:09,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=283306.6666666667, ans=0.0 2023-09-29 06:51:14,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:14,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:51:14,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:16,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 06:51:17,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 06:51:19,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:20,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:51:24,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:26,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 06:51:26,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:29,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 06:51:30,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:51:34,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=283373.3333333333, ans=0.125 2023-09-29 06:51:40,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:51:40,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:51:42,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 06:51:45,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:51:47,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:51:48,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:51:51,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:51:51,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=283440.0, ans=0.0 2023-09-29 06:51:54,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=283506.6666666667, ans=0.04949747468305833 2023-09-29 06:51:56,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:02,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 06:52:06,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 06:52:06,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:06,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:06,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:52:07,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:52:10,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 06:52:13,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:13,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:52:17,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:52:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 06:52:23,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:52:25,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=283640.0, ans=0.05 2023-09-29 06:52:26,928 INFO [train.py:1039] (3/4) Epoch 9, batch 50, loss[loss=0.2097, simple_loss=0.2906, pruned_loss=0.06437, over 24479.00 frames. ], tot_loss[loss=0.2176, simple_loss=0.2842, pruned_loss=0.07543, over 1066028.80 frames. ], batch size: 69, lr: 1.19e-02, grad_scale: 32.0 2023-09-29 06:52:27,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:28,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:52:28,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 06:52:30,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 06:52:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:52:31,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:34,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:52:36,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:52:41,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 06:52:41,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:43,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=283706.6666666667, ans=0.125 2023-09-29 06:52:48,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:52:51,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 06:52:52,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 06:52:56,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:52:57,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:52:57,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:52:59,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:52:59,391 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 06:53:01,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:53:01,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 06:53:01,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:53:07,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=283773.3333333333, ans=0.2 2023-09-29 06:53:10,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:11,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=283773.3333333333, ans=0.0 2023-09-29 06:53:11,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=283773.3333333333, ans=0.0 2023-09-29 06:53:12,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:12,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 06:53:14,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 06:53:15,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 06:53:17,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:53:17,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 06:53:17,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:19,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 06:53:27,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:53:27,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:53:28,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:30,301 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 2.133e+02 2.436e+02 2.893e+02 4.514e+02, threshold=4.872e+02, percent-clipped=0.0 2023-09-29 06:53:30,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:53:30,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:34,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 06:53:34,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 06:53:36,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:53:36,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 06:53:37,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:53:38,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=283906.6666666667, ans=0.125 2023-09-29 06:53:39,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:53:39,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 06:53:39,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 06:53:40,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 06:53:42,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:43,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:53:43,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 06:53:43,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 06:53:44,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=283906.6666666667, ans=0.0 2023-09-29 06:53:46,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:53:47,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:53:48,936 INFO [train.py:1039] (3/4) Epoch 9, batch 100, loss[loss=0.2009, simple_loss=0.2734, pruned_loss=0.06414, over 24502.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.284, pruned_loss=0.07426, over 1881937.95 frames. ], batch size: 63, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:53:49,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:53:49,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:53:52,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:53:56,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:54:02,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:03,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 06:54:03,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:54:05,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=284040.0, ans=0.125 2023-09-29 06:54:06,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 06:54:06,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 06:54:08,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:54:08,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 06:54:09,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 06:54:10,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=284040.0, ans=0.125 2023-09-29 06:54:13,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 06:54:13,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:14,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:14,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:54:18,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 06:54:20,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:21,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:54:21,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 06:54:23,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 06:54:25,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=284106.6666666667, ans=0.0 2023-09-29 06:54:28,354 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 06:54:28,377 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 06:54:30,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:54:30,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:54:34,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 06:54:34,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=284106.6666666667, ans=0.125 2023-09-29 06:54:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:54:39,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:39,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=284173.3333333333, ans=0.1 2023-09-29 06:54:44,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:44,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 06:54:45,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 06:54:49,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:54:51,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:54:51,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:54:54,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:54:58,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:00,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:55:01,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=284240.0, ans=0.0 2023-09-29 06:55:03,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:04,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:04,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:04,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:55:04,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:06,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 06:55:06,299 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 06:55:06,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:08,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:55:08,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:08,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:10,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 06:55:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:55:11,521 INFO [train.py:1039] (3/4) Epoch 9, batch 150, loss[loss=0.2271, simple_loss=0.2845, pruned_loss=0.08486, over 23837.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.2849, pruned_loss=0.07472, over 2506958.35 frames. ], batch size: 164, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:55:11,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 06:55:11,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:11,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:13,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:14,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:55:14,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:55:16,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:55:21,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 06:55:21,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:23,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:26,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:55:27,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:30,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 06:55:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 06:55:35,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 06:55:35,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 06:55:37,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:55:37,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 06:55:39,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:55:41,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:55:41,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:41,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:55:43,968 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 06:55:45,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:55:50,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.50 vs. limit=15.0 2023-09-29 06:55:51,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:55:55,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=284440.0, ans=0.125 2023-09-29 06:55:56,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 06:55:58,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 06:56:03,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:56:03,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:56:03,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:06,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:56:08,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:56:08,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:56:10,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:11,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 06:56:16,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 2.032e+02 2.478e+02 3.173e+02 5.553e+02, threshold=4.955e+02, percent-clipped=3.0 2023-09-29 06:56:16,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:16,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:17,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 06:56:17,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 06:56:20,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:23,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 06:56:26,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 06:56:26,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:56:27,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:29,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:56:29,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 06:56:30,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:56:30,764 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 06:56:34,652 INFO [train.py:1039] (3/4) Epoch 9, batch 200, loss[loss=0.2291, simple_loss=0.2821, pruned_loss=0.08808, over 23706.00 frames. ], tot_loss[loss=0.218, simple_loss=0.2856, pruned_loss=0.07517, over 3005246.87 frames. ], batch size: 232, lr: 1.19e-02, grad_scale: 16.0 2023-09-29 06:56:36,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:36,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=284640.0, ans=0.2 2023-09-29 06:56:39,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:56:39,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:56:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 06:56:44,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:56:44,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:47,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 06:56:48,403 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.19 vs. limit=15.0 2023-09-29 06:56:49,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 06:56:50,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:52,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:56:52,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=284706.6666666667, ans=0.125 2023-09-29 06:56:56,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:56:57,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:56:57,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:56:59,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=284706.6666666667, ans=0.0 2023-09-29 06:57:06,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=284773.3333333333, ans=0.2 2023-09-29 06:57:14,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 06:57:14,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:57:16,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 06:57:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:57:18,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=284773.3333333333, ans=0.125 2023-09-29 06:57:19,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 06:57:19,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:57:20,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:22,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:57:23,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:23,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:25,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 06:57:25,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 06:57:25,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:25,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=284840.0, ans=0.09899494936611666 2023-09-29 06:57:29,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:57:36,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:57:42,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=284906.6666666667, ans=0.0 2023-09-29 06:57:44,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 06:57:52,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:57:55,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 06:57:55,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:57:56,912 INFO [train.py:1039] (3/4) Epoch 9, batch 250, loss[loss=0.2085, simple_loss=0.2835, pruned_loss=0.06671, over 24505.00 frames. ], tot_loss[loss=0.2177, simple_loss=0.2857, pruned_loss=0.07481, over 3388243.74 frames. ], batch size: 66, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:57:56,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 06:57:56,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:57:58,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 06:57:58,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 06:58:00,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:00,247 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 06:58:01,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:05,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 06:58:06,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:06,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:58:08,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:58:09,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 06:58:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:58:13,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=285040.0, ans=0.0 2023-09-29 06:58:14,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:58:20,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=285040.0, ans=0.2 2023-09-29 06:58:25,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:25,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=285040.0, ans=0.125 2023-09-29 06:58:28,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:58:28,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 06:58:34,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 06:58:36,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 06:58:36,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 06:58:37,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:39,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 06:58:39,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 06:58:39,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 06:58:41,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 06:58:43,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 06:58:43,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 06:58:45,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 06:58:46,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 06:58:46,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 06:58:46,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:58:48,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 06:58:48,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 06:58:50,313 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.05 vs. limit=15.0 2023-09-29 06:58:50,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:58:53,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 06:58:53,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:58:58,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 06:59:01,255 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 2.006e+02 2.267e+02 2.549e+02 3.617e+02, threshold=4.534e+02, percent-clipped=0.0 2023-09-29 06:59:04,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:07,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 06:59:10,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=285240.0, ans=0.2 2023-09-29 06:59:11,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:12,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:12,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 06:59:14,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=285240.0, ans=0.125 2023-09-29 06:59:16,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 06:59:18,718 INFO [train.py:1039] (3/4) Epoch 9, batch 300, loss[loss=0.2074, simple_loss=0.244, pruned_loss=0.08543, over 19630.00 frames. ], tot_loss[loss=0.2157, simple_loss=0.2829, pruned_loss=0.07429, over 3667361.93 frames. ], batch size: 388, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 06:59:18,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 06:59:18,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 06:59:20,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 06:59:22,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 06:59:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 06:59:22,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 06:59:27,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 06:59:28,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=285306.6666666667, ans=0.04949747468305833 2023-09-29 06:59:29,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 06:59:29,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=285306.6666666667, ans=0.125 2023-09-29 06:59:34,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 06:59:34,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 06:59:36,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 06:59:37,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 06:59:37,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 06:59:37,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:41,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 06:59:46,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 06:59:46,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 06:59:50,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 06:59:52,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:53,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 06:59:55,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 06:59:55,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 06:59:55,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 06:59:57,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:00:00,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:00:02,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:07,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:00:07,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 07:00:07,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:00:09,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:10,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:12,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 07:00:12,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=285506.6666666667, ans=0.125 2023-09-29 07:00:14,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:19,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:00:22,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:00:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 07:00:25,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:25,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:00:27,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:28,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:00:29,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.58 vs. limit=6.0 2023-09-29 07:00:30,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 07:00:30,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:00:31,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=285573.3333333333, ans=0.025 2023-09-29 07:00:32,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:00:32,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 07:00:35,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:00:35,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:35,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:37,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:37,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:41,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=285640.0, ans=0.0 2023-09-29 07:00:41,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=285640.0, ans=0.0 2023-09-29 07:00:42,592 INFO [train.py:1039] (3/4) Epoch 9, batch 350, loss[loss=0.1935, simple_loss=0.2629, pruned_loss=0.06203, over 21043.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2801, pruned_loss=0.07273, over 3886138.72 frames. ], batch size: 46, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:00:43,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=285640.0, ans=0.1 2023-09-29 07:00:44,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:00:44,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:00:44,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=285640.0, ans=0.1 2023-09-29 07:00:47,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:49,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys.whitening_limit, batch_count=285640.0, ans=6.0 2023-09-29 07:00:49,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.17 vs. limit=15.0 2023-09-29 07:00:53,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:00:55,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:00:56,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:00:58,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 07:01:00,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:00,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 07:01:02,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 07:01:04,521 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.23 vs. limit=10.0 2023-09-29 07:01:05,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:08,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 07:01:09,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:01:12,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:01:14,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:01:15,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:15,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:16,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:16,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:16,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:01:19,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:01:19,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:27,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:01:27,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:01:27,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:01:28,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.91 vs. limit=12.0 2023-09-29 07:01:29,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:32,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=9.86 vs. limit=22.5 2023-09-29 07:01:35,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 07:01:35,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:01:40,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:01:40,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:01:40,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:01:42,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 07:01:45,354 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.981e+02 2.324e+02 2.780e+02 5.402e+02, threshold=4.648e+02, percent-clipped=1.0 2023-09-29 07:01:45,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:47,035 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 07:01:47,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 07:01:47,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:01:49,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=285906.6666666667, ans=0.09899494936611666 2023-09-29 07:01:51,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:01:51,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 07:01:54,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:01:58,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:01:59,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:01,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:03,930 INFO [train.py:1039] (3/4) Epoch 9, batch 400, loss[loss=0.211, simple_loss=0.2909, pruned_loss=0.06555, over 24069.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2796, pruned_loss=0.07291, over 4056418.43 frames. ], batch size: 80, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:02:04,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:02:07,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:02:08,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:02:09,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=285973.3333333333, ans=0.125 2023-09-29 07:02:11,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 07:02:11,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:11,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:13,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:02:13,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:16,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:18,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:20,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 07:02:21,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 07:02:21,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 07:02:25,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:02:28,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:28,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 07:02:28,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:02:28,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:02:28,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:02:29,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:02:31,539 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 07:02:31,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 07:02:36,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:02:38,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:02:38,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 07:02:39,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 07:02:40,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=286106.6666666667, ans=0.1 2023-09-29 07:02:43,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=286106.6666666667, ans=0.125 2023-09-29 07:02:44,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:02:46,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:02:52,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 07:02:58,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:02:58,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=286173.3333333333, ans=0.125 2023-09-29 07:02:59,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 07:03:01,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:03:02,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:03:02,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 07:03:06,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:03:09,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:03:11,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:03:14,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:16,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 07:03:19,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:03:19,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 07:03:20,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:03:20,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:03:22,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 07:03:24,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=286306.6666666667, ans=0.125 2023-09-29 07:03:25,684 INFO [train.py:1039] (3/4) Epoch 9, batch 450, loss[loss=0.2334, simple_loss=0.2861, pruned_loss=0.09032, over 23892.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2804, pruned_loss=0.07215, over 4225791.23 frames. ], batch size: 212, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:03:25,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:03:25,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:03:26,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:03:29,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 07:03:29,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:03:31,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:03:32,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:03:32,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 07:03:32,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:03:34,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:03:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:03:45,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:46,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:03:47,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 07:03:48,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=286373.3333333333, ans=0.125 2023-09-29 07:03:49,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 07:03:50,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.79 vs. limit=15.0 2023-09-29 07:03:51,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:03:54,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:03:56,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:03:59,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:03:59,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:04:02,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 07:04:03,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 07:04:04,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=286440.0, ans=0.1 2023-09-29 07:04:05,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 07:04:05,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:06,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=286440.0, ans=0.0 2023-09-29 07:04:07,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:07,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:04:10,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 07:04:10,860 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 07:04:12,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:04:12,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=286440.0, ans=0.125 2023-09-29 07:04:12,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=286440.0, ans=0.125 2023-09-29 07:04:13,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:04:15,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:04:20,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:04:20,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:04:21,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:04:23,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 07:04:26,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:29,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:04:29,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:04:31,306 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.979e+02 2.168e+02 2.458e+02 3.361e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 07:04:31,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 07:04:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:04:36,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 07:04:37,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=286573.3333333333, ans=0.125 2023-09-29 07:04:38,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 07:04:39,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:04:43,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:04:46,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:04:48,024 INFO [train.py:1039] (3/4) Epoch 9, batch 500, loss[loss=0.2154, simple_loss=0.2784, pruned_loss=0.07623, over 23669.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2818, pruned_loss=0.07284, over 4329247.03 frames. ], batch size: 232, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:04:48,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:04:48,210 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 07:04:51,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:04:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:04:52,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:53,012 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 07:04:55,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 07:04:55,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:04:59,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:05:03,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:05:04,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:05:07,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:05:07,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:05:07,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:15,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.97 vs. limit=22.5 2023-09-29 07:05:17,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:19,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:05:19,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:05:20,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:20,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 07:05:20,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:05:20,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=286773.3333333333, ans=0.0 2023-09-29 07:05:23,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=286773.3333333333, ans=0.125 2023-09-29 07:05:24,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:05:26,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:05:26,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:05:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:05:28,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 07:05:31,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 07:05:34,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:34,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:36,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:37,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:05:40,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 07:05:42,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.52 vs. limit=5.0 2023-09-29 07:05:42,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:05:44,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:45,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=286840.0, ans=0.2 2023-09-29 07:05:46,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:05:49,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:05:54,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:05:58,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 07:05:58,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:05:58,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:06:01,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 07:06:01,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:06:04,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 07:06:10,913 INFO [train.py:1039] (3/4) Epoch 9, batch 550, loss[loss=0.2174, simple_loss=0.2961, pruned_loss=0.06937, over 24591.00 frames. ], tot_loss[loss=0.2172, simple_loss=0.2845, pruned_loss=0.07496, over 4401931.19 frames. ], batch size: 71, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:06:12,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 07:06:12,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:13,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 07:06:15,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:06:15,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:06:17,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:17,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:06:18,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:06:20,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:06:21,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 07:06:21,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:06:25,685 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-09-29 07:06:27,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:27,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:29,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=287040.0, ans=0.2 2023-09-29 07:06:31,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:06:32,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:36,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 07:06:38,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 07:06:39,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:06:43,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:06:44,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.40 vs. limit=22.5 2023-09-29 07:06:44,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:45,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=287106.6666666667, ans=0.125 2023-09-29 07:06:46,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:06:48,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:06:48,727 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 07:06:50,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=287106.6666666667, ans=0.125 2023-09-29 07:06:51,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:06:53,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:06:57,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:06:57,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:06:57,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:06:58,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:01,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 07:07:02,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 07:07:03,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:03,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:07:04,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:07:04,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:07:07,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:07:10,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:07:11,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:07:11,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:13,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 07:07:14,835 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.039e+02 2.212e+02 2.496e+02 3.392e+02, threshold=4.424e+02, percent-clipped=0.0 2023-09-29 07:07:14,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:07:17,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:17,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:07:17,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=287240.0, ans=0.0 2023-09-29 07:07:18,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:20,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:07:20,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:07:28,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 07:07:31,077 INFO [train.py:1039] (3/4) Epoch 9, batch 600, loss[loss=0.2209, simple_loss=0.2814, pruned_loss=0.08019, over 23745.00 frames. ], tot_loss[loss=0.2166, simple_loss=0.2836, pruned_loss=0.07478, over 4455538.19 frames. ], batch size: 179, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:07:31,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 07:07:34,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:07:34,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:07:34,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:07:40,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=287306.6666666667, ans=0.125 2023-09-29 07:07:41,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:07:42,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:07:45,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 07:07:46,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=287373.3333333333, ans=0.0 2023-09-29 07:07:47,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:07:50,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:07:52,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:07:54,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 07:07:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:08:01,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 07:08:05,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:08:05,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:07,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:08:11,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:08:11,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:08:13,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:22,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:08:25,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:08:25,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:08:25,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:08:33,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 07:08:38,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:08:38,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:08:42,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 07:08:44,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:08:46,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 07:08:46,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:08:46,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:08:52,988 INFO [train.py:1039] (3/4) Epoch 9, batch 650, loss[loss=0.2083, simple_loss=0.2853, pruned_loss=0.06563, over 24619.00 frames. ], tot_loss[loss=0.2157, simple_loss=0.2827, pruned_loss=0.07433, over 4520311.47 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:08:53,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:08:55,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:08:57,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=287640.0, ans=0.125 2023-09-29 07:08:58,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:08:58,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:09:00,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:04,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 07:09:05,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:09:07,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=287640.0, ans=0.1 2023-09-29 07:09:11,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:09:11,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:13,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=287706.6666666667, ans=0.1 2023-09-29 07:09:14,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:19,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 07:09:20,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:21,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:24,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:09:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:09:29,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:29,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:31,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:09:32,007 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.90 vs. limit=15.0 2023-09-29 07:09:32,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:34,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:09:35,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:09:36,007 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 07:09:36,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:09:36,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:39,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:42,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:42,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:09:42,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:09:43,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=287840.0, ans=0.0 2023-09-29 07:09:44,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 07:09:45,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:09:45,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:09:47,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:09:47,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:09:47,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:09:49,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 07:09:50,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 07:09:50,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:09:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:09:50,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:09:50,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:09:53,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:09:58,842 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.700e+02 2.023e+02 2.249e+02 2.557e+02 3.525e+02, threshold=4.498e+02, percent-clipped=0.0 2023-09-29 07:10:01,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:01,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:03,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:10:04,033 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.97 vs. limit=15.0 2023-09-29 07:10:06,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:07,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:10:07,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:10:15,899 INFO [train.py:1039] (3/4) Epoch 9, batch 700, loss[loss=0.2147, simple_loss=0.275, pruned_loss=0.07719, over 23299.00 frames. ], tot_loss[loss=0.2138, simple_loss=0.2811, pruned_loss=0.07327, over 4574813.11 frames. ], batch size: 119, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:10:15,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:10:15,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:16,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:16,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:10:20,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 07:10:22,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 07:10:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 07:10:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:28,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:10:30,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 07:10:33,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:10:37,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:10:39,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:39,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:10:41,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:10:41,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=288040.0, ans=0.0 2023-09-29 07:10:44,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:10:46,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=288040.0, ans=0.2 2023-09-29 07:10:47,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:10:47,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:10:49,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 07:10:52,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 07:10:55,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:10:57,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:10:58,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:11:03,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:11:05,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 07:11:10,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:10,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:11:10,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 07:11:15,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:11:15,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:19,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:11:25,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:11:25,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 07:11:29,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 07:11:29,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 07:11:31,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:31,788 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:11:33,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:34,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:11:36,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:36,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 07:11:38,254 INFO [train.py:1039] (3/4) Epoch 9, batch 750, loss[loss=0.225, simple_loss=0.284, pruned_loss=0.083, over 23724.00 frames. ], tot_loss[loss=0.2134, simple_loss=0.2805, pruned_loss=0.07314, over 4603619.13 frames. ], batch size: 212, lr: 1.18e-02, grad_scale: 16.0 2023-09-29 07:11:41,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 07:11:41,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 07:11:41,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 07:11:43,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 07:11:43,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 07:11:45,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:11:46,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 07:11:47,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=288306.6666666667, ans=0.125 2023-09-29 07:11:48,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:11:48,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:11:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:11:52,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:11:52,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:11:52,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:11:55,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:11:56,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=288373.3333333333, ans=0.125 2023-09-29 07:11:57,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:11:58,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:12:00,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:02,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:02,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 07:12:03,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:12:03,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:05,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:12:09,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:12:09,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 07:12:09,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:12:12,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 07:12:12,471 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 07:12:13,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.37 vs. limit=15.0 2023-09-29 07:12:14,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 07:12:14,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:12:14,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:12:16,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:12:17,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=288440.0, ans=0.0 2023-09-29 07:12:24,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:12:24,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:24,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:12:27,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:12:29,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:12:29,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 07:12:29,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:12:30,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 07:12:31,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=288506.6666666667, ans=0.1 2023-09-29 07:12:32,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:12:32,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=288506.6666666667, ans=0.1 2023-09-29 07:12:36,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:12:36,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 07:12:38,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:12:43,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=288573.3333333333, ans=0.0 2023-09-29 07:12:44,787 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.964e+02 2.224e+02 2.470e+02 4.454e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 07:12:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:12:45,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:12:46,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:12:48,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:12:52,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 07:12:53,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:12:53,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:12:58,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:01,772 INFO [train.py:1039] (3/4) Epoch 9, batch 800, loss[loss=0.1882, simple_loss=0.2568, pruned_loss=0.05978, over 15965.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2807, pruned_loss=0.07335, over 4627063.19 frames. ], batch size: 34, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:13:01,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:01,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:13:07,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=288640.0, ans=0.125 2023-09-29 07:13:09,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=288640.0, ans=0.09899494936611666 2023-09-29 07:13:11,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:11,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:12,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:13:12,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:14,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:14,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:15,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:19,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:21,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:13:24,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 07:13:24,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:25,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:13:25,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:13:25,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:25,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 07:13:27,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 07:13:31,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:35,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:13:35,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=288773.3333333333, ans=0.2 2023-09-29 07:13:38,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:13:38,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:13:39,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:39,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:13:44,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:13:44,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:13:45,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 07:13:47,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 07:13:47,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 07:13:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:13:47,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:13:51,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:13:51,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:13:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 07:13:56,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 07:13:57,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:13:59,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:14:01,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:14:07,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:07,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 07:14:08,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:14:13,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 07:14:18,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.14 vs. limit=10.0 2023-09-29 07:14:21,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:21,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=288906.6666666667, ans=0.1 2023-09-29 07:14:22,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:14:24,348 INFO [train.py:1039] (3/4) Epoch 9, batch 850, loss[loss=0.1933, simple_loss=0.2654, pruned_loss=0.06054, over 24590.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.281, pruned_loss=0.07322, over 4654797.50 frames. ], batch size: 60, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:14:24,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 07:14:24,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:14:24,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 07:14:27,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:29,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:14:32,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:34,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:14:35,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:14:37,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 07:14:37,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 07:14:37,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 07:14:39,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:14:39,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:14:41,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:14:43,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:14:43,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:14:48,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:48,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:14:48,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 07:14:51,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 07:14:55,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:14:56,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 07:14:59,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 07:15:00,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 07:15:03,253 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 07:15:04,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:04,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:15:04,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:15:06,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:07,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:09,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 07:15:10,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:15:12,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:12,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:15:13,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:15:14,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:15:16,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 07:15:18,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 07:15:19,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=289173.3333333333, ans=0.125 2023-09-29 07:15:22,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:15:22,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:24,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:15:24,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:25,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:15:28,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:15:30,143 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 2.042e+02 2.331e+02 2.770e+02 4.715e+02, threshold=4.662e+02, percent-clipped=1.0 2023-09-29 07:15:30,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:15:31,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:15:31,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:33,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:15:40,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:15:41,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:15:42,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 07:15:42,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:15:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:15:45,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 07:15:47,734 INFO [train.py:1039] (3/4) Epoch 9, batch 900, loss[loss=0.2047, simple_loss=0.2848, pruned_loss=0.06233, over 24641.00 frames. ], tot_loss[loss=0.2144, simple_loss=0.282, pruned_loss=0.07342, over 4679093.29 frames. ], batch size: 68, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:15:53,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:15:57,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:15:57,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 07:15:58,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=289306.6666666667, ans=0.0 2023-09-29 07:16:00,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:16:02,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 07:16:03,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 07:16:04,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:16:04,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:04,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:16:04,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.65 vs. limit=15.0 2023-09-29 07:16:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:16:17,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:16:17,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:16:18,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=289440.0, ans=0.1 2023-09-29 07:16:20,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:25,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 07:16:27,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:16:32,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:16:32,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:16:32,499 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 07:16:33,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 07:16:41,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:16:41,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:16:41,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:16:49,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:49,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:16:51,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 07:16:51,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:16:55,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 07:16:57,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:16:57,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:16:59,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:00,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:02,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 07:17:02,806 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 07:17:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:17:05,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 07:17:07,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:10,225 INFO [train.py:1039] (3/4) Epoch 9, batch 950, loss[loss=0.2071, simple_loss=0.2892, pruned_loss=0.06254, over 24474.00 frames. ], tot_loss[loss=0.2134, simple_loss=0.2814, pruned_loss=0.07266, over 4701142.57 frames. ], batch size: 66, lr: 1.18e-02, grad_scale: 32.0 2023-09-29 07:17:11,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 07:17:16,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:20,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:21,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:17:23,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=289640.0, ans=0.0 2023-09-29 07:17:24,770 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 07:17:28,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:17:29,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:31,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:17:32,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 07:17:33,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:17:35,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:37,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 07:17:37,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:17:41,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:17:41,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:17:43,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 07:17:43,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 07:17:45,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:17:46,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:17:52,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:17:52,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:17:56,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 07:17:58,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:17:58,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:18:00,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:00,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:00,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:18:00,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=289840.0, ans=0.04949747468305833 2023-09-29 07:18:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 07:18:07,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:18:10,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:12,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 07:18:12,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:12,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:18:13,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 07:18:16,760 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 1.953e+02 2.303e+02 2.846e+02 4.844e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 07:18:16,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:18:17,253 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:18:20,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:18:24,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:26,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 07:18:26,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 07:18:30,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=289906.6666666667, ans=0.0 2023-09-29 07:18:31,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:18:31,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=289973.3333333333, ans=0.125 2023-09-29 07:18:32,671 INFO [train.py:1039] (3/4) Epoch 9, batch 1000, loss[loss=0.2241, simple_loss=0.301, pruned_loss=0.07362, over 24000.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.281, pruned_loss=0.07237, over 4705535.04 frames. ], batch size: 80, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:18:33,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.65 vs. limit=15.0 2023-09-29 07:18:34,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=289973.3333333333, ans=0.1 2023-09-29 07:18:36,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 07:18:37,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:18:39,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=289973.3333333333, ans=0.125 2023-09-29 07:18:43,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:18:45,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 07:18:45,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 07:18:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:18:49,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:18:51,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:18:54,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 07:18:57,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 07:18:57,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 07:18:59,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:00,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 07:19:02,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 07:19:02,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 07:19:03,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:04,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:15,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:16,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:19:16,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:18,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:18,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 07:19:18,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:20,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:19:20,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:19:21,847 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 07:19:22,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=290173.3333333333, ans=0.0 2023-09-29 07:19:24,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 07:19:25,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 07:19:25,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=290173.3333333333, ans=0.125 2023-09-29 07:19:28,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 07:19:29,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:19:33,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=290173.3333333333, ans=0.1 2023-09-29 07:19:34,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:34,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:19:34,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:19:38,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:19:39,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 07:19:43,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:19:43,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 07:19:43,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 07:19:46,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:19:46,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:19:48,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:19:51,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:19:53,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:19:55,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=290306.6666666667, ans=0.125 2023-09-29 07:19:56,377 INFO [train.py:1039] (3/4) Epoch 9, batch 1050, loss[loss=0.2129, simple_loss=0.2954, pruned_loss=0.06516, over 24674.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2793, pruned_loss=0.0722, over 4693452.54 frames. ], batch size: 73, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:19:56,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:19:58,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:19:58,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=290306.6666666667, ans=0.125 2023-09-29 07:20:01,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:20:01,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:02,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:03,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=290306.6666666667, ans=0.0 2023-09-29 07:20:05,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:20:05,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:20:06,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.14 vs. limit=6.0 2023-09-29 07:20:09,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:20:11,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:20:11,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:20:12,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:20:13,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 07:20:14,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:14,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 07:20:17,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:20:17,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 07:20:17,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:20:24,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:20:24,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:20:26,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:20:27,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 07:20:27,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 07:20:29,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:20:29,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=290440.0, ans=0.5 2023-09-29 07:20:32,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 07:20:36,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 07:20:37,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:20:41,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:20:43,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:20:43,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:20:43,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:20:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:20:53,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 07:20:54,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 07:20:56,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 07:20:56,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:20:56,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:20:56,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=290506.6666666667, ans=0.1 2023-09-29 07:20:57,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 07:21:02,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:21:03,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=290573.3333333333, ans=0.125 2023-09-29 07:21:04,054 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.054e+02 2.289e+02 2.734e+02 4.286e+02, threshold=4.577e+02, percent-clipped=0.0 2023-09-29 07:21:04,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:21:04,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:05,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:05,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:21:10,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 07:21:12,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:21:12,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 07:21:12,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 07:21:13,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:21:16,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:21:18,244 INFO [train.py:1039] (3/4) Epoch 9, batch 1100, loss[loss=0.1808, simple_loss=0.2535, pruned_loss=0.05404, over 24594.00 frames. ], tot_loss[loss=0.2117, simple_loss=0.279, pruned_loss=0.07221, over 4705573.08 frames. ], batch size: 60, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:21:23,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:21:29,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:21:29,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=290640.0, ans=0.125 2023-09-29 07:21:31,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.26 vs. limit=10.0 2023-09-29 07:21:32,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:21:32,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:33,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 07:21:33,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:21:36,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:21:40,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:21:43,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:21:43,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 07:21:45,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:21:45,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:21:45,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:21:45,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=290706.6666666667, ans=0.07 2023-09-29 07:21:48,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:21:48,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=290706.6666666667, ans=0.2 2023-09-29 07:21:50,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:21:54,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:21:55,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=290773.3333333333, ans=0.125 2023-09-29 07:21:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 07:22:00,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 07:22:00,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:03,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:05,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:22:06,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:22:07,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 07:22:08,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:22:08,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:22:08,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:22:10,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:10,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 07:22:17,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:22:17,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 07:22:19,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:22:20,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=290840.0, ans=0.125 2023-09-29 07:22:24,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:22:27,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 07:22:27,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 07:22:29,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:22:32,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:32,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:34,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 07:22:35,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:22:37,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:22:37,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 07:22:39,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:22:39,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 07:22:41,387 INFO [train.py:1039] (3/4) Epoch 9, batch 1150, loss[loss=0.2406, simple_loss=0.3031, pruned_loss=0.08911, over 23226.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2801, pruned_loss=0.07227, over 4722896.76 frames. ], batch size: 93, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:22:41,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:22:41,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:22:41,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:22:48,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:49,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:22:52,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:22:52,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:22:52,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 07:22:53,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:22:56,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 07:22:56,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:22:57,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:23:03,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 07:23:05,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:10,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:23:10,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:10,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 07:23:10,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:23:10,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:23:17,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 07:23:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:23:18,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=291106.6666666667, ans=0.125 2023-09-29 07:23:20,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:23:30,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:32,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=291173.3333333333, ans=0.0 2023-09-29 07:23:36,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:23:38,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 07:23:38,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:38,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:45,440 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 07:23:45,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=291240.0, ans=0.125 2023-09-29 07:23:47,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:23:48,993 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.654e+02 2.090e+02 2.381e+02 2.869e+02 4.983e+02, threshold=4.763e+02, percent-clipped=2.0 2023-09-29 07:23:54,724 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 07:23:57,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:23:59,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:23:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:24:00,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:24:02,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:03,994 INFO [train.py:1039] (3/4) Epoch 9, batch 1200, loss[loss=0.2016, simple_loss=0.2765, pruned_loss=0.06334, over 24656.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2818, pruned_loss=0.07315, over 4725851.72 frames. ], batch size: 65, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:24:04,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=291306.6666666667, ans=0.125 2023-09-29 07:24:07,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:24:07,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:24:10,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:10,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:10,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:24:11,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:24:15,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:24:15,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:24:16,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:20,114 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 07:24:24,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 07:24:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:24:29,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:24:31,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:24:32,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:24:32,877 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 07:24:34,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:39,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=291440.0, ans=0.0 2023-09-29 07:24:43,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:24:43,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:24:43,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 07:24:44,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:24:50,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 07:24:53,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 07:24:54,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:24:54,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:24:58,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:24:58,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:25:00,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:25:00,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:25:02,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:25:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 07:25:02,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:25:02,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:02,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:25:04,057 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:25:05,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:05,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:25:07,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.25 vs. limit=15.0 2023-09-29 07:25:10,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:25:13,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:25:17,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 07:25:18,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.29 vs. limit=12.0 2023-09-29 07:25:20,912 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 07:25:22,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:26,016 INFO [train.py:1039] (3/4) Epoch 9, batch 1250, loss[loss=0.2192, simple_loss=0.2973, pruned_loss=0.07051, over 24430.00 frames. ], tot_loss[loss=0.2146, simple_loss=0.2822, pruned_loss=0.07351, over 4711892.98 frames. ], batch size: 77, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:25:26,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:25:27,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:25:29,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:25:29,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=291640.0, ans=0.125 2023-09-29 07:25:32,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 07:25:37,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:25:38,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:39,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 07:25:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:25:42,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:25:47,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 07:25:47,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:25:49,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:25:49,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:51,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:25:56,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:25:56,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:25:56,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:25:57,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:25:57,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:03,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:03,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:26:09,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.44 vs. limit=15.0 2023-09-29 07:26:10,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 07:26:10,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:26:13,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:14,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 07:26:15,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:26:15,558 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 07:26:15,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:15,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:20,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:20,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:26:21,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:26:22,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=291840.0, ans=0.1 2023-09-29 07:26:24,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 07:26:24,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 07:26:24,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 07:26:25,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=291840.0, ans=0.125 2023-09-29 07:26:27,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:28,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=291840.0, ans=0.0 2023-09-29 07:26:29,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 07:26:29,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:26:31,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:26:33,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.977e+02 2.179e+02 2.388e+02 3.416e+02, threshold=4.359e+02, percent-clipped=0.0 2023-09-29 07:26:33,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 07:26:33,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:26:34,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:26:34,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:26:34,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:26:36,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 07:26:39,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:42,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:26:42,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=291906.6666666667, ans=0.125 2023-09-29 07:26:44,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:26:45,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:26:48,705 INFO [train.py:1039] (3/4) Epoch 9, batch 1300, loss[loss=0.1774, simple_loss=0.2541, pruned_loss=0.05034, over 24445.00 frames. ], tot_loss[loss=0.2154, simple_loss=0.2829, pruned_loss=0.07394, over 4712726.88 frames. ], batch size: 58, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:26:48,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:26:48,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 07:26:53,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:26:55,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 07:26:56,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:26:58,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:26:59,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:26:59,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 07:27:05,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:27:06,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:27:06,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=292040.0, ans=0.125 2023-09-29 07:27:08,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 07:27:13,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:27:16,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:16,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:19,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:27:22,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:27:23,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:27:23,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 07:27:23,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 07:27:31,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:27:31,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:27:32,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 07:27:34,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 07:27:35,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:27:37,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:27:38,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 07:27:41,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:41,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 07:27:41,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:27:44,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:27:44,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:27:48,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 07:27:49,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 07:27:51,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 07:27:56,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:27:56,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292240.0, ans=0.1 2023-09-29 07:27:59,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 07:28:01,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:10,010 INFO [train.py:1039] (3/4) Epoch 9, batch 1350, loss[loss=0.1926, simple_loss=0.2691, pruned_loss=0.05807, over 24271.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2816, pruned_loss=0.07322, over 4714157.27 frames. ], batch size: 56, lr: 1.17e-02, grad_scale: 32.0 2023-09-29 07:28:10,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 07:28:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:16,434 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.61 vs. limit=15.0 2023-09-29 07:28:16,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:21,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:28:21,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:28:25,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:28:25,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:27,025 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:28:29,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:28:31,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 07:28:32,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.74 vs. limit=15.0 2023-09-29 07:28:33,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:28:33,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:28:33,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=292373.3333333333, ans=0.125 2023-09-29 07:28:36,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 07:28:37,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:28:39,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:28:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 07:28:40,486 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.71 vs. limit=15.0 2023-09-29 07:28:41,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 07:28:42,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 07:28:44,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:28:44,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 07:28:46,009 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.09 vs. limit=15.0 2023-09-29 07:28:46,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=292440.0, ans=0.2 2023-09-29 07:28:54,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=292440.0, ans=0.125 2023-09-29 07:28:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:29:04,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:06,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 07:29:10,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:10,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 07:29:11,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:29:13,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:29:14,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:29:17,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=292573.3333333333, ans=0.125 2023-09-29 07:29:18,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 07:29:20,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:29:20,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=292573.3333333333, ans=0.125 2023-09-29 07:29:21,320 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.985e+02 2.227e+02 2.562e+02 4.004e+02, threshold=4.454e+02, percent-clipped=0.0 2023-09-29 07:29:25,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 07:29:27,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 07:29:33,075 INFO [train.py:1039] (3/4) Epoch 9, batch 1400, loss[loss=0.1961, simple_loss=0.2661, pruned_loss=0.06304, over 24320.00 frames. ], tot_loss[loss=0.2133, simple_loss=0.2805, pruned_loss=0.07304, over 4707025.51 frames. ], batch size: 61, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:29:34,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 07:29:36,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:29:39,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:29:39,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:29:45,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292640.0, ans=0.1 2023-09-29 07:29:46,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 07:29:46,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 07:29:48,504 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.85 vs. limit=15.0 2023-09-29 07:29:51,037 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.89 vs. limit=22.5 2023-09-29 07:29:54,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.28 vs. limit=10.0 2023-09-29 07:29:56,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:29:58,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:00,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:30:01,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:30:04,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:30:07,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 07:30:07,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292773.3333333333, ans=0.1 2023-09-29 07:30:09,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=292773.3333333333, ans=0.125 2023-09-29 07:30:16,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:18,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:21,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 07:30:21,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=292840.0, ans=0.125 2023-09-29 07:30:22,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:30:23,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:30:23,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=292840.0, ans=0.125 2023-09-29 07:30:24,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:30:24,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:30:26,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:30:26,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:30:26,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:30:28,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 07:30:28,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:30:32,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:35,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:30:42,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 07:30:43,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 07:30:43,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:30:43,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=292906.6666666667, ans=0.1 2023-09-29 07:30:46,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 07:30:48,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:30:49,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.00 vs. limit=10.0 2023-09-29 07:30:50,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:30:51,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=292906.6666666667, ans=0.0 2023-09-29 07:30:53,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:30:56,704 INFO [train.py:1039] (3/4) Epoch 9, batch 1450, loss[loss=0.2323, simple_loss=0.2945, pruned_loss=0.08511, over 23800.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2798, pruned_loss=0.07293, over 4702156.48 frames. ], batch size: 179, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:30:57,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-09-29 07:30:58,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:30:58,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:30:58,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 07:31:03,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:05,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:31:08,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:31:08,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 07:31:09,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:31:09,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 07:31:11,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:11,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:11,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 07:31:13,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:14,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:31:16,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 07:31:16,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:16,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:31:19,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:22,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:25,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:31:25,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:31:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:31:29,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:31:32,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:31:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:31:32,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:36,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 07:31:39,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:31:39,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=293106.6666666667, ans=0.125 2023-09-29 07:31:44,367 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 07:31:45,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:31:48,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:31:48,702 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.93 vs. limit=12.0 2023-09-29 07:31:49,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:31:49,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 07:31:54,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:31:55,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 07:31:57,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 07:31:57,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:31:58,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.37 vs. limit=15.0 2023-09-29 07:32:00,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:32:02,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 07:32:05,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 07:32:05,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 07:32:07,323 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.945e+02 2.236e+02 2.452e+02 4.458e+02, threshold=4.473e+02, percent-clipped=1.0 2023-09-29 07:32:07,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:09,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:32:11,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=293240.0, ans=0.125 2023-09-29 07:32:19,972 INFO [train.py:1039] (3/4) Epoch 9, batch 1500, loss[loss=0.2156, simple_loss=0.2716, pruned_loss=0.07984, over 23645.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.2791, pruned_loss=0.07166, over 4718741.13 frames. ], batch size: 135, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:32:23,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 07:32:23,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:32:23,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:32:24,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:24,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:30,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:32:30,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 07:32:32,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:32:33,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:32:33,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:32:33,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:32:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:32:36,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=293306.6666666667, ans=0.0 2023-09-29 07:32:38,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:43,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:32:43,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=293373.3333333333, ans=0.2 2023-09-29 07:32:44,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 07:32:44,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:32:46,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:32:46,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:32:49,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 07:32:52,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=293373.3333333333, ans=0.2 2023-09-29 07:32:54,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 07:32:57,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:32:59,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 07:33:03,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:33:06,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:06,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:33:07,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:07,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 07:33:07,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:33:09,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:09,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 07:33:11,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:33:14,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:33:14,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 07:33:15,109 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.37 vs. limit=10.0 2023-09-29 07:33:19,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:33:23,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:33:27,648 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 07:33:27,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:27,756 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 07:33:28,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=293573.3333333333, ans=0.0 2023-09-29 07:33:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:31,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:33:31,460 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 07:33:31,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:33:32,188 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=15.0 2023-09-29 07:33:35,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 07:33:37,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:42,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:33:42,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:33:44,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:33:45,483 INFO [train.py:1039] (3/4) Epoch 9, batch 1550, loss[loss=0.1733, simple_loss=0.2389, pruned_loss=0.05389, over 24303.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2798, pruned_loss=0.07223, over 4719334.15 frames. ], batch size: 56, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:33:47,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 07:33:47,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 07:33:48,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:33:48,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 07:33:48,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 07:33:49,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=293640.0, ans=0.5 2023-09-29 07:33:50,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:33:52,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:52,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:33:52,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:33:54,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:54,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:33:57,956 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 07:33:59,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:33:59,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:34:00,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:34:02,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:34:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 07:34:03,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=293706.6666666667, ans=0.125 2023-09-29 07:34:04,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:34:04,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 07:34:06,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 07:34:06,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 07:34:07,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:09,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:11,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=293706.6666666667, ans=10.0 2023-09-29 07:34:12,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:34:15,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 07:34:15,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 07:34:17,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=293773.3333333333, ans=0.125 2023-09-29 07:34:21,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:24,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=293773.3333333333, ans=0.125 2023-09-29 07:34:26,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:34:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:34:28,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:34:28,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 07:34:28,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=293773.3333333333, ans=0.125 2023-09-29 07:34:33,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 07:34:35,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:38,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:34:40,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:34:41,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:34:41,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 07:34:41,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:34:43,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:34:43,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:34:45,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:34:45,479 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 07:34:46,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.03 vs. limit=15.0 2023-09-29 07:34:48,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:34:53,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=293906.6666666667, ans=0.0 2023-09-29 07:34:55,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 07:34:56,626 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.971e+02 2.191e+02 2.523e+02 4.378e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 07:34:58,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:00,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:01,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 07:35:03,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:35:04,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:35:04,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:35:04,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:35:06,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:35:08,777 INFO [train.py:1039] (3/4) Epoch 9, batch 1600, loss[loss=0.2063, simple_loss=0.29, pruned_loss=0.0613, over 24658.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2802, pruned_loss=0.07195, over 4723392.30 frames. ], batch size: 68, lr: 1.17e-02, grad_scale: 16.0 2023-09-29 07:35:10,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:12,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 07:35:13,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 07:35:15,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 07:35:16,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:19,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 07:35:21,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:35:23,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:35:28,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:35:32,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 07:35:35,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:35:36,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 07:35:37,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:35:37,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 07:35:39,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=294106.6666666667, ans=0.0 2023-09-29 07:35:44,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 07:35:52,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:52,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 07:35:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:35:53,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:35:53,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:35:57,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 07:36:01,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 07:36:04,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:36:04,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:04,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:06,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:36:08,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:36:10,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:36:10,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=294173.3333333333, ans=0.0 2023-09-29 07:36:11,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:36:11,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=294240.0, ans=0.125 2023-09-29 07:36:17,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:18,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:36:19,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.61 vs. limit=15.0 2023-09-29 07:36:20,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 07:36:20,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:36:21,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=294240.0, ans=0.1 2023-09-29 07:36:21,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=294240.0, ans=0.125 2023-09-29 07:36:22,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 07:36:28,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:30,238 INFO [train.py:1039] (3/4) Epoch 9, batch 1650, loss[loss=0.1769, simple_loss=0.2425, pruned_loss=0.05564, over 24437.00 frames. ], tot_loss[loss=0.215, simple_loss=0.2827, pruned_loss=0.07365, over 4712747.69 frames. ], batch size: 58, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:36:31,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:36:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:36:31,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 07:36:31,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 07:36:32,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 07:36:33,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 07:36:35,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:36:36,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:38,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:36:38,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:36:39,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:36:41,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 07:36:44,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:36:44,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:36:44,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:36:44,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:36:44,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 07:36:46,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 07:36:46,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=294373.3333333333, ans=0.125 2023-09-29 07:36:52,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:36:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:37:07,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 07:37:09,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:11,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 07:37:14,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:15,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:37:17,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:37:17,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:18,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:37:18,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:21,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:23,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:24,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:24,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:26,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:26,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:37:31,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:37:31,679 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:37:32,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 07:37:34,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:37:34,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 07:37:35,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 07:37:35,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 07:37:35,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:37:37,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:37:37,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:38,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:37:38,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 07:37:42,540 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.030e+02 2.311e+02 2.647e+02 4.475e+02, threshold=4.622e+02, percent-clipped=1.0 2023-09-29 07:37:42,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:37:44,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:37:44,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:44,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=294573.3333333333, ans=0.0 2023-09-29 07:37:47,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 07:37:51,798 INFO [train.py:1039] (3/4) Epoch 9, batch 1700, loss[loss=0.1955, simple_loss=0.2622, pruned_loss=0.06438, over 18850.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.2818, pruned_loss=0.07335, over 4706202.12 frames. ], batch size: 41, lr: 1.17e-02, grad_scale: 8.0 2023-09-29 07:37:51,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:37:51,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:37:52,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 07:37:53,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:37:53,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:37:53,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:37:55,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:37:55,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:37:55,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 07:37:59,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:38:09,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:38:12,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:38:15,558 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.03 vs. limit=22.5 2023-09-29 07:38:17,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.00 vs. limit=22.5 2023-09-29 07:38:19,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:38:19,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:19,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:38:19,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:22,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 07:38:25,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:38:25,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:27,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:38:29,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:38:30,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 07:38:32,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 07:38:34,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:36,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 07:38:38,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:38:45,356 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-29 07:38:47,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:38:47,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:38:49,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:38:52,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:38:52,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 07:38:52,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:38:54,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:54,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 07:38:55,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:38:55,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:38:55,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:38:55,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:00,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:00,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:39:01,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:01,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:39:03,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:05,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:07,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 07:39:07,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=294906.6666666667, ans=0.125 2023-09-29 07:39:09,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:39:12,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:39:12,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 07:39:15,870 INFO [train.py:1039] (3/4) Epoch 9, batch 1750, loss[loss=0.2006, simple_loss=0.2338, pruned_loss=0.08372, over 19099.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2797, pruned_loss=0.07237, over 4698620.41 frames. ], batch size: 390, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:39:16,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=294973.3333333333, ans=0.07 2023-09-29 07:39:19,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:22,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:22,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:39:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 07:39:23,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:39:27,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:39:28,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:39:31,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 07:39:32,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=295040.0, ans=0.0 2023-09-29 07:39:34,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:39:36,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 07:39:36,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:39:38,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:39:41,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=295040.0, ans=0.125 2023-09-29 07:39:42,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:39:42,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 07:39:45,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:39:45,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 07:39:49,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=295106.6666666667, ans=0.125 2023-09-29 07:39:53,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:39:56,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:39:56,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:01,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:01,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:40:03,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:03,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=295173.3333333333, ans=0.0 2023-09-29 07:40:06,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:07,304 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.79 vs. limit=15.0 2023-09-29 07:40:08,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:09,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:40:09,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 07:40:13,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:40:15,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 07:40:17,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:18,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.09 vs. limit=15.0 2023-09-29 07:40:18,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:20,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:40:20,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=295240.0, ans=0.2 2023-09-29 07:40:23,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:40:23,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 07:40:25,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:25,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.09 vs. limit=22.5 2023-09-29 07:40:26,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:40:27,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.058e+02 2.373e+02 2.670e+02 4.900e+02, threshold=4.746e+02, percent-clipped=2.0 2023-09-29 07:40:28,391 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:40:31,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:40:34,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:40:35,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:40:35,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 07:40:35,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:37,726 INFO [train.py:1039] (3/4) Epoch 9, batch 1800, loss[loss=0.2037, simple_loss=0.2416, pruned_loss=0.08291, over 19340.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2784, pruned_loss=0.07224, over 4690228.14 frames. ], batch size: 389, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:40:37,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:40:37,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:37,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:40:37,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:40:39,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:40:42,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:40:43,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:40:46,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:40:47,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:40:50,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:40:52,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:40:56,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:40:59,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:40:59,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:00,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:41:02,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:41:02,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 07:41:02,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:05,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:07,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=295373.3333333333, ans=0.125 2023-09-29 07:41:10,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 07:41:12,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 07:41:12,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 07:41:12,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:16,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:41:16,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:41:17,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:41:24,697 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 07:41:26,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:41:28,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:28,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=295506.6666666667, ans=0.0 2023-09-29 07:41:30,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 07:41:30,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 07:41:30,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:41:31,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:41:31,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:41:32,487 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.82 vs. limit=22.5 2023-09-29 07:41:37,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 07:41:43,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.61 vs. limit=15.0 2023-09-29 07:41:44,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:41:44,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 07:41:46,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:41:46,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:47,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:41:47,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 07:41:49,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:41:49,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:41:53,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 07:41:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:41:56,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:41:56,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:41:56,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:41:58,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:42:00,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:42:02,016 INFO [train.py:1039] (3/4) Epoch 9, batch 1850, loss[loss=0.2108, simple_loss=0.2703, pruned_loss=0.07567, over 23486.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2792, pruned_loss=0.07245, over 4693203.79 frames. ], batch size: 134, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:42:02,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:42:03,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:03,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=295640.0, ans=0.0 2023-09-29 07:42:06,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:42:08,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:11,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=295640.0, ans=0.09899494936611666 2023-09-29 07:42:11,922 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:42:14,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:42:16,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 07:42:20,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 07:42:20,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=295706.6666666667, ans=0.125 2023-09-29 07:42:23,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 07:42:28,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:42:28,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 07:42:28,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 07:42:39,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:42:41,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 07:42:44,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:42:44,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:42:45,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=295773.3333333333, ans=0.125 2023-09-29 07:42:49,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 07:42:49,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:49,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:42:50,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:42:52,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:42:55,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:42:57,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:42:57,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:42:59,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:42:59,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:01,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:02,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:43:06,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 07:43:07,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:43:09,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=295906.6666666667, ans=0.125 2023-09-29 07:43:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:43:13,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:43:13,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 07:43:13,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 07:43:13,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=295906.6666666667, ans=0.0 2023-09-29 07:43:14,729 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.027e+02 2.265e+02 2.527e+02 4.357e+02, threshold=4.531e+02, percent-clipped=0.0 2023-09-29 07:43:15,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 07:43:16,574 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 07:43:18,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:43:19,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:43:19,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:19,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:19,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 07:43:19,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:43:20,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=295906.6666666667, ans=0.125 2023-09-29 07:43:21,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:21,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:43:22,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:43:24,108 INFO [train.py:1039] (3/4) Epoch 9, batch 1900, loss[loss=0.2318, simple_loss=0.3019, pruned_loss=0.08085, over 23696.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2801, pruned_loss=0.07222, over 4701884.91 frames. ], batch size: 85, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:43:24,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:43:24,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 07:43:25,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:43:25,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 07:43:25,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:43:27,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:29,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=295973.3333333333, ans=0.0 2023-09-29 07:43:32,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:43:35,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:43:37,511 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 07:43:37,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 07:43:39,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:43:41,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:43:41,486 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 07:43:41,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 07:43:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 07:43:46,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:43:50,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 07:43:53,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 07:44:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 07:44:08,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 07:44:08,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:09,722 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 07:44:09,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 07:44:09,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 07:44:11,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 07:44:11,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:44:15,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 07:44:20,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:44:21,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:21,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 07:44:23,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:44:26,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 07:44:28,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:34,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:44:34,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:44:34,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:44:34,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=296240.0, ans=0.5 2023-09-29 07:44:36,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:44:37,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 07:44:37,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:44:41,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:44:44,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:44,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:44:46,174 INFO [train.py:1039] (3/4) Epoch 9, batch 1950, loss[loss=0.2069, simple_loss=0.286, pruned_loss=0.06391, over 24582.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2805, pruned_loss=0.07212, over 4714103.26 frames. ], batch size: 71, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:44:47,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.21 vs. limit=22.5 2023-09-29 07:44:47,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:44:47,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:44:47,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 07:44:49,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:44:50,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:44:55,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:44:56,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:44:56,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:44:58,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 07:44:58,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:44:58,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:00,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:04,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:45:04,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:04,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:05,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=296373.3333333333, ans=0.125 2023-09-29 07:45:06,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:10,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:45:10,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:45:10,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 07:45:11,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:15,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:19,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:45:19,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:19,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 07:45:19,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 07:45:19,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:45:19,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:45:21,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:24,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:45:26,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:45:30,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:45:33,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:45:33,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:45:34,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 07:45:34,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:45:39,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:45:39,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:45:40,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:45:48,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.20 vs. limit=12.0 2023-09-29 07:45:49,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:49,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.30 vs. limit=15.0 2023-09-29 07:45:50,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:53,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:45:55,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:45:59,266 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.955e+02 2.207e+02 2.649e+02 3.533e+02, threshold=4.414e+02, percent-clipped=0.0 2023-09-29 07:45:59,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:45:59,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:46:00,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 07:46:00,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:46:03,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:46:04,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 07:46:04,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=296573.3333333333, ans=0.125 2023-09-29 07:46:06,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:09,268 INFO [train.py:1039] (3/4) Epoch 9, batch 2000, loss[loss=0.2053, simple_loss=0.2612, pruned_loss=0.07475, over 22709.00 frames. ], tot_loss[loss=0.2137, simple_loss=0.2816, pruned_loss=0.07284, over 4703450.81 frames. ], batch size: 322, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:46:09,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:46:10,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:46:10,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:46:11,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:46:12,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:46:17,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 07:46:17,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 07:46:23,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:46:25,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 07:46:26,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 07:46:26,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:46:30,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:46:31,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 07:46:35,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:36,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:39,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 07:46:39,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 07:46:42,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 07:46:42,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:44,189 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:46:45,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:46:46,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 07:46:46,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:46:48,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:46:49,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:46:49,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 07:46:50,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=296773.3333333333, ans=0.125 2023-09-29 07:46:52,158 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=19.67 vs. limit=22.5 2023-09-29 07:46:53,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 07:46:53,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:46:53,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:01,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:02,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:47:02,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:02,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:47:04,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:04,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:06,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:47:06,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:07,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:10,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:47:12,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 07:47:15,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:47:16,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:21,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:47:24,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:24,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=296906.6666666667, ans=0.0 2023-09-29 07:47:25,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:25,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:27,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:47:27,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:47:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:47:31,099 INFO [train.py:1039] (3/4) Epoch 9, batch 2050, loss[loss=0.2155, simple_loss=0.2654, pruned_loss=0.08277, over 23368.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2801, pruned_loss=0.07214, over 4706416.52 frames. ], batch size: 285, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:47:31,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:34,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:47:34,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:36,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=296973.3333333333, ans=0.125 2023-09-29 07:47:41,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:47:43,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=296973.3333333333, ans=0.1 2023-09-29 07:47:43,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.56 vs. limit=10.0 2023-09-29 07:47:46,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:47:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:47:48,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:47:49,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 07:47:49,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:47:50,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:47:51,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:48:00,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:00,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:03,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-09-29 07:48:03,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 07:48:06,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:48:07,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 07:48:07,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:48:09,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:11,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:13,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:48:13,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:48:14,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:48:16,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:48:16,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:48:18,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=297106.6666666667, ans=0.125 2023-09-29 07:48:21,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:24,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 07:48:25,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:48:26,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:48:32,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:48:32,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=297173.3333333333, ans=0.1 2023-09-29 07:48:36,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:48:36,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 07:48:42,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:44,026 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 2.106e+02 2.256e+02 2.757e+02 3.895e+02, threshold=4.512e+02, percent-clipped=0.0 2023-09-29 07:48:44,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:48:47,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:48:48,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 07:48:52,661 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 07:48:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:48:54,315 INFO [train.py:1039] (3/4) Epoch 9, batch 2100, loss[loss=0.2125, simple_loss=0.2654, pruned_loss=0.07982, over 23386.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2787, pruned_loss=0.07205, over 4699922.66 frames. ], batch size: 285, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:48:54,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:48:54,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:48:56,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:48:56,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 07:48:57,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 07:48:59,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 07:49:00,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=297306.6666666667, ans=0.125 2023-09-29 07:49:02,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:49:02,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:49:05,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:06,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:49:06,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 07:49:06,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 07:49:07,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=297306.6666666667, ans=0.0 2023-09-29 07:49:08,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 07:49:08,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 07:49:08,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=297373.3333333333, ans=0.2 2023-09-29 07:49:09,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:09,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:10,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 07:49:11,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 07:49:18,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 07:49:18,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:49:18,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=297373.3333333333, ans=0.125 2023-09-29 07:49:23,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:49:23,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:49:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:49:29,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 07:49:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:29,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 07:49:32,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 07:49:32,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:32,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 07:49:33,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 07:49:33,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 07:49:35,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:49:36,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:49:40,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=297440.0, ans=0.125 2023-09-29 07:49:41,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:41,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 07:49:42,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:44,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 07:49:44,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:49:44,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:49:44,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:49:46,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 07:49:48,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 07:49:48,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 07:49:53,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:49:56,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:49:56,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 07:50:03,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:05,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=297573.3333333333, ans=0.125 2023-09-29 07:50:06,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:50:06,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:06,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:06,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 07:50:08,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:08,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:09,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:50:10,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:50:11,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:13,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 07:50:14,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 07:50:14,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:16,068 INFO [train.py:1039] (3/4) Epoch 9, batch 2150, loss[loss=0.1837, simple_loss=0.2513, pruned_loss=0.05805, over 24307.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2786, pruned_loss=0.07149, over 4711808.61 frames. ], batch size: 56, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:50:16,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=297640.0, ans=0.05 2023-09-29 07:50:19,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:50:19,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:50:19,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:50:19,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:50:25,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 07:50:26,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:26,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:28,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:50:28,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:28,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:50:33,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:33,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:50:33,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:50:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:38,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 07:50:44,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:46,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:50:48,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:50:48,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:50:48,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:50:49,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:50:49,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:50:51,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:50:52,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 07:50:54,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 07:50:56,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:50:57,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:50:57,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:50:59,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:51:01,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:01,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:51:03,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:03,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 07:51:03,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 07:51:07,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:07,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:09,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:51:09,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 07:51:11,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:12,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:12,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 07:51:13,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.24 vs. limit=10.0 2023-09-29 07:51:15,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 07:51:15,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 07:51:16,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 07:51:17,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:17,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:51:19,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 07:51:19,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:51:19,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 07:51:19,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 07:51:19,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 07:51:20,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 07:51:22,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:22,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:51:22,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:51:24,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:25,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 07:51:27,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:27,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:28,409 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.066e+02 2.283e+02 2.527e+02 4.333e+02, threshold=4.566e+02, percent-clipped=0.0 2023-09-29 07:51:37,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:51:37,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 07:51:39,051 INFO [train.py:1039] (3/4) Epoch 9, batch 2200, loss[loss=0.2339, simple_loss=0.317, pruned_loss=0.07545, over 24566.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2792, pruned_loss=0.07194, over 4708326.66 frames. ], batch size: 71, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:51:41,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:51:46,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:51:47,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:51:47,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:51:48,329 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.85 vs. limit=22.5 2023-09-29 07:51:49,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 07:51:52,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:51:54,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:51:54,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 07:51:58,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 07:52:00,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 07:52:07,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 07:52:07,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=298040.0, ans=0.1 2023-09-29 07:52:10,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:11,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:13,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 07:52:15,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:52:15,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=298106.6666666667, ans=0.125 2023-09-29 07:52:16,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 07:52:20,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 07:52:21,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:52:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 07:52:26,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:52:28,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:30,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:52:31,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:33,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 07:52:34,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:36,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 07:52:38,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:38,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 07:52:39,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:52:41,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=298173.3333333333, ans=0.0 2023-09-29 07:52:42,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 07:52:42,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:52:42,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:52:43,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=298240.0, ans=0.0 2023-09-29 07:52:44,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 07:52:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:52:48,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 07:52:53,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 07:52:53,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:52:56,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:52:57,862 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 07:53:00,818 INFO [train.py:1039] (3/4) Epoch 9, batch 2250, loss[loss=0.2155, simple_loss=0.2966, pruned_loss=0.06723, over 24466.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2806, pruned_loss=0.07272, over 4709846.31 frames. ], batch size: 69, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:53:00,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:53:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 07:53:01,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=298306.6666666667, ans=0.125 2023-09-29 07:53:03,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 07:53:03,096 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 07:53:03,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=298306.6666666667, ans=0.125 2023-09-29 07:53:04,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:04,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=298306.6666666667, ans=0.125 2023-09-29 07:53:06,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 07:53:06,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:07,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 07:53:08,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.23 vs. limit=15.0 2023-09-29 07:53:09,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:53:12,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:18,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:53:20,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 07:53:22,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=298373.3333333333, ans=0.025 2023-09-29 07:53:24,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:24,870 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-09-29 07:53:25,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 07:53:27,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 07:53:27,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:29,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:53:30,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 07:53:32,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:53:32,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:32,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=298440.0, ans=0.1 2023-09-29 07:53:33,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 07:53:37,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:53:40,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 07:53:40,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 07:53:40,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 07:53:41,713 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.30 vs. limit=15.0 2023-09-29 07:53:42,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:53:45,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:53:47,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=298440.0, ans=0.125 2023-09-29 07:53:51,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:53,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:53:54,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:53:54,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:53:58,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:54:00,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:54:05,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:54:05,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 07:54:12,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 07:54:12,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 07:54:13,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:54:15,291 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.931e+02 2.128e+02 2.517e+02 4.313e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 07:54:17,510 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 07:54:18,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:54:20,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:54:20,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 07:54:21,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:21,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 07:54:23,827 INFO [train.py:1039] (3/4) Epoch 9, batch 2300, loss[loss=0.2318, simple_loss=0.2893, pruned_loss=0.08713, over 23750.00 frames. ], tot_loss[loss=0.2142, simple_loss=0.2816, pruned_loss=0.07339, over 4705003.82 frames. ], batch size: 164, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:54:24,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=298640.0, ans=0.0 2023-09-29 07:54:24,709 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.79 vs. limit=12.0 2023-09-29 07:54:25,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 07:54:28,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:54:30,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:35,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:54:37,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:54:40,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-09-29 07:54:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 07:54:42,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:49,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:54:49,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:54:49,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:54:50,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:54:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 07:54:50,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:54:53,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:54:53,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:54:57,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 07:55:00,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:55:03,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:07,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:55:07,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:55:10,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:55:11,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=298773.3333333333, ans=0.025 2023-09-29 07:55:14,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:55:17,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:55:19,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 07:55:19,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:55:19,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 07:55:24,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 07:55:24,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:24,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:55:24,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:55:25,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:27,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 07:55:27,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 07:55:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 07:55:27,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:55:27,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:55:27,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 07:55:34,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:55:36,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=298906.6666666667, ans=0.125 2023-09-29 07:55:40,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:55:44,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:55:44,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:55:44,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 07:55:47,459 INFO [train.py:1039] (3/4) Epoch 9, batch 2350, loss[loss=0.2008, simple_loss=0.2859, pruned_loss=0.05788, over 24340.00 frames. ], tot_loss[loss=0.2162, simple_loss=0.2832, pruned_loss=0.07466, over 4700244.82 frames. ], batch size: 74, lr: 1.16e-02, grad_scale: 8.0 2023-09-29 07:55:47,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 07:55:47,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:55:47,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 07:55:47,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 07:55:55,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:55:55,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 07:55:59,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=298973.3333333333, ans=0.0 2023-09-29 07:56:02,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 07:56:07,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:56:10,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:10,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:12,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:12,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 07:56:15,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:56:20,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 07:56:22,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:56:25,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:56:25,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 07:56:28,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 07:56:30,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 07:56:30,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 07:56:33,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:56:33,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:56:33,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:56:37,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 07:56:40,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 07:56:40,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:56:43,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:56:43,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:56:45,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 07:56:46,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=299173.3333333333, ans=0.5 2023-09-29 07:56:47,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:56:49,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 07:56:50,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 07:56:54,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 07:56:58,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 07:57:00,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:57:00,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 07:57:00,313 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 07:57:00,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 07:57:01,783 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.694e+02 2.074e+02 2.290e+02 2.554e+02 3.364e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-29 07:57:04,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 07:57:05,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:57:10,738 INFO [train.py:1039] (3/4) Epoch 9, batch 2400, loss[loss=0.2136, simple_loss=0.2935, pruned_loss=0.06684, over 24630.00 frames. ], tot_loss[loss=0.2148, simple_loss=0.2821, pruned_loss=0.07373, over 4708710.61 frames. ], batch size: 73, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:57:10,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:57:13,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:57:16,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 07:57:17,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 07:57:17,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 07:57:26,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 07:57:26,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:57:28,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 07:57:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:57:29,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:31,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 07:57:38,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:38,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 07:57:43,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 07:57:47,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 07:57:50,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:57:51,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:57:53,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.50 vs. limit=15.0 2023-09-29 07:57:57,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:57:58,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 07:57:58,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 07:58:03,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:06,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:12,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:13,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 07:58:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 07:58:14,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 07:58:14,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:15,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:15,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 07:58:18,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:58:20,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 07:58:20,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 07:58:21,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 07:58:22,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=299573.3333333333, ans=0.1 2023-09-29 07:58:23,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:58:24,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:58:24,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 07:58:26,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 07:58:26,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 07:58:26,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 07:58:26,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 07:58:26,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=299573.3333333333, ans=0.125 2023-09-29 07:58:27,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 07:58:30,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:30,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:31,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 07:58:31,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:58:31,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 07:58:33,120 INFO [train.py:1039] (3/4) Epoch 9, batch 2450, loss[loss=0.2048, simple_loss=0.2395, pruned_loss=0.08511, over 19256.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.2797, pruned_loss=0.07243, over 4697917.35 frames. ], batch size: 388, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 07:58:36,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 07:58:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:58:42,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:42,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:58:44,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 07:58:45,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=299640.0, ans=0.1 2023-09-29 07:58:50,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:58:50,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:54,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 07:58:54,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 07:58:54,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 07:58:54,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 07:58:58,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:58:59,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 07:59:02,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 07:59:04,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 07:59:05,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=299773.3333333333, ans=0.125 2023-09-29 07:59:06,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:06,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:07,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 07:59:10,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 07:59:12,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 07:59:18,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=299773.3333333333, ans=0.125 2023-09-29 07:59:19,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:19,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=299773.3333333333, ans=0.1 2023-09-29 07:59:21,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 07:59:21,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:21,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 07:59:21,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:23,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 07:59:24,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 07:59:26,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 07:59:27,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 07:59:28,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.92 vs. limit=15.0 2023-09-29 07:59:29,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=299840.0, ans=0.125 2023-09-29 07:59:30,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.94 vs. limit=15.0 2023-09-29 07:59:30,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 07:59:30,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 07:59:34,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=299840.0, ans=0.0 2023-09-29 07:59:36,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 07:59:36,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 07:59:37,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 07:59:39,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 07:59:39,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 07:59:40,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 07:59:42,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 07:59:43,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=299906.6666666667, ans=0.125 2023-09-29 07:59:46,461 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 2.171e+02 2.453e+02 2.838e+02 4.289e+02, threshold=4.906e+02, percent-clipped=0.0 2023-09-29 07:59:46,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 07:59:48,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 07:59:49,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.43 vs. limit=22.5 2023-09-29 07:59:50,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 07:59:53,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 07:59:54,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 07:59:56,033 INFO [train.py:1039] (3/4) Epoch 9, batch 2500, loss[loss=0.1928, simple_loss=0.261, pruned_loss=0.06234, over 24342.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2788, pruned_loss=0.07188, over 4711348.60 frames. ], batch size: 56, lr: 1.16e-02, grad_scale: 16.0 2023-09-29 08:00:00,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:05,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=299973.3333333333, ans=0.125 2023-09-29 08:00:12,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:00:12,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:00:14,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:00:14,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 08:00:20,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=300040.0, ans=0.125 2023-09-29 08:00:21,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:00:21,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:00:23,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:00:23,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:00:23,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 08:00:23,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:25,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:25,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 08:00:25,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:27,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=300106.6666666667, ans=0.0 2023-09-29 08:00:28,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 08:00:28,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:33,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:00:33,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:00:36,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:00:38,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 08:00:38,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:00:41,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:00:44,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:49,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:00:52,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:00:57,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:01:01,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 08:01:01,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:01,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:04,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:01:04,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:01:04,611 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 08:01:04,612 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 08:01:04,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 08:01:08,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:09,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 08:01:09,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 08:01:11,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:01:11,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 08:01:14,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 08:01:16,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:18,152 INFO [train.py:1039] (3/4) Epoch 9, batch 2550, loss[loss=0.1925, simple_loss=0.2678, pruned_loss=0.0586, over 24475.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2794, pruned_loss=0.07232, over 4710677.54 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:01:18,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=300306.6666666667, ans=0.125 2023-09-29 08:01:19,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:01:19,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:01:22,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:01:22,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 08:01:22,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:01:23,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=300306.6666666667, ans=0.1 2023-09-29 08:01:27,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 08:01:27,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:01:31,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:34,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:01:34,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:01:35,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:01:35,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:01:37,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:01:40,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:01:41,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 08:01:42,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:01:42,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:01:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 08:01:42,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=300373.3333333333, ans=0.125 2023-09-29 08:01:56,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:01:57,771 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.68 vs. limit=15.0 2023-09-29 08:01:58,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=300440.0, ans=0.07 2023-09-29 08:02:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:04,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:04,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:02:04,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:02:04,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=300506.6666666667, ans=0.0 2023-09-29 08:02:11,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:02:11,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=300506.6666666667, ans=0.0 2023-09-29 08:02:13,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:02:15,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:02:15,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:02:15,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:02:16,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:02:19,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:19,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:25,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:02:26,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 08:02:26,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:02:26,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:02:26,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:02:28,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.38 vs. limit=12.0 2023-09-29 08:02:29,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:02:29,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:30,791 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.909e+02 2.105e+02 2.404e+02 4.394e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 08:02:35,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:02:37,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:02:39,470 INFO [train.py:1039] (3/4) Epoch 9, batch 2600, loss[loss=0.2166, simple_loss=0.2813, pruned_loss=0.07596, over 23664.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2801, pruned_loss=0.07239, over 4718355.90 frames. ], batch size: 149, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:02:39,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 08:02:40,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=300640.0, ans=0.125 2023-09-29 08:02:42,750 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 08:02:42,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:02:42,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 08:02:42,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 08:02:44,256 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 08:02:46,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:02:46,569 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 08:02:48,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 08:02:50,067 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 08:02:53,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:02:54,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 08:02:56,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 08:02:58,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:02:58,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 08:03:01,293 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 08:03:01,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 08:03:03,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.35 vs. limit=15.0 2023-09-29 08:03:07,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:07,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:07,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:07,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 08:03:09,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:03:17,387 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 08:03:17,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=300773.3333333333, ans=10.0 2023-09-29 08:03:24,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:24,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:03:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 08:03:27,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:27,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:03:27,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 08:03:29,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=300840.0, ans=0.125 2023-09-29 08:03:30,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:03:30,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:03:34,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:39,109 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 08:03:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:03:40,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:03:42,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=300840.0, ans=0.0 2023-09-29 08:03:45,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:03:47,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:03:47,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 08:03:47,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:03:49,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:03:50,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:03:57,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 08:03:58,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:00,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:04:01,776 INFO [train.py:1039] (3/4) Epoch 9, batch 2650, loss[loss=0.2183, simple_loss=0.2879, pruned_loss=0.07435, over 23330.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2809, pruned_loss=0.07308, over 4707441.17 frames. ], batch size: 105, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:04:03,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 08:04:03,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:05,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:04:05,247 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 08:04:05,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:07,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=300973.3333333333, ans=0.125 2023-09-29 08:04:08,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:09,834 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.70 vs. limit=15.0 2023-09-29 08:04:11,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:04:12,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.84 vs. limit=22.5 2023-09-29 08:04:13,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:04:16,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:04:16,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 08:04:16,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:04:17,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:04:20,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 08:04:23,134 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 08:04:26,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:27,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 08:04:27,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:29,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 08:04:34,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=301106.6666666667, ans=0.5 2023-09-29 08:04:35,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:04:35,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:35,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:36,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=301106.6666666667, ans=0.0 2023-09-29 08:04:39,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 08:04:39,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 08:04:43,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:04:46,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=301106.6666666667, ans=0.125 2023-09-29 08:04:48,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 08:04:48,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:04:49,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:04:50,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:04:50,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:52,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:04:52,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=301173.3333333333, ans=0.0 2023-09-29 08:04:53,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:04:55,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:04:55,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:04:55,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:04:56,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=301173.3333333333, ans=0.125 2023-09-29 08:04:57,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:04:59,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:04:59,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:04:59,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=301173.3333333333, ans=0.0 2023-09-29 08:05:01,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:02,042 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.38 vs. limit=10.0 2023-09-29 08:05:02,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:05:02,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:05:06,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:07,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:05:07,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:07,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 08:05:13,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:05:13,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:15,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.708e+02 2.153e+02 2.553e+02 3.125e+02 4.988e+02, threshold=5.107e+02, percent-clipped=5.0 2023-09-29 08:05:17,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:17,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:19,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:05:20,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:22,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:22,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 08:05:23,521 INFO [train.py:1039] (3/4) Epoch 9, batch 2700, loss[loss=0.2909, simple_loss=0.3388, pruned_loss=0.1216, over 19743.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2824, pruned_loss=0.07386, over 4703107.41 frames. ], batch size: 389, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:05:26,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:05:28,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:05:30,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:05:30,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:30,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:05:31,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:05:31,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:05:31,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:05:31,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:05:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 08:05:33,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:05:36,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:05:38,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:05:38,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:05:40,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=301373.3333333333, ans=0.05 2023-09-29 08:05:43,008 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.24 vs. limit=6.0 2023-09-29 08:05:45,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:05:45,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 08:05:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:05:47,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.14 vs. limit=15.0 2023-09-29 08:05:52,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:05:52,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:05:58,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:05:58,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:05:58,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:05:58,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:05:58,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=301440.0, ans=0.125 2023-09-29 08:06:02,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:05,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:06,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:06:06,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:06:10,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:10,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:06:14,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=301506.6666666667, ans=0.04949747468305833 2023-09-29 08:06:19,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:06:21,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:06:25,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:06:25,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:27,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:29,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:30,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:06:32,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:33,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:06:33,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:06:34,564 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.02 vs. limit=15.0 2023-09-29 08:06:36,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:06:38,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:38,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:06:40,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 08:06:42,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:43,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:06:43,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 08:06:44,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=301640.0, ans=0.125 2023-09-29 08:06:45,124 INFO [train.py:1039] (3/4) Epoch 9, batch 2750, loss[loss=0.191, simple_loss=0.2658, pruned_loss=0.0581, over 24488.00 frames. ], tot_loss[loss=0.2144, simple_loss=0.2819, pruned_loss=0.0735, over 4712554.06 frames. ], batch size: 66, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:06:45,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 08:06:46,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:06:50,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:06:50,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:06:54,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:54,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:06:54,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:06:57,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:06:58,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:07:00,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:07:00,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:00,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 08:07:00,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:07:00,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:07:05,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 08:07:08,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:07:08,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:10,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:10,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:07:10,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:07:11,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:07:11,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:13,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:18,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:07:18,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:07:18,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:07:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:22,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:07:25,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=301773.3333333333, ans=0.125 2023-09-29 08:07:30,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:07:31,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:07:33,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:38,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:07:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:07:38,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:07:43,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:07:43,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:07:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 08:07:48,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:07:50,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 08:07:51,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=301906.6666666667, ans=0.0 2023-09-29 08:07:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:07:59,502 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 1.994e+02 2.310e+02 2.732e+02 5.086e+02, threshold=4.620e+02, percent-clipped=0.0 2023-09-29 08:08:01,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:08:01,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 08:08:03,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:03,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:08:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 08:08:04,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:08:06,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:08:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:06,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_ff3.min_abs, batch_count=301973.3333333333, ans=0.2 2023-09-29 08:08:08,087 INFO [train.py:1039] (3/4) Epoch 9, batch 2800, loss[loss=0.1943, simple_loss=0.2827, pruned_loss=0.05295, over 24312.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2803, pruned_loss=0.07271, over 4717019.42 frames. ], batch size: 74, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:08:08,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:08:08,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=301973.3333333333, ans=0.1 2023-09-29 08:08:10,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 08:08:10,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:11,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:13,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:14,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 08:08:14,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 08:08:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:21,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:08:21,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:08:26,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:08:26,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 08:08:26,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=302040.0, ans=0.07 2023-09-29 08:08:29,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:08:30,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 08:08:31,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:31,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:08:31,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:36,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:08:37,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:08:37,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:08:37,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.93 vs. limit=15.0 2023-09-29 08:08:38,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:08:38,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=302040.0, ans=0.2 2023-09-29 08:08:45,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=302106.6666666667, ans=0.125 2023-09-29 08:08:47,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:08:49,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:08:51,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:08:52,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:08:54,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:08:59,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:08:59,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 08:08:59,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:01,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:01,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:09:04,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:05,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:09,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=302173.3333333333, ans=0.0 2023-09-29 08:09:10,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:09:12,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:09:12,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:12,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:09:12,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:09:13,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:09:15,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:09:15,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 08:09:15,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:17,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:09:17,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:19,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 08:09:21,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:21,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:09:22,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:09:23,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 08:09:30,426 INFO [train.py:1039] (3/4) Epoch 9, batch 2850, loss[loss=0.2091, simple_loss=0.2928, pruned_loss=0.06265, over 24562.00 frames. ], tot_loss[loss=0.2122, simple_loss=0.2795, pruned_loss=0.07245, over 4724578.07 frames. ], batch size: 71, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:09:30,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:09:30,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:09:32,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:09:33,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:37,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:09:37,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:09:38,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:09:40,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:09:42,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:09:44,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:09:44,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 08:09:50,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 08:09:50,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:09:53,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 08:09:55,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:09:56,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 08:09:59,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 08:10:00,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:11,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=302440.0, ans=0.0 2023-09-29 08:10:13,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:14,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:14,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:10:16,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:10:16,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:10:16,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:10:18,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:10:18,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 08:10:19,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:10:21,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:10:21,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:10:22,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:24,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:24,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:10:26,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:28,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:10:29,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:10:31,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:10:34,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:36,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:10:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:10:43,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 1.988e+02 2.184e+02 2.463e+02 3.940e+02, threshold=4.369e+02, percent-clipped=0.0 2023-09-29 08:10:43,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 08:10:43,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 08:10:45,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:10:47,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:47,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 08:10:49,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:10:49,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:10:49,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:10:50,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:10:50,579 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 08:10:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 08:10:50,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:10:51,988 INFO [train.py:1039] (3/4) Epoch 9, batch 2900, loss[loss=0.1892, simple_loss=0.2542, pruned_loss=0.06212, over 24491.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2794, pruned_loss=0.07297, over 4707281.29 frames. ], batch size: 58, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:10:52,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:10:56,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:10:56,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:11:00,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:11:00,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 08:11:04,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:04,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 08:11:06,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 08:11:06,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:11:06,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:11:06,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=302640.0, ans=0.125 2023-09-29 08:11:07,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:10,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:11:14,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:11:14,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:11:17,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:11:18,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 08:11:20,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:11:20,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:23,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 08:11:25,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 08:11:28,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:11:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 08:11:28,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:11:31,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:11:31,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 08:11:33,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:11:35,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:11:39,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:11:41,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:11:42,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 08:11:44,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 08:11:44,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:11:48,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:11:51,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 08:11:53,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:11:57,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:12:08,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:12:08,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:12:09,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 08:12:14,627 INFO [train.py:1039] (3/4) Epoch 9, batch 2950, loss[loss=0.2266, simple_loss=0.2945, pruned_loss=0.07939, over 23997.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2794, pruned_loss=0.07237, over 4718411.28 frames. ], batch size: 86, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:12:14,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:14,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 08:12:14,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:16,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:12:20,384 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:12:21,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:12:22,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=302973.3333333333, ans=0.125 2023-09-29 08:12:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 08:12:24,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:24,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:26,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:12:27,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:12:29,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 08:12:29,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=22.5 2023-09-29 08:12:30,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 08:12:30,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:12:30,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:12:37,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:12:39,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:12:41,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:12:41,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:12:45,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=303040.0, ans=0.2 2023-09-29 08:12:46,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:12:46,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:12:47,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:12:49,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:12:52,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 08:12:57,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 08:12:57,993 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 08:12:59,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:13:00,864 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 08:13:02,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 08:13:02,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:13:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 08:13:03,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:13:08,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 08:13:08,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:13:10,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:13:12,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=303173.3333333333, ans=0.125 2023-09-29 08:13:14,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:15,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:13:16,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:16,463 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 08:13:16,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:13:16,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 08:13:20,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=303240.0, ans=0.125 2023-09-29 08:13:23,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:23,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:13:25,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 08:13:25,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:13:26,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 08:13:27,845 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.24 vs. limit=15.0 2023-09-29 08:13:28,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:29,731 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.974e+02 2.174e+02 2.569e+02 4.331e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-29 08:13:30,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:13:31,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:13:33,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:13:33,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:13:35,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:13:36,351 INFO [train.py:1039] (3/4) Epoch 9, batch 3000, loss[loss=0.2863, simple_loss=0.3294, pruned_loss=0.1216, over 19532.00 frames. ], tot_loss[loss=0.214, simple_loss=0.2809, pruned_loss=0.07357, over 4699434.66 frames. ], batch size: 389, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:13:36,351 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 08:13:47,680 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.1.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([4.6698, 4.1839, 3.9684, 3.8417], device='cuda:3') 2023-09-29 08:13:49,676 INFO [train.py:1071] (3/4) Epoch 9, validation: loss=0.2838, simple_loss=0.2753, pruned_loss=0.1462, over 1125622.00 frames. 2023-09-29 08:13:49,677 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 08:13:49,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:49,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:13:49,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:13:49,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:13:52,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:13:53,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:53,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 08:13:55,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:13:58,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:13:59,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:14:01,563 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 08:14:01,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 08:14:05,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:14:06,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:14:06,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 08:14:08,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:14,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:14:23,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=303440.0, ans=0.2 2023-09-29 08:14:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:14:30,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 08:14:31,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:14:33,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:14:33,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:14:33,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:14:35,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=303440.0, ans=0.0 2023-09-29 08:14:36,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:37,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 08:14:40,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 08:14:40,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:14:41,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:14:43,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:14:43,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:44,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:14:44,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:14:45,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=303506.6666666667, ans=0.125 2023-09-29 08:14:49,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:14:49,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:14:49,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:14:52,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:14:55,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 08:14:57,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:14:58,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:14:58,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:15:03,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:03,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:04,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 08:15:04,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 08:15:06,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 08:15:06,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:15:06,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=303573.3333333333, ans=0.125 2023-09-29 08:15:07,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 08:15:10,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:10,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:15:10,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 08:15:12,232 INFO [train.py:1039] (3/4) Epoch 9, batch 3050, loss[loss=0.2139, simple_loss=0.2749, pruned_loss=0.07647, over 23789.00 frames. ], tot_loss[loss=0.2158, simple_loss=0.2824, pruned_loss=0.07456, over 4701155.43 frames. ], batch size: 179, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:15:12,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 08:15:12,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:15:13,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:15:15,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:15:15,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:15:15,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:16,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:15:19,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 08:15:20,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:15:24,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:24,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:15:29,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:31,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 08:15:38,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 08:15:38,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 08:15:39,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:42,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:15:45,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:45,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:49,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:15:50,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:15:50,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:15:50,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:15:50,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:15:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:15:56,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:15:59,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 08:16:00,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:16:00,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:16:04,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:16:06,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:16:06,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:07,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:12,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:16:13,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:18,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:19,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:16:19,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:16:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:16:21,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:16:23,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 08:16:24,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.23 vs. limit=15.0 2023-09-29 08:16:25,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:16:26,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 08:16:28,062 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.681e+02 1.995e+02 2.261e+02 2.647e+02 3.760e+02, threshold=4.522e+02, percent-clipped=0.0 2023-09-29 08:16:28,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:35,201 INFO [train.py:1039] (3/4) Epoch 9, batch 3100, loss[loss=0.2301, simple_loss=0.2834, pruned_loss=0.08841, over 23728.00 frames. ], tot_loss[loss=0.2151, simple_loss=0.2821, pruned_loss=0.0741, over 4709633.23 frames. ], batch size: 212, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:16:35,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:16:37,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:16:40,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:16:40,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=303973.3333333333, ans=0.1 2023-09-29 08:16:42,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 08:16:44,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 08:16:44,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=303973.3333333333, ans=0.0 2023-09-29 08:16:45,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 08:16:47,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:16:50,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:16:50,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:16:53,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:16:53,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=304040.0, ans=0.05 2023-09-29 08:16:58,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:03,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 08:17:08,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:17:10,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:10,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:10,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:17:12,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:17:14,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:17:14,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 08:17:14,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:17:15,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:17,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 08:17:18,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:17:22,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:17:24,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 08:17:24,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 08:17:26,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:17:28,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=304173.3333333333, ans=0.09899494936611666 2023-09-29 08:17:29,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:29,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:29,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:17:29,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=304173.3333333333, ans=0.125 2023-09-29 08:17:31,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:17:31,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:17:33,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:17:33,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:17:33,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:33,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:17:38,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:17:39,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 08:17:42,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:17:43,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=304240.0, ans=0.125 2023-09-29 08:17:44,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 08:17:45,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:17:45,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:17:47,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 08:17:55,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 08:17:57,812 INFO [train.py:1039] (3/4) Epoch 9, batch 3150, loss[loss=0.2126, simple_loss=0.2922, pruned_loss=0.06654, over 24455.00 frames. ], tot_loss[loss=0.2135, simple_loss=0.2803, pruned_loss=0.07335, over 4703937.34 frames. ], batch size: 69, lr: 1.15e-02, grad_scale: 16.0 2023-09-29 08:17:58,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:17:59,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:01,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:18:01,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:18:02,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 08:18:04,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:04,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:18:07,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 08:18:09,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:11,194 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 08:18:14,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 08:18:14,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:18:14,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 08:18:15,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 08:18:17,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 08:18:18,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=304373.3333333333, ans=0.1 2023-09-29 08:18:19,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 08:18:19,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 08:18:19,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:19,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:20,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:18:23,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 08:18:26,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:26,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:18:27,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:29,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:18:32,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 08:18:33,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:18:36,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:18:37,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=304440.0, ans=0.0 2023-09-29 08:18:38,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:18:38,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 08:18:40,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 08:18:41,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:18:42,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:18:42,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:18:42,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:18:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:18:45,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:18:45,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:18:45,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 08:18:47,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:18:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:48,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:18:49,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.68 vs. limit=15.0 2023-09-29 08:18:49,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:18:51,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 08:18:52,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:18:53,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 08:18:53,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:18:55,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 08:18:57,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 08:18:57,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:18:58,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:00,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 08:19:01,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 08:19:01,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:19:03,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:19:05,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:06,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:19:13,058 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 2.121e+02 2.392e+02 3.280e+02 6.565e+02, threshold=4.784e+02, percent-clipped=9.0 2023-09-29 08:19:13,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:19:13,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:16,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 08:19:19,797 INFO [train.py:1039] (3/4) Epoch 9, batch 3200, loss[loss=0.251, simple_loss=0.3202, pruned_loss=0.09094, over 23656.00 frames. ], tot_loss[loss=0.2121, simple_loss=0.2798, pruned_loss=0.07216, over 4712414.05 frames. ], batch size: 85, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:19:21,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:19:21,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 08:19:23,973 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.19 vs. limit=12.0 2023-09-29 08:19:26,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:28,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:19:28,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 08:19:30,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:19:36,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:19:38,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:19:47,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=304706.6666666667, ans=0.2 2023-09-29 08:19:47,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=304706.6666666667, ans=0.125 2023-09-29 08:19:48,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:19:58,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 08:19:58,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:20:01,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=10.36 vs. limit=15.0 2023-09-29 08:20:01,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 08:20:02,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:20:07,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:20:07,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:20:07,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=304773.3333333333, ans=0.2 2023-09-29 08:20:08,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:20:13,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 08:20:13,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 08:20:16,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 08:20:20,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 08:20:23,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:20:28,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:28,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:20:29,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:20:29,730 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 08:20:29,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:20:33,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:20:33,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=304906.6666666667, ans=0.125 2023-09-29 08:20:36,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 08:20:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 08:20:37,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=304906.6666666667, ans=0.2 2023-09-29 08:20:38,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 08:20:38,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.54 vs. limit=15.0 2023-09-29 08:20:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 08:20:41,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:20:43,055 INFO [train.py:1039] (3/4) Epoch 9, batch 3250, loss[loss=0.2263, simple_loss=0.3, pruned_loss=0.07634, over 24028.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.279, pruned_loss=0.07195, over 4709393.30 frames. ], batch size: 80, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:20:43,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:20:44,817 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 08:20:44,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:20:44,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:20:45,075 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 08:20:45,460 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.037e-02 2023-09-29 08:20:49,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:20:52,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:20:56,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=304973.3333333333, ans=0.125 2023-09-29 08:21:02,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:02,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 08:21:02,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:04,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:21:04,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:05,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:05,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:21:09,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:09,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:21:11,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:11,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:11,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:13,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.95 vs. limit=15.0 2023-09-29 08:21:15,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:21:18,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:18,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:21:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:21:21,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:21:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:23,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.84 vs. limit=10.0 2023-09-29 08:21:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 08:21:26,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:21:26,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:21:28,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:28,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:21:34,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:21:46,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:21:46,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:46,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 08:21:46,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:21:46,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:21:47,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:21:49,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 08:21:49,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 08:21:50,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:21:51,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:21:52,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:52,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 08:21:54,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:21:57,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:21:58,884 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.015e+02 2.327e+02 2.716e+02 4.299e+02, threshold=4.655e+02, percent-clipped=0.0 2023-09-29 08:21:59,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:01,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 08:22:01,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:02,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:22:02,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 08:22:06,380 INFO [train.py:1039] (3/4) Epoch 9, batch 3300, loss[loss=0.2193, simple_loss=0.2848, pruned_loss=0.07692, over 23705.00 frames. ], tot_loss[loss=0.2125, simple_loss=0.2802, pruned_loss=0.07239, over 4709461.00 frames. ], batch size: 232, lr: 1.15e-02, grad_scale: 32.0 2023-09-29 08:22:06,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:22:06,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 08:22:09,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 08:22:11,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 08:22:11,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:17,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:22:18,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:22:18,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:18,721 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=12.0 2023-09-29 08:22:19,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:22:19,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=305306.6666666667, ans=0.125 2023-09-29 08:22:21,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:22:21,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:21,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=305373.3333333333, ans=0.125 2023-09-29 08:22:22,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.72 vs. limit=12.0 2023-09-29 08:22:22,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:22:27,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 08:22:29,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:22:29,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:22:30,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:32,118 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 08:22:33,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:22:33,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:22:35,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:22:35,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:22:35,962 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 08:22:39,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:22:39,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:22:42,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:42,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 08:22:44,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 08:22:44,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:22:44,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:22:46,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=305440.0, ans=0.125 2023-09-29 08:22:46,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=305440.0, ans=0.0 2023-09-29 08:22:47,406 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 08:22:48,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 08:22:49,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:22:52,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 08:22:54,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:22:57,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:22:58,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:00,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:00,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=305506.6666666667, ans=0.125 2023-09-29 08:23:01,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:01,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:23:01,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:23:02,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=305506.6666666667, ans=0.125 2023-09-29 08:23:03,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:23:03,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:05,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:23:07,123 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 08:23:09,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 08:23:10,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:23:10,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:10,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:23:13,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:15,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:23:17,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:17,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:23:18,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:23:19,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.90 vs. limit=15.0 2023-09-29 08:23:20,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:23:23,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 08:23:23,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:23,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:25,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:23:25,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:23:28,432 INFO [train.py:1039] (3/4) Epoch 9, batch 3350, loss[loss=0.2436, simple_loss=0.2953, pruned_loss=0.09599, over 23531.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2814, pruned_loss=0.07317, over 4721079.68 frames. ], batch size: 256, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:23:28,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:29,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.77 vs. limit=5.0 2023-09-29 08:23:30,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:23:30,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:33,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:23:34,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:36,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:23:38,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:38,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:23:40,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:42,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:23:44,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 08:23:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 08:23:45,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:23:49,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 08:23:49,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 08:23:49,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:23:49,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:23:52,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:23:52,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 08:23:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:53,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:23:55,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=305706.6666666667, ans=0.1 2023-09-29 08:23:56,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:23:59,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:23:59,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:00,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:24:02,697 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:24:05,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:06,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:07,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:11,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:24:13,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:24:16,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:16,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:19,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:21,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 08:24:23,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:24:23,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 08:24:23,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:24:24,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 08:24:25,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:27,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:24:33,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:34,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 08:24:34,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:24:37,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:24:37,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:24:43,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:24:44,739 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 2.029e+02 2.244e+02 2.615e+02 3.935e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 08:24:44,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 08:24:46,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:24:46,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:24:49,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:24:50,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 08:24:50,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:24:51,467 INFO [train.py:1039] (3/4) Epoch 9, batch 3400, loss[loss=0.2157, simple_loss=0.2865, pruned_loss=0.07241, over 24299.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2814, pruned_loss=0.07292, over 4722531.87 frames. ], batch size: 61, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:24:51,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 08:24:53,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:24:53,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:24:55,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:24:56,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 08:25:00,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 08:25:00,125 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 08:25:00,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:05,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:25:05,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:25:06,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:08,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:25:13,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:16,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 08:25:22,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:25:24,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:25,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 08:25:33,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=306106.6666666667, ans=0.2 2023-09-29 08:25:34,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:25:38,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 08:25:46,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:25:46,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 08:25:47,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:25:47,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:25:48,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:25:48,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:25:50,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=306173.3333333333, ans=0.125 2023-09-29 08:25:51,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:25:52,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=306173.3333333333, ans=0.0 2023-09-29 08:25:54,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:25:54,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:25:58,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=306240.0, ans=0.125 2023-09-29 08:26:01,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:03,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 08:26:09,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:26:12,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=306240.0, ans=0.0 2023-09-29 08:26:13,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 08:26:15,218 INFO [train.py:1039] (3/4) Epoch 9, batch 3450, loss[loss=0.174, simple_loss=0.2553, pruned_loss=0.04634, over 24432.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2808, pruned_loss=0.07249, over 4719148.98 frames. ], batch size: 63, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:26:17,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 08:26:18,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:26:19,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:26:19,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 08:26:20,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:26:24,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:26:31,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=306373.3333333333, ans=0.1 2023-09-29 08:26:32,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:26:33,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:33,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:26:33,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:38,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:26:39,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.42 vs. limit=22.5 2023-09-29 08:26:45,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 08:26:50,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 08:26:50,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:26:50,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:26:52,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:26:57,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 08:26:58,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:26:59,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.54 vs. limit=10.0 2023-09-29 08:27:00,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=306440.0, ans=0.125 2023-09-29 08:27:02,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:02,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:27:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:27:05,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:27:07,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 08:27:07,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:07,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:27:09,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=306506.6666666667, ans=0.95 2023-09-29 08:27:10,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:27:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 08:27:13,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=306506.6666666667, ans=0.07 2023-09-29 08:27:18,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:27:22,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:27:22,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:27,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:32,310 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.752e+02 2.104e+02 2.323e+02 2.853e+02 3.879e+02, threshold=4.645e+02, percent-clipped=0.0 2023-09-29 08:27:32,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:27:32,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:27:32,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:27:34,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:27:34,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=306573.3333333333, ans=0.07 2023-09-29 08:27:38,560 INFO [train.py:1039] (3/4) Epoch 9, batch 3500, loss[loss=0.2071, simple_loss=0.2505, pruned_loss=0.08186, over 22728.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.2793, pruned_loss=0.07198, over 4726482.12 frames. ], batch size: 322, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:27:38,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:43,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:27:43,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 08:27:47,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:27:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:27:53,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:27:53,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 08:27:58,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:28:00,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:28:02,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:28:02,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:03,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:28:03,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:03,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:05,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 08:28:08,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:09,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:28:11,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:14,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:14,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=306773.3333333333, ans=0.125 2023-09-29 08:28:14,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=306773.3333333333, ans=0.0 2023-09-29 08:28:16,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 08:28:16,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:28:18,663 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.60 vs. limit=15.0 2023-09-29 08:28:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:28:20,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:28:21,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:23,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:28:25,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:25,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 08:28:28,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 08:28:28,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 08:28:28,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:28:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:30,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:30,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:28:31,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=306840.0, ans=0.125 2023-09-29 08:28:34,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=306840.0, ans=0.0 2023-09-29 08:28:35,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:28:35,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:28:40,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:28:41,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 08:28:41,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 08:28:41,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:28:45,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:28:45,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:47,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:28:49,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 08:28:50,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:28:52,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:28:54,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 08:28:56,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 08:28:58,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=306906.6666666667, ans=0.125 2023-09-29 08:29:01,017 INFO [train.py:1039] (3/4) Epoch 9, batch 3550, loss[loss=0.1898, simple_loss=0.2717, pruned_loss=0.05397, over 24511.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2784, pruned_loss=0.07162, over 4727167.60 frames. ], batch size: 63, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:29:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:02,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:29:02,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:02,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=306973.3333333333, ans=0.125 2023-09-29 08:29:04,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:05,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:29:07,014 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.13 vs. limit=15.0 2023-09-29 08:29:13,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:15,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 08:29:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:19,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:29:20,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:22,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:29:22,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:29:24,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=307040.0, ans=0.125 2023-09-29 08:29:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:25,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:29:25,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:27,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:29:27,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:29:34,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:29:34,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:29:35,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:35,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:29:35,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:29:36,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 08:29:37,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:29:39,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 08:29:44,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=307106.6666666667, ans=0.1 2023-09-29 08:29:47,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:47,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:29:47,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:29:49,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=307173.3333333333, ans=0.125 2023-09-29 08:29:50,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 08:29:50,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:29:50,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=307173.3333333333, ans=0.125 2023-09-29 08:29:52,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 08:29:53,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:29:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:29:56,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:30:01,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 08:30:01,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:08,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 08:30:10,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:14,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=307240.0, ans=0.125 2023-09-29 08:30:15,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:30:15,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 08:30:17,026 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.963e+02 2.242e+02 2.644e+02 4.260e+02, threshold=4.484e+02, percent-clipped=0.0 2023-09-29 08:30:21,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys.whitening_limit, batch_count=307240.0, ans=6.0 2023-09-29 08:30:21,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 08:30:22,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:30:22,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:30:23,422 INFO [train.py:1039] (3/4) Epoch 9, batch 3600, loss[loss=0.2758, simple_loss=0.3148, pruned_loss=0.1184, over 19304.00 frames. ], tot_loss[loss=0.2116, simple_loss=0.279, pruned_loss=0.07212, over 4722803.09 frames. ], batch size: 388, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:30:24,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:25,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:30:26,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:30:29,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:31,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:32,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:30:34,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:30:36,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:36,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 08:30:40,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:30:40,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:43,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:47,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:50,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:30:50,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:30:50,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 08:30:51,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:30:54,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:30:56,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:30:57,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:30:59,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:30:59,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 08:31:08,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:10,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:31:12,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 08:31:16,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:31:20,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=12.0 2023-09-29 08:31:22,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:25,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:31,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:31:31,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:31:31,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 08:31:33,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 08:31:35,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 08:31:36,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:31:38,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:31:38,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 08:31:38,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=307573.3333333333, ans=0.0 2023-09-29 08:31:39,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:31:39,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:31:40,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:31:41,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 08:31:41,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 08:31:45,445 INFO [train.py:1039] (3/4) Epoch 9, batch 3650, loss[loss=0.2174, simple_loss=0.2769, pruned_loss=0.07895, over 23360.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2798, pruned_loss=0.07211, over 4730354.38 frames. ], batch size: 134, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:31:45,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:31:47,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 08:31:51,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 08:31:53,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:31:54,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=307640.0, ans=0.125 2023-09-29 08:31:55,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=307640.0, ans=0.2 2023-09-29 08:31:56,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 08:31:56,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=307640.0, ans=0.125 2023-09-29 08:31:57,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 08:32:00,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:02,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:32:02,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:32:06,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 08:32:08,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:32:09,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 08:32:09,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:32:09,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:11,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 08:32:11,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:32:11,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:32:11,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:14,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:32:16,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=307773.3333333333, ans=0.125 2023-09-29 08:32:17,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 08:32:17,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 08:32:19,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:32:21,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 08:32:24,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:25,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:32:30,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:32:33,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:33,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:32:35,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:32:35,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:32:36,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:32:41,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:32:43,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:32:43,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:32:44,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:32:44,983 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:32:46,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:32:46,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:32:54,758 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 08:33:00,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:00,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:01,739 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.073e+02 2.361e+02 2.805e+02 4.754e+02, threshold=4.723e+02, percent-clipped=2.0 2023-09-29 08:33:01,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:33:01,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:03,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:33:05,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:07,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 08:33:07,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:08,646 INFO [train.py:1039] (3/4) Epoch 9, batch 3700, loss[loss=0.2236, simple_loss=0.3021, pruned_loss=0.07252, over 24394.00 frames. ], tot_loss[loss=0.2131, simple_loss=0.281, pruned_loss=0.07262, over 4729256.57 frames. ], batch size: 77, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:33:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:33:11,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:33:11,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:33:12,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:12,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 08:33:12,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:33:13,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:33:13,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:33:17,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:33:19,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:33:20,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:21,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:33:21,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:33:22,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:33:24,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:26,595 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 08:33:35,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:33:35,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:33:38,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:33:38,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 08:33:40,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:43,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:44,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 08:33:46,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:33:48,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=308106.6666666667, ans=0.125 2023-09-29 08:33:50,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:33:50,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:33:52,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:33:54,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=308106.6666666667, ans=0.1 2023-09-29 08:33:55,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:33:56,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 08:33:57,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:33:57,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 08:34:03,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:34:04,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:34:07,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:09,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 08:34:12,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:34:12,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:34:13,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:13,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:16,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:34:18,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 08:34:18,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=308240.0, ans=0.125 2023-09-29 08:34:19,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 08:34:19,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:34:20,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:22,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:34:24,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:34:24,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=308240.0, ans=0.2 2023-09-29 08:34:25,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:34:27,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:34:28,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:34:30,327 INFO [train.py:1039] (3/4) Epoch 9, batch 3750, loss[loss=0.2047, simple_loss=0.28, pruned_loss=0.06465, over 24492.00 frames. ], tot_loss[loss=0.2146, simple_loss=0.2827, pruned_loss=0.07319, over 4729220.29 frames. ], batch size: 66, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:34:30,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 08:34:32,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:34:32,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=308306.6666666667, ans=0.125 2023-09-29 08:34:33,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:34:35,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 08:34:35,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:34:36,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=308306.6666666667, ans=0.125 2023-09-29 08:34:38,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:38,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:34:39,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:34:42,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.80 vs. limit=6.0 2023-09-29 08:34:43,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:34:48,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 08:34:49,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:34:49,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:34:54,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:34:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 08:34:57,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:34:59,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:34:59,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:35:02,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 08:35:05,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 08:35:07,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:35:08,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:35:11,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:35:17,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:18,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 08:35:23,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 08:35:26,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.02 vs. limit=15.0 2023-09-29 08:35:27,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:35:30,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:35:30,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:35:35,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:35:38,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=308573.3333333333, ans=0.2 2023-09-29 08:35:39,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 08:35:41,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:35:43,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:35:45,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:35:46,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:35:48,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.742e+02 2.167e+02 2.525e+02 3.168e+02 5.587e+02, threshold=5.051e+02, percent-clipped=3.0 2023-09-29 08:35:53,548 INFO [train.py:1039] (3/4) Epoch 9, batch 3800, loss[loss=0.1836, simple_loss=0.2611, pruned_loss=0.05304, over 24660.00 frames. ], tot_loss[loss=0.2139, simple_loss=0.2816, pruned_loss=0.07309, over 4728803.77 frames. ], batch size: 65, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:35:57,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:36:01,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:02,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 08:36:02,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 08:36:03,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=308640.0, ans=0.04949747468305833 2023-09-29 08:36:03,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.40 vs. limit=15.0 2023-09-29 08:36:04,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:07,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:08,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:36:10,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 08:36:10,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:11,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:36:12,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=308706.6666666667, ans=0.125 2023-09-29 08:36:12,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.91 vs. limit=15.0 2023-09-29 08:36:13,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:36:14,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=22.28 vs. limit=22.5 2023-09-29 08:36:14,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:36:14,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:16,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 08:36:19,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 08:36:21,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:36:24,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:36:25,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.48 vs. limit=22.5 2023-09-29 08:36:27,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:36:27,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:36:30,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:36:30,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:34,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:36:34,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:36:38,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=308773.3333333333, ans=15.0 2023-09-29 08:36:39,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:36:39,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 08:36:39,897 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 08:36:41,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:36:47,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:36:53,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:36:55,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 08:36:56,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-29 08:36:57,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 08:36:59,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:00,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:37:02,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:02,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.70 vs. limit=22.5 2023-09-29 08:37:03,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 08:37:07,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 08:37:07,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 08:37:09,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:09,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:37:15,132 INFO [train.py:1039] (3/4) Epoch 9, batch 3850, loss[loss=0.2181, simple_loss=0.2956, pruned_loss=0.07036, over 24005.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2809, pruned_loss=0.07236, over 4718922.23 frames. ], batch size: 80, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:37:15,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:37:16,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:37:21,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:37:21,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 08:37:25,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:37:25,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:28,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:37:32,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:33,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 08:37:33,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 08:37:39,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.21 vs. limit=6.0 2023-09-29 08:37:40,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:43,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:37:46,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:37:47,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:37:51,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:51,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:37:51,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:37:53,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:37:53,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:54,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:37:55,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:37:56,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:37:58,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 08:37:58,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 08:37:59,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:01,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:04,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:06,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:06,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 08:38:08,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 08:38:09,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:11,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 08:38:15,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 08:38:16,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.37 vs. limit=15.0 2023-09-29 08:38:17,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=309173.3333333333, ans=0.1 2023-09-29 08:38:19,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:20,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:38:25,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:25,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 08:38:25,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=309240.0, ans=0.2 2023-09-29 08:38:28,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 08:38:30,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:31,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:33,498 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.661e+02 1.974e+02 2.231e+02 2.579e+02 4.458e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 08:38:33,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:38:33,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 08:38:35,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:35,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:38:35,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 08:38:37,823 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=7.42 vs. limit=15.0 2023-09-29 08:38:38,203 INFO [train.py:1039] (3/4) Epoch 9, batch 3900, loss[loss=0.2206, simple_loss=0.2896, pruned_loss=0.07582, over 24619.00 frames. ], tot_loss[loss=0.2111, simple_loss=0.2791, pruned_loss=0.07159, over 4713133.06 frames. ], batch size: 65, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:38:38,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:38:38,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 08:38:38,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:38,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:40,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:38:40,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:42,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:38:43,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:38:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:38:44,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:38:44,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 08:38:45,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:48,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:49,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:51,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:38:51,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:38:56,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:38:56,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:38:57,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:38:59,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 08:38:59,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:01,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 08:39:02,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:39:02,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 08:39:05,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 08:39:09,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:10,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:39:10,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:39:12,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:15,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.57 vs. limit=12.0 2023-09-29 08:39:17,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:39:19,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:39:22,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:39:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:23,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=309440.0, ans=0.2 2023-09-29 08:39:24,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:39:29,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=309506.6666666667, ans=0.2 2023-09-29 08:39:30,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:39:30,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:39:36,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 08:39:38,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:39:50,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:39:52,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:52,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 08:39:52,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 08:39:52,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 08:39:55,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 08:39:57,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:39:59,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 08:40:02,349 INFO [train.py:1039] (3/4) Epoch 9, batch 3950, loss[loss=0.2168, simple_loss=0.2965, pruned_loss=0.06851, over 24431.00 frames. ], tot_loss[loss=0.2106, simple_loss=0.2787, pruned_loss=0.07121, over 4719062.23 frames. ], batch size: 69, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:40:05,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:40:05,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=309640.0, ans=0.0 2023-09-29 08:40:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 08:40:07,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:40:10,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:40:10,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:40:15,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 08:40:17,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:17,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 08:40:18,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 08:40:18,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:40:22,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:23,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:40:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:40:26,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 08:40:30,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:40:31,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:40:31,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:40:32,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:40:33,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 08:40:37,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=309773.3333333333, ans=0.125 2023-09-29 08:40:37,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.44 vs. limit=15.0 2023-09-29 08:40:46,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:40:46,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:40:52,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 08:40:59,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 08:40:59,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 08:40:59,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:00,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=309840.0, ans=0.0 2023-09-29 08:41:01,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:41:11,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:41:11,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:41:11,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:11,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:41:13,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 08:41:14,178 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.08 vs. limit=15.0 2023-09-29 08:41:15,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=309906.6666666667, ans=0.125 2023-09-29 08:41:16,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:41:17,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:41:19,326 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.683e+02 2.102e+02 2.264e+02 2.656e+02 4.963e+02, threshold=4.527e+02, percent-clipped=1.0 2023-09-29 08:41:22,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 08:41:22,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=309973.3333333333, ans=0.125 2023-09-29 08:41:23,845 INFO [train.py:1039] (3/4) Epoch 9, batch 4000, loss[loss=0.217, simple_loss=0.2788, pruned_loss=0.07757, over 23755.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2789, pruned_loss=0.07198, over 4719495.78 frames. ], batch size: 232, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:41:32,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:33,128 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.09 vs. limit=15.0 2023-09-29 08:41:39,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=310040.0, ans=0.125 2023-09-29 08:41:41,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:45,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:47,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:41:47,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:41:47,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 08:41:49,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:41:49,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 08:41:49,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:41:49,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 08:41:51,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=310040.0, ans=0.1 2023-09-29 08:41:51,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=22.5 2023-09-29 08:41:52,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:41:55,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:41:55,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:41:55,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:41:57,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:41:57,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:41:59,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:41:59,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 08:41:59,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:42:00,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:03,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=310106.6666666667, ans=0.125 2023-09-29 08:42:04,745 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 08:42:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:42:04,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:13,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 08:42:13,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:42:15,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:42:16,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 08:42:18,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:42:19,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 08:42:19,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:42:19,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:21,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:42:23,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:42:25,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:42:25,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:42:25,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 08:42:26,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:42:28,140 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 08:42:32,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:42:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 08:42:39,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:42:39,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=310240.0, ans=0.1 2023-09-29 08:42:40,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:42:42,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:42:47,946 INFO [train.py:1039] (3/4) Epoch 9, batch 4050, loss[loss=0.2131, simple_loss=0.2694, pruned_loss=0.07837, over 23862.00 frames. ], tot_loss[loss=0.2123, simple_loss=0.28, pruned_loss=0.0723, over 4726671.01 frames. ], batch size: 164, lr: 1.14e-02, grad_scale: 32.0 2023-09-29 08:42:48,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:42:49,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:42:51,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 08:42:52,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:42:52,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:42:54,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:42:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:42:56,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:01,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:43:04,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:05,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 08:43:05,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=310373.3333333333, ans=0.125 2023-09-29 08:43:07,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:43:08,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:43:11,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:15,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:43:17,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 08:43:19,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 08:43:21,207 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 08:43:22,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:43:29,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 08:43:30,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:43:32,836 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.97 vs. limit=22.5 2023-09-29 08:43:34,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:37,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:43:38,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:43:40,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:43:41,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:43:46,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 08:43:46,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:43:47,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:43:50,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 08:43:50,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=310506.6666666667, ans=0.125 2023-09-29 08:43:56,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:44:00,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=310573.3333333333, ans=0.0 2023-09-29 08:44:02,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 08:44:04,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:04,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:44:05,842 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.662e+02 2.021e+02 2.194e+02 2.667e+02 4.003e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-29 08:44:06,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 08:44:06,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 08:44:06,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:09,645 INFO [train.py:1039] (3/4) Epoch 9, batch 4100, loss[loss=0.19, simple_loss=0.2661, pruned_loss=0.05697, over 21262.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.28, pruned_loss=0.07241, over 4717781.74 frames. ], batch size: 46, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:44:09,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:09,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:09,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:44:17,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 08:44:19,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 08:44:23,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 08:44:23,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 08:44:23,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:25,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:25,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:44:26,777 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 08:44:30,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:31,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:44:31,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:44:33,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:44:35,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:44:36,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:44:36,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:44:36,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 08:44:38,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:44:38,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:44:38,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:38,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:44:38,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 08:44:41,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:44:45,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 08:44:46,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:44:46,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=310773.3333333333, ans=0.0 2023-09-29 08:44:49,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:44:49,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 08:44:53,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:44:53,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:44:53,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:44:56,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 08:44:57,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:44:57,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:45:00,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 08:45:01,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:01,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:05,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:07,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.04 vs. limit=15.0 2023-09-29 08:45:08,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=310840.0, ans=0.125 2023-09-29 08:45:09,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:14,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:14,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:45:22,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:22,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:45:28,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:45:31,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:45:32,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=310973.3333333333, ans=0.1 2023-09-29 08:45:33,037 INFO [train.py:1039] (3/4) Epoch 9, batch 4150, loss[loss=0.2643, simple_loss=0.3079, pruned_loss=0.1103, over 19820.00 frames. ], tot_loss[loss=0.2127, simple_loss=0.2807, pruned_loss=0.07239, over 4721053.58 frames. ], batch size: 389, lr: 1.14e-02, grad_scale: 16.0 2023-09-29 08:45:33,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:45:34,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:45:34,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:45:34,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:38,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 08:45:38,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:39,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 08:45:40,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 08:45:41,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 08:45:41,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:45:47,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:45:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:45:51,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=311040.0, ans=0.125 2023-09-29 08:45:52,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:45:53,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:45:54,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:45:56,768 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.13 vs. limit=15.0 2023-09-29 08:45:57,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:45:57,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:45:57,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=311040.0, ans=0.0 2023-09-29 08:45:58,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 08:46:04,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:07,857 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.94 vs. limit=22.5 2023-09-29 08:46:08,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:10,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 08:46:12,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 08:46:12,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:46:13,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 08:46:13,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:46:13,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:17,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:21,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 08:46:26,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:26,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:46:27,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 08:46:28,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:46:30,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 08:46:31,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:46:34,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:46:36,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:38,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 08:46:38,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:46:38,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 08:46:39,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 08:46:40,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=311240.0, ans=0.125 2023-09-29 08:46:42,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 08:46:43,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:43,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 08:46:43,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 08:46:43,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 08:46:43,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:46:45,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 08:46:46,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:46:46,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:46:47,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 08:46:48,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 08:46:51,309 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.017e+02 2.325e+02 2.759e+02 4.576e+02, threshold=4.650e+02, percent-clipped=1.0 2023-09-29 08:46:54,466 INFO [train.py:1039] (3/4) Epoch 9, batch 4200, loss[loss=0.2305, simple_loss=0.2796, pruned_loss=0.09068, over 23755.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2796, pruned_loss=0.07219, over 4716440.08 frames. ], batch size: 164, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:46:54,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:46:56,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 08:46:56,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=311306.6666666667, ans=0.0 2023-09-29 08:46:57,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:46:59,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:46:59,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=311306.6666666667, ans=0.125 2023-09-29 08:46:59,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=311306.6666666667, ans=0.0 2023-09-29 08:47:00,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:47:02,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:02,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:47:05,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 08:47:09,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 08:47:10,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:12,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:16,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:47:21,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 08:47:21,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:21,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:22,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 08:47:22,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:47:24,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:24,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:47:24,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:47:25,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=311373.3333333333, ans=0.0 2023-09-29 08:47:26,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:47:28,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 08:47:28,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:47:32,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 08:47:32,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:47:34,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=311440.0, ans=0.0 2023-09-29 08:47:36,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:47:37,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:47:40,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:47:40,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 08:47:41,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:47:42,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:47:47,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 08:47:50,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:47:51,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=311506.6666666667, ans=0.125 2023-09-29 08:47:52,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=311506.6666666667, ans=0.125 2023-09-29 08:47:53,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:47:58,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 08:48:00,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:04,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 08:48:06,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:07,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 08:48:14,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 08:48:18,152 INFO [train.py:1039] (3/4) Epoch 9, batch 4250, loss[loss=0.1942, simple_loss=0.2789, pruned_loss=0.0548, over 24319.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2779, pruned_loss=0.07157, over 4701478.57 frames. ], batch size: 74, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:48:19,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 08:48:19,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 08:48:22,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:28,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 08:48:28,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 08:48:28,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:48:29,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=22.5 2023-09-29 08:48:33,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:48:36,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:41,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:41,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:44,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:48:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:48:45,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:46,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:48,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:48:49,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:48:51,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:48:54,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 08:48:57,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 08:48:57,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:48:59,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:48:59,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=311773.3333333333, ans=0.125 2023-09-29 08:49:00,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:49:00,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:49:00,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:00,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:49:05,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 08:49:05,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 08:49:09,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.17 vs. limit=15.0 2023-09-29 08:49:11,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:13,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:13,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 08:49:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:49:14,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 08:49:16,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:49:17,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=311840.0, ans=0.125 2023-09-29 08:49:17,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=311840.0, ans=0.09899494936611666 2023-09-29 08:49:17,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=311840.0, ans=0.125 2023-09-29 08:49:19,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:49:20,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:20,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:49:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 08:49:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 08:49:24,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:49:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:49:32,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:33,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:49:33,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:49:35,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:36,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:49:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:49:38,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 08:49:39,728 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 2.120e+02 2.435e+02 2.958e+02 4.592e+02, threshold=4.869e+02, percent-clipped=0.0 2023-09-29 08:49:40,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:41,355 INFO [train.py:1039] (3/4) Epoch 9, batch 4300, loss[loss=0.2093, simple_loss=0.2728, pruned_loss=0.07288, over 23378.00 frames. ], tot_loss[loss=0.2103, simple_loss=0.2778, pruned_loss=0.07146, over 4703961.29 frames. ], batch size: 285, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:49:46,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:49:46,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:49:50,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=311973.3333333333, ans=0.0 2023-09-29 08:49:51,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:49:59,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:49:59,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 08:50:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:50:05,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:50:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 08:50:05,405 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 08:50:08,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 08:50:10,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:13,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 08:50:14,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:50:14,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 08:50:16,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:50:17,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:50:21,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:50:21,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:50:22,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:50:24,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:25,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:50:25,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 08:50:25,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 08:50:28,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:50:30,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 08:50:30,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:30,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:50:31,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 08:50:31,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 08:50:32,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 08:50:34,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:50:36,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 08:50:36,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 08:50:40,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:42,270 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 08:50:43,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:50:46,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:46,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:50:48,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 08:50:49,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 08:50:49,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:49,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:50:49,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:50:51,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:50:52,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.01 vs. limit=22.5 2023-09-29 08:50:54,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:50:57,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:50:57,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:50:58,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:51:02,105 INFO [train.py:1039] (3/4) Epoch 9, batch 4350, loss[loss=0.2094, simple_loss=0.283, pruned_loss=0.06793, over 24666.00 frames. ], tot_loss[loss=0.2112, simple_loss=0.279, pruned_loss=0.07172, over 4713662.77 frames. ], batch size: 65, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:51:03,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 08:51:03,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 08:51:09,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:12,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:15,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 08:51:15,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:51:20,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 08:51:20,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=312373.3333333333, ans=0.1 2023-09-29 08:51:25,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:51:26,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:51:26,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:51:29,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:51:31,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:51:33,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:51:38,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 08:51:39,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:51:39,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=312440.0, ans=0.2 2023-09-29 08:51:41,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:46,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:51:49,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 08:51:51,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:51:54,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:51:54,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=312506.6666666667, ans=0.0 2023-09-29 08:51:59,117 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 08:52:00,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:00,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 08:52:02,281 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 08:52:02,396 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 08:52:02,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:03,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:03,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:52:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:07,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:52:07,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:08,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 08:52:08,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:08,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:10,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:10,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 08:52:11,780 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 08:52:11,787 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 08:52:11,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 08:52:17,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:52:17,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:52:17,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:19,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:52:20,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 08:52:22,650 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.989e+02 2.158e+02 2.516e+02 4.089e+02, threshold=4.315e+02, percent-clipped=0.0 2023-09-29 08:52:22,937 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 08:52:22,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:24,260 INFO [train.py:1039] (3/4) Epoch 9, batch 4400, loss[loss=0.2238, simple_loss=0.2813, pruned_loss=0.08313, over 23431.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2804, pruned_loss=0.07221, over 4715128.19 frames. ], batch size: 134, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:52:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:29,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:30,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:52:33,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 08:52:33,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 08:52:33,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 08:52:33,901 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 08:52:35,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 08:52:35,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:52:38,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 08:52:41,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:52:42,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:44,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 08:52:47,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:52:47,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 08:52:48,035 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 08:52:49,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=312706.6666666667, ans=0.1 2023-09-29 08:52:51,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 08:52:53,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 08:52:53,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 08:52:53,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:52:53,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:53,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:52:53,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=312706.6666666667, ans=0.0 2023-09-29 08:52:54,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:52:57,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 08:52:57,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 08:52:57,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:02,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 08:53:02,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:02,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:03,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:53:03,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 08:53:05,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 08:53:08,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:15,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:53:18,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 08:53:21,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:53:25,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:27,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 08:53:28,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=312840.0, ans=0.1 2023-09-29 08:53:29,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 08:53:29,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 08:53:30,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:53:30,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 08:53:30,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:53:33,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 08:53:34,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=312906.6666666667, ans=0.125 2023-09-29 08:53:38,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 08:53:39,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 08:53:39,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:53:39,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 08:53:40,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 08:53:43,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:53:45,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 08:53:46,749 INFO [train.py:1039] (3/4) Epoch 9, batch 4450, loss[loss=0.192, simple_loss=0.2568, pruned_loss=0.06354, over 17134.00 frames. ], tot_loss[loss=0.2136, simple_loss=0.2819, pruned_loss=0.07262, over 4703186.15 frames. ], batch size: 37, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:53:50,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:53:50,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=312973.3333333333, ans=0.0 2023-09-29 08:53:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:53:53,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 08:53:59,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:00,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:54:06,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:08,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:54:11,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:54:11,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:13,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 08:54:13,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:13,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:14,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:14,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 08:54:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 08:54:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:22,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:25,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:54:25,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:54:25,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:54:30,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 08:54:32,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 08:54:32,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 08:54:32,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:54:35,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:39,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 08:54:42,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 08:54:45,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:46,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 08:54:46,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:46,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:46,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:54:46,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:54:49,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:54:52,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 08:54:52,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 08:54:55,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 08:54:55,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:54:56,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:54:58,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:54:58,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 08:55:00,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 08:55:03,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 08:55:04,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:55:06,710 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.016e+02 2.496e+02 2.860e+02 5.111e+02, threshold=4.992e+02, percent-clipped=1.0 2023-09-29 08:55:08,151 INFO [train.py:1039] (3/4) Epoch 9, batch 4500, loss[loss=0.1936, simple_loss=0.2743, pruned_loss=0.05648, over 24488.00 frames. ], tot_loss[loss=0.2128, simple_loss=0.2809, pruned_loss=0.07231, over 4709902.15 frames. ], batch size: 63, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:55:09,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:12,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 08:55:12,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 08:55:13,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:20,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:55:20,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:55:20,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 08:55:22,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:55:22,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:23,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:55:30,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.06 vs. limit=15.0 2023-09-29 08:55:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:55:36,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:55:38,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:55:40,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 08:55:41,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 08:55:49,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:55:53,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 08:55:59,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.29 vs. limit=12.0 2023-09-29 08:55:59,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:56:01,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=313506.6666666667, ans=0.1 2023-09-29 08:56:02,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 08:56:02,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 08:56:02,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:04,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:05,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:56:07,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=313506.6666666667, ans=0.0 2023-09-29 08:56:08,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:56:09,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 08:56:09,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 08:56:09,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:11,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=313506.6666666667, ans=0.125 2023-09-29 08:56:16,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 08:56:16,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 08:56:17,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:21,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 08:56:21,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:56:22,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 08:56:25,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 08:56:25,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 08:56:30,960 INFO [train.py:1039] (3/4) Epoch 9, batch 4550, loss[loss=0.1834, simple_loss=0.2544, pruned_loss=0.0562, over 24311.00 frames. ], tot_loss[loss=0.2115, simple_loss=0.2798, pruned_loss=0.07158, over 4711293.26 frames. ], batch size: 56, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:56:31,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 08:56:34,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 08:56:35,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:37,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:38,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:56:39,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=313640.0, ans=0.0 2023-09-29 08:56:41,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:45,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:56:47,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:56:48,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:56:50,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:56:50,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:56:52,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:56:52,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:56:54,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=313706.6666666667, ans=0.0 2023-09-29 08:56:55,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=313706.6666666667, ans=0.125 2023-09-29 08:56:59,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:00,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 08:57:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 08:57:02,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 08:57:04,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 08:57:06,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 08:57:06,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:07,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=313773.3333333333, ans=0.125 2023-09-29 08:57:10,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 08:57:12,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 08:57:14,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:16,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 08:57:19,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 08:57:22,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:24,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:25,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 08:57:26,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:27,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 08:57:29,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 08:57:29,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 08:57:31,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 08:57:34,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 08:57:34,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 08:57:36,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:36,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:57:37,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:37,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 08:57:39,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 08:57:39,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 08:57:42,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:57:42,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 08:57:44,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 08:57:44,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 08:57:44,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 08:57:47,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 08:57:47,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:57:51,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:57:52,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.671e+02 2.108e+02 2.512e+02 3.014e+02 4.343e+02, threshold=5.024e+02, percent-clipped=0.0 2023-09-29 08:57:52,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:57:52,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 08:57:54,161 INFO [train.py:1039] (3/4) Epoch 9, batch 4600, loss[loss=0.1944, simple_loss=0.2818, pruned_loss=0.0535, over 24322.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2789, pruned_loss=0.07137, over 4721473.09 frames. ], batch size: 74, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 08:57:54,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:57:55,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 08:57:57,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:57:59,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:58:02,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 08:58:02,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 08:58:04,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:06,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 08:58:07,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 08:58:11,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:58:11,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:15,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:22,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 08:58:24,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:26,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:29,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:58:29,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:58:35,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 08:58:35,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 08:58:37,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:58:41,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:42,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 08:58:44,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 08:58:47,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 08:58:49,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 08:58:54,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:55,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:58:57,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:57,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 08:58:57,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:58:57,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 08:58:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:58:59,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:02,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 08:59:04,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:05,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 08:59:05,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 08:59:05,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 08:59:07,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:09,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:09,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=314240.0, ans=0.125 2023-09-29 08:59:11,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:11,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 08:59:17,961 INFO [train.py:1039] (3/4) Epoch 9, batch 4650, loss[loss=0.2145, simple_loss=0.2752, pruned_loss=0.07693, over 23484.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2774, pruned_loss=0.071, over 4703941.40 frames. ], batch size: 285, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 08:59:20,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=314306.6666666667, ans=0.125 2023-09-29 08:59:21,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 08:59:26,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:26,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:26,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 08:59:26,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 08:59:27,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 08:59:29,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 08:59:31,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 08:59:36,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 08:59:37,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 08:59:37,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 08:59:39,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 08:59:39,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 08:59:39,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 08:59:40,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 08:59:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:40,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 08:59:41,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=314373.3333333333, ans=0.125 2023-09-29 08:59:44,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 08:59:46,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:46,743 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 08:59:49,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 08:59:51,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 08:59:54,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 08:59:54,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 08:59:55,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 08:59:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:00:02,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:00:05,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:10,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:13,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:00:13,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:00:14,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=314506.6666666667, ans=0.0 2023-09-29 09:00:17,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 09:00:19,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 09:00:19,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 09:00:19,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 09:00:20,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:27,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:00:27,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:27,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 09:00:27,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:27,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=314573.3333333333, ans=0.125 2023-09-29 09:00:30,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:30,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:00:30,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:00:34,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:00:34,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:00:35,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:00:38,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:38,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:00:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:00:40,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 2.149e+02 2.483e+02 2.991e+02 4.624e+02, threshold=4.965e+02, percent-clipped=0.0 2023-09-29 09:00:40,309 INFO [train.py:1039] (3/4) Epoch 9, batch 4700, loss[loss=0.197, simple_loss=0.2618, pruned_loss=0.06615, over 14875.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2785, pruned_loss=0.07178, over 4686066.36 frames. ], batch size: 32, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:00:40,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:00:42,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:00:42,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 09:00:50,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:00:52,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:00:52,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:00:53,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:00:57,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:01:01,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 09:01:03,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 09:01:05,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:06,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:01:08,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:01:11,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:16,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:01:16,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=314773.3333333333, ans=0.95 2023-09-29 09:01:17,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 09:01:21,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:01:26,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 09:01:28,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:01:30,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:33,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=314840.0, ans=0.0 2023-09-29 09:01:35,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 09:01:35,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:01:41,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:01:41,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 09:01:42,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:42,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:46,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:01:46,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:01:47,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 09:01:48,096 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 09:01:49,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:01:49,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=314906.6666666667, ans=0.125 2023-09-29 09:01:53,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:53,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 09:01:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:01:57,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=314906.6666666667, ans=0.2 2023-09-29 09:01:58,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 09:02:01,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:02:02,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.71 vs. limit=15.0 2023-09-29 09:02:03,561 INFO [train.py:1039] (3/4) Epoch 9, batch 4750, loss[loss=0.1983, simple_loss=0.2634, pruned_loss=0.06661, over 23331.00 frames. ], tot_loss[loss=0.2108, simple_loss=0.2787, pruned_loss=0.07138, over 4688487.92 frames. ], batch size: 119, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:02:03,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:08,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:09,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:02:11,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 09:02:11,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:14,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 09:02:16,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:02:16,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:02:16,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=314973.3333333333, ans=0.125 2023-09-29 09:02:16,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=314973.3333333333, ans=0.2 2023-09-29 09:02:17,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:19,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=315040.0, ans=0.5 2023-09-29 09:02:24,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 09:02:26,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=315040.0, ans=0.0 2023-09-29 09:02:28,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:02:30,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 09:02:31,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:02:33,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=315040.0, ans=0.125 2023-09-29 09:02:36,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:02:36,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:02:38,258 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 09:02:38,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 09:02:43,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 09:02:43,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=315106.6666666667, ans=0.2 2023-09-29 09:02:46,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:02:48,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:02:50,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:02:50,176 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 09:02:50,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:02:50,508 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=315106.6666666667, ans=0.1 2023-09-29 09:02:52,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=315173.3333333333, ans=0.125 2023-09-29 09:02:54,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:02:55,434 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.07 vs. limit=15.0 2023-09-29 09:02:58,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:03:00,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 09:03:01,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 09:03:01,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:01,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:03:03,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:03,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:03:05,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 09:03:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 09:03:10,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:03:11,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 09:03:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:13,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:15,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:03:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:17,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:03:20,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:20,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 09:03:21,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 09:03:23,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 09:03:26,100 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.000e+02 2.225e+02 2.502e+02 3.899e+02, threshold=4.449e+02, percent-clipped=0.0 2023-09-29 09:03:26,144 INFO [train.py:1039] (3/4) Epoch 9, batch 4800, loss[loss=0.3504, simple_loss=0.3711, pruned_loss=0.1649, over 19500.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2805, pruned_loss=0.07277, over 4689913.19 frames. ], batch size: 388, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:03:26,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:03:26,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:03:27,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 09:03:29,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=315306.6666666667, ans=0.05 2023-09-29 09:03:35,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:36,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:42,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:03:43,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:03:43,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:03:45,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 09:03:45,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:03:45,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:03:48,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:03:49,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=315373.3333333333, ans=0.125 2023-09-29 09:03:53,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:03:56,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:56,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:03:57,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:03:57,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:03:57,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:03:59,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:03,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:05,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:04:06,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:04:08,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:04:08,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=315440.0, ans=0.125 2023-09-29 09:04:10,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:10,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 09:04:10,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=315440.0, ans=0.0 2023-09-29 09:04:12,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 09:04:12,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:12,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:04:13,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:04:13,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:13,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:04:15,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:04:17,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:04:19,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:20,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=315506.6666666667, ans=0.125 2023-09-29 09:04:22,713 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.13 vs. limit=15.0 2023-09-29 09:04:24,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:24,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=315506.6666666667, ans=0.125 2023-09-29 09:04:25,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:29,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 09:04:29,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:30,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:30,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:04:32,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:35,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:04:36,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:04:36,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:36,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:04:36,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:04:37,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=315573.3333333333, ans=0.0 2023-09-29 09:04:38,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:04:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:04:42,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:42,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:04:44,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 09:04:46,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=315573.3333333333, ans=0.125 2023-09-29 09:04:47,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 09:04:47,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:47,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:04:49,623 INFO [train.py:1039] (3/4) Epoch 9, batch 4850, loss[loss=0.2033, simple_loss=0.2794, pruned_loss=0.06362, over 24001.00 frames. ], tot_loss[loss=0.213, simple_loss=0.2804, pruned_loss=0.07282, over 4686306.43 frames. ], batch size: 86, lr: 1.13e-02, grad_scale: 16.0 2023-09-29 09:04:49,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:04:49,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:04:50,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=315640.0, ans=0.125 2023-09-29 09:04:52,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:04:53,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.30 vs. limit=22.5 2023-09-29 09:04:58,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=315640.0, ans=0.125 2023-09-29 09:05:00,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 09:05:03,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:08,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:08,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:05:08,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:13,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:05:13,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:05:14,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:05:14,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 09:05:20,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:05:22,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:05:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:05:22,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:05:22,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 09:05:24,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=315773.3333333333, ans=0.2 2023-09-29 09:05:26,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:05:26,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:31,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 09:05:32,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 09:05:32,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:05:40,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:05:41,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 09:05:42,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:05:42,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:05:43,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:05:45,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 09:05:45,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:46,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 09:05:48,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:05:50,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:05:50,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 09:05:56,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=315906.6666666667, ans=0.0 2023-09-29 09:05:59,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:05:59,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=315906.6666666667, ans=0.0 2023-09-29 09:06:04,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:06:04,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:09,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 09:06:09,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:06:12,307 INFO [train.py:1039] (3/4) Epoch 9, batch 4900, loss[loss=0.2089, simple_loss=0.2896, pruned_loss=0.06412, over 24035.00 frames. ], tot_loss[loss=0.212, simple_loss=0.2789, pruned_loss=0.07256, over 4674383.87 frames. ], batch size: 80, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:06:13,850 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.814e+02 2.067e+02 2.446e+02 3.189e+02 7.103e+02, threshold=4.893e+02, percent-clipped=2.0 2023-09-29 09:06:14,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:15,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:15,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:06:16,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.75 vs. limit=22.5 2023-09-29 09:06:16,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.45 vs. limit=6.0 2023-09-29 09:06:18,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=315973.3333333333, ans=0.0 2023-09-29 09:06:19,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 09:06:24,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 09:06:28,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 09:06:28,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=316040.0, ans=0.125 2023-09-29 09:06:31,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 09:06:31,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:33,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:06:33,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:06:33,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:33,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:06:34,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 09:06:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 09:06:37,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:06:39,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:06:41,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:06:41,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=316040.0, ans=0.0 2023-09-29 09:06:44,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:06:44,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:44,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=316106.6666666667, ans=0.125 2023-09-29 09:06:46,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:06:46,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 09:06:47,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:06:49,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:06:50,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 09:06:50,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 09:06:54,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 09:06:54,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:06:56,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:06:57,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:06:57,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:06:57,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:06:57,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:06:59,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 09:07:01,215 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.87 vs. limit=15.0 2023-09-29 09:07:04,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:04,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=316173.3333333333, ans=0.0 2023-09-29 09:07:06,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:07:07,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:07:10,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 09:07:12,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:07:13,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:07:14,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 09:07:19,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:22,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:07:22,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=316240.0, ans=0.0 2023-09-29 09:07:23,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 09:07:23,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:23,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:07:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:30,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:30,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:07:30,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:07:30,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 09:07:32,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:07:35,208 INFO [train.py:1039] (3/4) Epoch 9, batch 4950, loss[loss=0.2203, simple_loss=0.2782, pruned_loss=0.08119, over 23758.00 frames. ], tot_loss[loss=0.2105, simple_loss=0.2776, pruned_loss=0.07165, over 4684127.77 frames. ], batch size: 179, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:07:37,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:37,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:07:39,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 09:07:41,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 09:07:41,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:07:42,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 09:07:42,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:07:44,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:07:44,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:07:47,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:07:49,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:07:49,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:07:49,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=316306.6666666667, ans=0.125 2023-09-29 09:07:50,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:07:50,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:07:52,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:07:55,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:08:00,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:01,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:08:02,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=12.04 vs. limit=15.0 2023-09-29 09:08:04,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:05,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:07,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:08:07,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 09:08:08,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 09:08:10,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:13,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:08:13,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:08:16,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:08:16,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:08:18,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:08:19,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=316440.0, ans=0.125 2023-09-29 09:08:20,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:21,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=316440.0, ans=0.0 2023-09-29 09:08:25,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:08:27,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:08:28,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:08:28,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:30,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 09:08:30,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:08:31,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:08:34,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:08:35,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:08:35,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:08:36,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:08:38,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:08:40,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:08:41,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:08:41,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:08:43,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:08:43,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 09:08:49,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:08:53,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 09:08:53,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:08:55,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.05 vs. limit=15.0 2023-09-29 09:08:58,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=316640.0, ans=0.0 2023-09-29 09:08:59,017 INFO [train.py:1039] (3/4) Epoch 9, batch 5000, loss[loss=0.219, simple_loss=0.2783, pruned_loss=0.07988, over 23769.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2768, pruned_loss=0.07131, over 4681864.46 frames. ], batch size: 164, lr: 1.13e-02, grad_scale: 8.0 2023-09-29 09:09:00,622 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.030e+02 2.416e+02 2.948e+02 4.844e+02, threshold=4.831e+02, percent-clipped=0.0 2023-09-29 09:09:00,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:00,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:02,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 09:09:03,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 09:09:04,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.90 vs. limit=15.0 2023-09-29 09:09:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:09:07,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 09:09:07,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:09:07,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:09:08,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 09:09:08,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:08,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:10,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 09:09:10,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:10,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:13,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 09:09:13,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 09:09:15,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:09:15,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 09:09:15,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:09:16,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:09:16,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 09:09:16,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 09:09:20,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 09:09:20,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:22,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 09:09:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:09:24,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:25,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:09:28,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:09:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 09:09:30,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:09:31,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:09:37,782 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 09:09:39,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:09:41,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:09:41,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:09:47,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 09:09:47,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:09:47,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:09:49,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:09:50,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 09:09:50,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:52,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=316840.0, ans=0.0 2023-09-29 09:09:54,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:09:55,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:02,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 09:10:07,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:07,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=316906.6666666667, ans=0.04949747468305833 2023-09-29 09:10:09,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=15.0 2023-09-29 09:10:16,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:10:19,283 INFO [train.py:1039] (3/4) Epoch 9, batch 5050, loss[loss=0.184, simple_loss=0.2631, pruned_loss=0.05244, over 24488.00 frames. ], tot_loss[loss=0.2102, simple_loss=0.2774, pruned_loss=0.07146, over 4679548.48 frames. ], batch size: 63, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:10:19,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:19,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:10:19,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:19,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:10:19,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:10:19,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:22,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=316973.3333333333, ans=0.125 2023-09-29 09:10:24,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:10:24,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 09:10:27,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:10:30,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:10:31,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:10:32,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 09:10:33,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:34,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:10:37,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:10:39,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:10:41,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:10:49,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 09:10:50,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:10:50,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:10:52,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 09:10:52,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:10:53,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:53,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:10:54,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.09 vs. limit=10.0 2023-09-29 09:10:55,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:10:55,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 09:10:55,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 09:10:56,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:10:59,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:02,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:11:02,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 09:11:03,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:06,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 09:11:09,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:11:09,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:11:11,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:11,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=317173.3333333333, ans=0.125 2023-09-29 09:11:12,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:11:14,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:11:17,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:11:19,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:19,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:11:19,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:11:20,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 09:11:20,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:11:23,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:11:24,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=317240.0, ans=0.2 2023-09-29 09:11:27,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:11:27,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 09:11:27,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:11:28,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:11:30,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:31,684 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 09:11:35,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:35,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 09:11:35,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:40,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:11:41,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.52 vs. limit=15.0 2023-09-29 09:11:41,484 INFO [train.py:1039] (3/4) Epoch 9, batch 5100, loss[loss=0.1824, simple_loss=0.2551, pruned_loss=0.05488, over 24314.00 frames. ], tot_loss[loss=0.211, simple_loss=0.2788, pruned_loss=0.07161, over 4688663.19 frames. ], batch size: 56, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:11:41,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:11:41,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 09:11:43,048 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.653e+02 2.032e+02 2.319e+02 2.659e+02 3.992e+02, threshold=4.638e+02, percent-clipped=0.0 2023-09-29 09:11:43,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 09:11:45,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:45,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:11:45,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=317306.6666666667, ans=0.125 2023-09-29 09:11:46,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:11:49,297 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 09:11:52,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:11:56,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 09:11:57,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 09:11:57,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:11:59,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:12:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:12:02,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 09:12:02,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 09:12:03,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.03 vs. limit=15.0 2023-09-29 09:12:04,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=317373.3333333333, ans=0.125 2023-09-29 09:12:05,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=317373.3333333333, ans=0.1 2023-09-29 09:12:06,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:12:08,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:12:12,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:12:15,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 09:12:15,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:16,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:12:16,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 09:12:19,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:22,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 09:12:24,055 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 09:12:24,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:24,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 09:12:24,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 09:12:28,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:12:35,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=317506.6666666667, ans=0.125 2023-09-29 09:12:38,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:12:41,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 09:12:43,832 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 09:12:43,846 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 09:12:45,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 09:12:45,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:12:47,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 09:12:50,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=317573.3333333333, ans=0.2 2023-09-29 09:12:51,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 09:12:54,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 09:12:56,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:12:59,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 09:13:00,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:13:00,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 09:13:05,220 INFO [train.py:1039] (3/4) Epoch 9, batch 5150, loss[loss=0.2241, simple_loss=0.2911, pruned_loss=0.07852, over 23761.00 frames. ], tot_loss[loss=0.2129, simple_loss=0.2804, pruned_loss=0.07271, over 4688378.60 frames. ], batch size: 232, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:13:07,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:13:07,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:13:07,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:13:08,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:13:08,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:13:10,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:13:10,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 09:13:10,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 09:13:12,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 09:13:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:13:13,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 09:13:13,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:15,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:13:17,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:19,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:13:23,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:13:23,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 09:13:25,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:13:26,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:13:28,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:13:28,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:28,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:28,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:13:28,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:13:30,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 09:13:32,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:13:32,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:13:34,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:13:36,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=317706.6666666667, ans=0.0 2023-09-29 09:13:37,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 09:13:37,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:13:43,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:13:45,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 09:13:49,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:13:54,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:13:55,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:01,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:01,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:04,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 09:14:07,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:14:08,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:14:09,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:14:13,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:15,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:14:16,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 09:14:17,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=14.71 vs. limit=15.0 2023-09-29 09:14:21,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:14:23,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:14:25,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:14:25,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:14:25,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:14:25,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:14:26,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:14:26,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:14:28,377 INFO [train.py:1039] (3/4) Epoch 9, batch 5200, loss[loss=0.232, simple_loss=0.2976, pruned_loss=0.08323, over 23301.00 frames. ], tot_loss[loss=0.2143, simple_loss=0.2815, pruned_loss=0.07356, over 4685266.36 frames. ], batch size: 93, lr: 1.12e-02, grad_scale: 16.0 2023-09-29 09:14:29,841 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.704e+02 2.093e+02 2.397e+02 2.811e+02 4.237e+02, threshold=4.795e+02, percent-clipped=0.0 2023-09-29 09:14:30,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=317973.3333333333, ans=0.125 2023-09-29 09:14:31,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:14:32,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:14:36,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:40,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 09:14:40,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:14:41,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:42,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=317973.3333333333, ans=0.0 2023-09-29 09:14:42,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=317973.3333333333, ans=6.0 2023-09-29 09:14:43,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=318040.0, ans=0.2 2023-09-29 09:14:44,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:14:45,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:14:45,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:14:48,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 09:14:49,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:14:49,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=318040.0, ans=0.125 2023-09-29 09:14:51,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:14:53,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 09:14:55,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:14:55,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:14:56,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 09:14:56,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 09:14:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 09:15:01,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:01,205 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 09:15:01,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:15:02,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:02,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:15:03,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=318106.6666666667, ans=0.2 2023-09-29 09:15:04,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 09:15:05,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:08,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:11,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 09:15:11,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 09:15:13,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 09:15:17,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 09:15:17,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:15:25,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:15:25,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:27,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 09:15:27,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:15:27,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:15:27,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:29,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:15:33,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:33,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:15:38,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:15:40,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:40,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:43,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=318240.0, ans=0.0 2023-09-29 09:15:47,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:15:47,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 09:15:47,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:15:48,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:15:49,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:15:49,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:15:49,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=318306.6666666667, ans=0.2 2023-09-29 09:15:50,512 INFO [train.py:1039] (3/4) Epoch 9, batch 5250, loss[loss=0.178, simple_loss=0.2548, pruned_loss=0.05057, over 24432.00 frames. ], tot_loss[loss=0.2124, simple_loss=0.2796, pruned_loss=0.07261, over 4676583.02 frames. ], batch size: 58, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:15:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:15:52,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:15:55,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:15:56,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:15:57,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:15:57,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=318306.6666666667, ans=0.0 2023-09-29 09:16:04,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:16:04,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:16:07,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:16:08,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:16:09,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=318373.3333333333, ans=0.125 2023-09-29 09:16:10,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 09:16:11,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:16:12,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:16:35,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=318506.6666666667, ans=0.125 2023-09-29 09:16:38,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=318506.6666666667, ans=0.1 2023-09-29 09:17:04,990 INFO [train.py:1039] (3/4) Epoch 9, batch 5300, loss[loss=0.1979, simple_loss=0.2363, pruned_loss=0.07976, over 19081.00 frames. ], tot_loss[loss=0.2109, simple_loss=0.2779, pruned_loss=0.07198, over 4687092.73 frames. ], batch size: 388, lr: 1.12e-02, grad_scale: 8.0 2023-09-29 09:17:07,706 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.994e+02 2.180e+02 2.520e+02 5.243e+02, threshold=4.360e+02, percent-clipped=1.0 2023-09-29 09:17:11,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=318640.0, ans=0.07 2023-09-29 09:17:20,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:17:20,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 09:17:20,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 09:17:20,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:20,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:20,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:20,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:20,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:20,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:17:20,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:21,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:17:21,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:17:21,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 09:17:21,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 09:17:21,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 09:17:21,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:17:22,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 09:17:22,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 09:17:22,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:23,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:23,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:23,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:23,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:17:24,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:24,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:17:24,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:24,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:17:24,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:17:24,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:17:24,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:17:25,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 09:17:25,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:17:26,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:17:26,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 09:17:26,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 09:17:26,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:17:26,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:17:26,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 09:17:27,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 09:17:27,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:27,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:17:28,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:17:28,223 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 09:17:28,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 09:17:28,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:17:28,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:17:28,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 09:17:28,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 09:17:28,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 09:17:29,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:17:39,049 INFO [train.py:1039] (3/4) Epoch 10, batch 0, loss[loss=0.233, simple_loss=0.2928, pruned_loss=0.08661, over 23782.00 frames. ], tot_loss[loss=0.233, simple_loss=0.2928, pruned_loss=0.08661, over 23782.00 frames. ], batch size: 164, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:17:39,049 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 09:17:53,004 INFO [train.py:1071] (3/4) Epoch 10, validation: loss=0.3048, simple_loss=0.281, pruned_loss=0.1643, over 1125622.00 frames. 2023-09-29 09:17:53,005 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 09:17:56,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 09:17:56,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:17:58,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:18:03,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:03,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:18:05,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:06,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 09:18:08,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 09:18:09,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:10,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:18:13,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:13,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:18:13,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:15,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 09:18:17,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:18:20,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=318786.6666666667, ans=0.125 2023-09-29 09:18:22,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=318786.6666666667, ans=0.1 2023-09-29 09:18:23,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:18:23,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:25,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 09:18:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:18:30,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:18:33,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:34,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=318853.3333333333, ans=0.09899494936611666 2023-09-29 09:18:35,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=318853.3333333333, ans=0.2 2023-09-29 09:18:36,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:18:42,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:18:47,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 09:18:51,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 09:18:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:18:53,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:18:53,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:18:53,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:18:56,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 09:19:00,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:00,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:19:00,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=318986.6666666667, ans=0.2 2023-09-29 09:19:01,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=318986.6666666667, ans=0.2 2023-09-29 09:19:06,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:08,826 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 09:19:10,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:19:12,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:13,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:19:13,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 09:19:15,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:19:15,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:19:16,488 INFO [train.py:1039] (3/4) Epoch 10, batch 50, loss[loss=0.2122, simple_loss=0.2785, pruned_loss=0.07299, over 23613.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2794, pruned_loss=0.06918, over 1059863.11 frames. ], batch size: 232, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:19:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:18,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:19:22,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:19:26,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 09:19:26,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:32,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:19:34,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 09:19:37,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 09:19:39,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:19:41,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:19:41,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:43,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:19:43,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=319120.0, ans=0.1 2023-09-29 09:19:43,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=319120.0, ans=0.1 2023-09-29 09:19:44,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:19:44,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:19:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:19:45,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.68 vs. limit=15.0 2023-09-29 09:19:52,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:19:52,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=319186.6666666667, ans=10.0 2023-09-29 09:19:54,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:19:54,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:19:55,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 09:19:57,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:19:57,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:19:57,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 09:19:58,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:00,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 09:20:03,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.91 vs. limit=15.0 2023-09-29 09:20:03,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.51 vs. limit=15.0 2023-09-29 09:20:06,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:06,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:20:08,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:10,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:10,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:12,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=319253.3333333333, ans=0.125 2023-09-29 09:20:14,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 09:20:14,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 09:20:16,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:16,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:20:17,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:20:18,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:20:18,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 09:20:19,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 09:20:20,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 09:20:22,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.676e+02 2.176e+02 2.452e+02 2.821e+02 3.971e+02, threshold=4.904e+02, percent-clipped=0.0 2023-09-29 09:20:22,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:22,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:20:24,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 09:20:24,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 09:20:24,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:25,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:27,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:20:27,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:20:30,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:20:34,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:20:36,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:36,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=319320.0, ans=0.0 2023-09-29 09:20:38,812 INFO [train.py:1039] (3/4) Epoch 10, batch 100, loss[loss=0.2179, simple_loss=0.2813, pruned_loss=0.07725, over 23779.00 frames. ], tot_loss[loss=0.2118, simple_loss=0.281, pruned_loss=0.07132, over 1875631.12 frames. ], batch size: 164, lr: 1.07e-02, grad_scale: 16.0 2023-09-29 09:20:38,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 09:20:38,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:20:45,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:20:45,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:45,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:20:45,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:20:45,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:20:47,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 09:20:49,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:20:49,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:49,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:49,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:20:55,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 09:20:56,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:20:57,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:20:58,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:20:58,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=319453.3333333333, ans=0.125 2023-09-29 09:21:01,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:21:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 09:21:04,807 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 09:21:06,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:06,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:21:09,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:21:12,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:21:14,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:16,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=319520.0, ans=0.1 2023-09-29 09:21:21,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:21,292 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 09:21:23,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:21:27,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:21:29,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:21:30,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:33,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:36,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:38,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:21:40,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=319586.6666666667, ans=10.0 2023-09-29 09:21:41,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:43,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:43,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:43,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:21:45,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:21:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 09:21:45,602 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 09:21:47,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:21:48,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:21:48,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:48,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:48,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 09:21:48,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:21:50,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:21:50,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:21:51,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:21:51,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:51,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:21:53,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:21:55,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:21:59,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:21:59,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:00,950 INFO [train.py:1039] (3/4) Epoch 10, batch 150, loss[loss=0.2039, simple_loss=0.2863, pruned_loss=0.06074, over 24443.00 frames. ], tot_loss[loss=0.2107, simple_loss=0.2806, pruned_loss=0.07041, over 2514997.24 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:22:01,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:02,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:04,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:05,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=319720.0, ans=0.125 2023-09-29 09:22:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:22:08,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:13,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 09:22:13,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 09:22:13,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 09:22:15,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=319786.6666666667, ans=0.125 2023-09-29 09:22:16,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:22:16,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:22:17,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:22:19,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:22:19,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:19,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:20,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:22:21,638 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 09:22:23,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:22:29,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:32,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:22:35,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 09:22:36,849 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.22 vs. limit=10.0 2023-09-29 09:22:38,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:22:38,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:22:38,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:22:41,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:22:43,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=319853.3333333333, ans=0.1 2023-09-29 09:22:44,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:22:44,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:22:46,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:47,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 09:22:53,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:22:55,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:22:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:22:55,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:22:59,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:00,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 09:23:02,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:23:04,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:23:06,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:08,105 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.959e+02 2.278e+02 2.639e+02 3.877e+02, threshold=4.556e+02, percent-clipped=0.0 2023-09-29 09:23:08,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:23:08,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 09:23:08,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:23:08,400 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 09:23:17,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:22,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:23:23,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:23:25,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 09:23:26,911 INFO [train.py:1039] (3/4) Epoch 10, batch 200, loss[loss=0.2148, simple_loss=0.2912, pruned_loss=0.06923, over 23535.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2817, pruned_loss=0.07101, over 3005496.98 frames. ], batch size: 85, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:23:27,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:23:27,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:31,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 09:23:31,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=320053.3333333333, ans=0.2 2023-09-29 09:23:32,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.25 vs. limit=15.0 2023-09-29 09:23:32,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:23:34,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:35,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:23:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:23:41,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:23:41,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:23:50,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=320120.0, ans=0.125 2023-09-29 09:23:54,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=320120.0, ans=0.125 2023-09-29 09:23:54,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=320120.0, ans=0.04949747468305833 2023-09-29 09:24:00,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:24:02,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:24:03,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:24:05,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:24:06,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:24:06,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:24:06,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=320186.6666666667, ans=0.125 2023-09-29 09:24:08,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:08,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:24:08,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=320186.6666666667, ans=0.0 2023-09-29 09:24:09,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:09,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:12,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 09:24:12,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 09:24:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:13,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=320253.3333333333, ans=0.0 2023-09-29 09:24:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:24:25,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:24:25,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=320253.3333333333, ans=0.1 2023-09-29 09:24:30,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:32,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:24:38,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:38,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=320320.0, ans=0.1 2023-09-29 09:24:41,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 09:24:42,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:42,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:24:42,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:24:43,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:24:43,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=320320.0, ans=0.0 2023-09-29 09:24:44,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 09:24:44,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:24:44,713 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 09:24:46,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:47,883 INFO [train.py:1039] (3/4) Epoch 10, batch 250, loss[loss=0.2151, simple_loss=0.2792, pruned_loss=0.07546, over 23162.00 frames. ], tot_loss[loss=0.2094, simple_loss=0.2793, pruned_loss=0.06974, over 3390255.37 frames. ], batch size: 105, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:24:50,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:24:51,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:51,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:24:53,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:24:53,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:24:54,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.97 vs. limit=22.5 2023-09-29 09:24:57,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:25:00,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:09,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=320453.3333333333, ans=0.2 2023-09-29 09:25:10,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:13,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:25:14,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:25:21,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:25:21,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:25:22,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:25:22,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:24,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:25:24,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:25:26,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:25:26,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=320520.0, ans=0.125 2023-09-29 09:25:29,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:25:33,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 09:25:33,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:25:35,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:25:35,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:25:36,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:25:36,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:25:38,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:25:38,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:25:40,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:41,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:25:41,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:25:43,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=320586.6666666667, ans=0.1 2023-09-29 09:25:46,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:25:50,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:25:53,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:25:55,088 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 2.019e+02 2.235e+02 2.582e+02 3.547e+02, threshold=4.469e+02, percent-clipped=0.0 2023-09-29 09:25:59,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:02,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:26:07,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 09:26:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:08,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:26:08,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 09:26:10,202 INFO [train.py:1039] (3/4) Epoch 10, batch 300, loss[loss=0.1874, simple_loss=0.2639, pruned_loss=0.05551, over 24343.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2774, pruned_loss=0.0691, over 3680887.45 frames. ], batch size: 61, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:26:10,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:26:11,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:26:11,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 09:26:14,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=320720.0, ans=0.95 2023-09-29 09:26:16,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:26:18,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:26:21,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:26:21,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 09:26:23,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:26:24,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:26:24,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 09:26:24,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:27,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:26:32,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:26:34,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 09:26:38,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 09:26:38,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:41,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:26:43,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:43,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 09:26:43,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:26:43,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=320853.3333333333, ans=6.0 2023-09-29 09:26:45,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:26:46,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:26:48,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:26:52,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:26:53,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 09:26:53,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:26:56,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:26:57,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 09:26:57,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:01,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:27:04,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:27:04,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 09:27:06,447 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:27:07,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:07,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:27:11,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:13,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=320920.0, ans=0.1 2023-09-29 09:27:14,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:27:14,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 09:27:14,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:27:15,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:17,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 09:27:21,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:27:21,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:21,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=320986.6666666667, ans=0.125 2023-09-29 09:27:22,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:22,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:22,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:27,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:27,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 09:27:30,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:31,837 INFO [train.py:1039] (3/4) Epoch 10, batch 350, loss[loss=0.2173, simple_loss=0.2978, pruned_loss=0.06842, over 24406.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2755, pruned_loss=0.06901, over 3903801.07 frames. ], batch size: 77, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:27:35,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:27:40,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:27:41,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:42,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=321053.3333333333, ans=0.0 2023-09-29 09:27:44,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.14 vs. limit=6.0 2023-09-29 09:27:45,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 09:27:47,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:27:47,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 09:27:49,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:27:50,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 09:27:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:27:55,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 09:27:57,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:27:59,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:28:00,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:28:02,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:02,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:02,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:02,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:28:05,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:05,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:05,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=321186.6666666667, ans=0.125 2023-09-29 09:28:12,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:28:12,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:28:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:28:15,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:21,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 09:28:21,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:28:27,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:28:27,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:27,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:28:29,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 09:28:32,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:32,342 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 09:28:35,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 09:28:35,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:39,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 2.071e+02 2.540e+02 3.083e+02 5.946e+02, threshold=5.081e+02, percent-clipped=4.0 2023-09-29 09:28:39,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:28:39,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 09:28:43,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:44,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:28:46,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:28:47,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:47,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:50,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:28:52,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=321320.0, ans=0.125 2023-09-29 09:28:53,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:28:55,217 INFO [train.py:1039] (3/4) Epoch 10, batch 400, loss[loss=0.2181, simple_loss=0.2819, pruned_loss=0.07714, over 23834.00 frames. ], tot_loss[loss=0.2072, simple_loss=0.2763, pruned_loss=0.06898, over 4097546.29 frames. ], batch size: 195, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:28:56,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:28:58,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 09:28:58,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:28:58,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:02,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:29:02,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:05,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:07,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:08,067 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.60 vs. limit=12.0 2023-09-29 09:29:08,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 09:29:10,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 09:29:10,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:11,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 09:29:11,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:16,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:29:16,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:16,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 09:29:18,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:29:18,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:29:18,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:19,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:29:22,570 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 09:29:22,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 09:29:29,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:29:30,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:29:31,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 09:29:31,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=321520.0, ans=0.1 2023-09-29 09:29:33,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 09:29:33,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=321520.0, ans=0.0 2023-09-29 09:29:33,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=321520.0, ans=0.125 2023-09-29 09:29:34,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:29:37,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:29:45,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 09:29:48,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:29:49,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 09:29:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:29:54,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:29:55,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 09:29:58,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:30:02,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:30:04,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:30:07,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:07,713 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.56 vs. limit=22.5 2023-09-29 09:30:08,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 09:30:08,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=321653.3333333333, ans=0.0 2023-09-29 09:30:10,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:30:11,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 09:30:11,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=321653.3333333333, ans=0.125 2023-09-29 09:30:15,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:30:15,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:30:16,745 INFO [train.py:1039] (3/4) Epoch 10, batch 450, loss[loss=0.23, simple_loss=0.2907, pruned_loss=0.08463, over 23784.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2769, pruned_loss=0.06889, over 4241113.19 frames. ], batch size: 179, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:30:16,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 09:30:18,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:30:18,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:30:20,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:30:22,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 09:30:22,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:30:23,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:30:25,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:30:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 09:30:26,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:30:26,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:30:29,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:30:37,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:38,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:30:41,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 09:30:41,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 09:30:45,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:30:48,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:30:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:30:56,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:56,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:30:59,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 09:30:59,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 09:31:00,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 09:31:02,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:02,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:03,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:31:05,351 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 09:31:05,365 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 09:31:05,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:31:06,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:31:07,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=321920.0, ans=0.125 2023-09-29 09:31:09,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:31:12,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:31:14,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:31:14,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:31:15,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 09:31:17,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:17,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=321920.0, ans=0.0 2023-09-29 09:31:20,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:31:21,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:31:22,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 09:31:25,381 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.898e+02 2.147e+02 2.621e+02 3.477e+02, threshold=4.294e+02, percent-clipped=0.0 2023-09-29 09:31:25,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:31:27,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 09:31:29,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 09:31:30,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:31:35,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:31:36,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:31:38,100 INFO [train.py:1039] (3/4) Epoch 10, batch 500, loss[loss=0.1986, simple_loss=0.2812, pruned_loss=0.05799, over 24471.00 frames. ], tot_loss[loss=0.2081, simple_loss=0.2775, pruned_loss=0.06934, over 4346076.08 frames. ], batch size: 69, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:31:39,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:31:39,677 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 09:31:43,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:31:45,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:31:46,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:46,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 09:31:48,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 09:31:48,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:31:51,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:31:54,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 09:31:57,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:32:00,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:32:00,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:32:02,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:06,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=322120.0, ans=0.0 2023-09-29 09:32:08,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=322120.0, ans=0.125 2023-09-29 09:32:11,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:11,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:32:11,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=322186.6666666667, ans=0.1 2023-09-29 09:32:12,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:32:12,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:12,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 09:32:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 09:32:17,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:32:19,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:32:19,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:32:19,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:32:21,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 09:32:24,632 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 09:32:26,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:27,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:29,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:30,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:32:32,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 09:32:34,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:32:34,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=322253.3333333333, ans=0.0 2023-09-29 09:32:37,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:41,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:32:44,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:32:44,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=322320.0, ans=0.0 2023-09-29 09:32:48,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:52,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 09:32:52,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:32:52,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:32:52,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=322320.0, ans=0.0 2023-09-29 09:32:56,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 09:32:56,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:32:59,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:00,704 INFO [train.py:1039] (3/4) Epoch 10, batch 550, loss[loss=0.273, simple_loss=0.3242, pruned_loss=0.1109, over 19233.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2795, pruned_loss=0.07064, over 4422990.31 frames. ], batch size: 388, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:33:04,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 09:33:07,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 09:33:07,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:07,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 09:33:07,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=322386.6666666667, ans=0.0 2023-09-29 09:33:09,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:33:09,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:33:10,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:10,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=322386.6666666667, ans=0.0 2023-09-29 09:33:12,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:12,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:33:12,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=322386.6666666667, ans=0.125 2023-09-29 09:33:12,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=322386.6666666667, ans=10.0 2023-09-29 09:33:14,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:33:15,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:33:17,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 09:33:17,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:33:20,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:20,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:22,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:23,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:23,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=322453.3333333333, ans=0.125 2023-09-29 09:33:25,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:29,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 09:33:31,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 09:33:32,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:33:37,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:33:37,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:39,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:33:44,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:44,034 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 09:33:44,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=322520.0, ans=0.0 2023-09-29 09:33:46,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:33:47,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:33:49,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:33:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 09:33:49,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:33:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:33:52,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 09:33:54,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 09:33:55,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:33:55,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:33:57,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:33:57,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:34:00,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:34:00,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:34:02,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:34:04,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:07,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 09:34:08,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:34:10,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 2.009e+02 2.272e+02 2.657e+02 5.113e+02, threshold=4.543e+02, percent-clipped=1.0 2023-09-29 09:34:10,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:12,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:34:12,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:14,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:34:14,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 09:34:20,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.31 vs. limit=15.0 2023-09-29 09:34:21,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 09:34:22,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 09:34:24,177 INFO [train.py:1039] (3/4) Epoch 10, batch 600, loss[loss=0.1924, simple_loss=0.2649, pruned_loss=0.05998, over 20301.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2794, pruned_loss=0.07002, over 4496165.49 frames. ], batch size: 44, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:34:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:34:24,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:34:24,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:34:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:34:35,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:34:37,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 09:34:39,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:34:40,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:34:42,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:44,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=322786.6666666667, ans=0.2 2023-09-29 09:34:46,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 09:34:46,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:34:52,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 09:34:55,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:34:55,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:34:55,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:35:04,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:35:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:35:04,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:13,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:35:16,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:35:16,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:35:16,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:35:24,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 09:35:25,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=322920.0, ans=0.125 2023-09-29 09:35:28,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:35:29,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:35:33,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 09:35:33,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:35:36,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 09:35:38,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:35:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:35:45,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 09:35:46,810 INFO [train.py:1039] (3/4) Epoch 10, batch 650, loss[loss=0.2041, simple_loss=0.2642, pruned_loss=0.07196, over 23593.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2781, pruned_loss=0.06919, over 4540253.01 frames. ], batch size: 135, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:35:47,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 09:35:48,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=323053.3333333333, ans=0.09899494936611666 2023-09-29 09:35:49,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:35:51,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:35:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:35:56,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 09:35:56,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:36:01,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=323120.0, ans=0.0 2023-09-29 09:36:03,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:36:03,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:11,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 09:36:14,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:14,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:19,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:19,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:36:21,842 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=323186.6666666667, ans=0.04949747468305833 2023-09-29 09:36:23,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:23,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:23,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:36:24,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:25,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:36:28,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:36:28,094 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 09:36:28,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:28,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:33,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:33,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:35,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:36,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:36:36,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 09:36:38,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:36:38,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:36:38,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:36:38,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:36:39,048 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:36:40,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:36:41,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 09:36:44,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 09:36:44,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:44,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:36:44,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:36:45,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:36:46,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:36:46,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=323253.3333333333, ans=0.1 2023-09-29 09:36:54,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:36:55,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:36:56,671 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.920e+02 2.159e+02 2.398e+02 3.616e+02, threshold=4.317e+02, percent-clipped=0.0 2023-09-29 09:36:56,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:36:59,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:36:59,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:37:01,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:37:09,364 INFO [train.py:1039] (3/4) Epoch 10, batch 700, loss[loss=0.1768, simple_loss=0.2451, pruned_loss=0.05422, over 24319.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2758, pruned_loss=0.06862, over 4574918.97 frames. ], batch size: 56, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:37:09,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:37:09,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:09,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:10,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:37:14,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 09:37:16,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 09:37:19,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 09:37:19,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:20,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:37:22,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 09:37:29,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:37:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:37:32,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:32,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:37:34,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:37:36,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:37:39,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:37:39,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:37:42,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 09:37:45,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 09:37:45,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=323520.0, ans=0.2 2023-09-29 09:37:49,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:37:50,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:37:52,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:37:57,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:37:58,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 09:38:04,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:04,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:38:05,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 09:38:10,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:38:10,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:10,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=323586.6666666667, ans=0.1 2023-09-29 09:38:15,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:38:18,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:38:18,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 09:38:22,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 09:38:22,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 09:38:25,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:26,391 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.09 vs. limit=12.0 2023-09-29 09:38:27,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=323653.3333333333, ans=0.0 2023-09-29 09:38:28,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:28,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=323653.3333333333, ans=0.125 2023-09-29 09:38:30,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:38:32,729 INFO [train.py:1039] (3/4) Epoch 10, batch 750, loss[loss=0.1876, simple_loss=0.2593, pruned_loss=0.05796, over 20192.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2752, pruned_loss=0.06884, over 4600552.35 frames. ], batch size: 44, lr: 1.06e-02, grad_scale: 8.0 2023-09-29 09:38:32,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:32,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 09:38:33,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=323720.0, ans=0.0 2023-09-29 09:38:36,145 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=323720.0, ans=0.1 2023-09-29 09:38:37,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 09:38:37,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 09:38:38,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 09:38:38,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 09:38:39,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 09:38:40,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:38:41,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 09:38:42,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:38:43,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:38:43,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:45,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:46,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:38:46,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:38:50,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:38:50,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:38:53,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:38:54,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=323786.6666666667, ans=0.0 2023-09-29 09:38:55,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:38:57,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:38:57,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 09:38:58,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:38:58,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:00,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:39:04,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:39:05,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 09:39:05,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:07,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 09:39:07,175 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 09:39:07,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 09:39:07,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:39:07,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 09:39:10,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:39:16,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=323853.3333333333, ans=0.0 2023-09-29 09:39:17,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:39:17,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:17,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:39:21,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:39:21,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:39:22,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 09:39:22,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:39:24,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 09:39:25,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:39:29,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:39:29,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 09:39:30,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:39:38,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:39:38,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:39:41,886 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.983e+02 2.343e+02 2.893e+02 4.717e+02, threshold=4.686e+02, percent-clipped=1.0 2023-09-29 09:39:41,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:39:43,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 09:39:43,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:39:43,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:39:49,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:39:51,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:39:53,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:39:54,408 INFO [train.py:1039] (3/4) Epoch 10, batch 800, loss[loss=0.1883, simple_loss=0.2642, pruned_loss=0.05622, over 24473.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2764, pruned_loss=0.06956, over 4625781.16 frames. ], batch size: 63, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:39:55,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=12.89 vs. limit=15.0 2023-09-29 09:40:00,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:00,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:02,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:40:02,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:04,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:04,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:06,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:12,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:40:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 09:40:16,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:19,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:40:19,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:40:19,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:19,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 09:40:21,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:21,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 09:40:24,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:25,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:40:27,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:40:27,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:40:30,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:30,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:35,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:40:35,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:40:35,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 09:40:35,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=324186.6666666667, ans=0.0 2023-09-29 09:40:36,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.19 vs. limit=15.0 2023-09-29 09:40:37,083 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 09:40:38,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 09:40:38,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:40:38,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:40:38,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:40:38,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=324186.6666666667, ans=0.2 2023-09-29 09:40:41,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:40:42,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.01 vs. limit=22.5 2023-09-29 09:40:46,230 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 09:40:46,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 09:40:49,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:40:49,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 09:40:51,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:40:54,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:40:54,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=324253.3333333333, ans=0.0 2023-09-29 09:40:56,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 09:40:57,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:41:01,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 09:41:09,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:10,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=324320.0, ans=0.0 2023-09-29 09:41:11,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:41:12,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 09:41:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:41:15,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:15,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=324386.6666666667, ans=0.2 2023-09-29 09:41:16,492 INFO [train.py:1039] (3/4) Epoch 10, batch 850, loss[loss=0.2423, simple_loss=0.2985, pruned_loss=0.09309, over 22791.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2778, pruned_loss=0.07006, over 4641272.37 frames. ], batch size: 322, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:41:16,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 09:41:16,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:20,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:41:21,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:23,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:41:25,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:41:26,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 09:41:28,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 09:41:28,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 09:41:28,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:41:28,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:41:30,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:41:31,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:41:31,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:41:37,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:37,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:41:37,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 09:41:42,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 09:41:45,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:41:47,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 09:41:53,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 09:41:55,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 09:41:57,555 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 09:41:57,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:41:57,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:41:57,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 09:42:00,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:03,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 09:42:05,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:42:05,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:06,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:42:06,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:42:08,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:42:08,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=324586.6666666667, ans=0.0 2023-09-29 09:42:09,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 09:42:11,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 09:42:13,688 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.64 vs. limit=6.0 2023-09-29 09:42:14,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:42:14,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:16,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:42:16,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:17,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:42:19,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:42:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 09:42:23,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:42:24,599 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.670e+02 1.918e+02 2.187e+02 2.530e+02 4.309e+02, threshold=4.375e+02, percent-clipped=0.0 2023-09-29 09:42:24,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:26,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:42:32,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=324653.3333333333, ans=0.2 2023-09-29 09:42:33,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 09:42:34,612 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.97 vs. limit=22.5 2023-09-29 09:42:35,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:42:36,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 09:42:36,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:36,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:42:38,099 INFO [train.py:1039] (3/4) Epoch 10, batch 900, loss[loss=0.209, simple_loss=0.2776, pruned_loss=0.0702, over 23799.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2785, pruned_loss=0.0708, over 4647896.27 frames. ], batch size: 195, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:42:41,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 09:42:45,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:42:46,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=324720.0, ans=0.2 2023-09-29 09:42:47,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:42:47,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 09:42:50,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:42:51,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 09:42:52,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 09:42:54,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:42:54,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:42:54,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 09:42:55,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:43:08,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:09,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:43:09,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:43:12,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=324853.3333333333, ans=0.1 2023-09-29 09:43:14,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:18,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 09:43:20,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:43:20,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=324853.3333333333, ans=0.025 2023-09-29 09:43:24,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:43:25,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 09:43:25,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=324920.0, ans=0.1 2023-09-29 09:43:26,961 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 09:43:27,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 09:43:27,359 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:43:35,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 09:43:35,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:43:36,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:43:44,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:44,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:43:45,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 09:43:47,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:43:48,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 09:43:48,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:43:50,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:43:50,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:43:51,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:43:55,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 09:43:56,530 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 09:43:58,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 09:43:59,855 INFO [train.py:1039] (3/4) Epoch 10, batch 950, loss[loss=0.2105, simple_loss=0.2638, pruned_loss=0.07854, over 23746.00 frames. ], tot_loss[loss=0.2119, simple_loss=0.2795, pruned_loss=0.07216, over 4641528.83 frames. ], batch size: 232, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:43:59,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 09:44:03,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:08,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 09:44:11,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=325053.3333333333, ans=0.125 2023-09-29 09:44:12,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:14,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:16,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:44:18,352 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 09:44:21,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:44:21,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:22,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:24,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:44:24,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 09:44:24,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:44:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:28,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 09:44:29,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:32,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:44:32,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:44:34,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 09:44:37,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 09:44:37,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:44:40,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:44:45,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:44:45,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:44:49,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 09:44:51,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:44:51,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 09:44:52,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:44:53,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:44:53,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:44:57,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 09:44:58,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:45:00,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:01,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:01,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 09:45:03,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:03,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:45:03,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 09:45:07,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:45:09,605 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.117e+02 2.443e+02 3.103e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-29 09:45:09,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:45:15,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:17,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 09:45:17,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 09:45:22,952 INFO [train.py:1039] (3/4) Epoch 10, batch 1000, loss[loss=0.1903, simple_loss=0.2626, pruned_loss=0.05899, over 22213.00 frames. ], tot_loss[loss=0.2114, simple_loss=0.2786, pruned_loss=0.07214, over 4651250.86 frames. ], batch size: 49, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:45:23,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:45:25,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.42 vs. limit=12.0 2023-09-29 09:45:27,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 09:45:27,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:45:35,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:45:35,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 09:45:35,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 09:45:35,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=325386.6666666667, ans=0.05 2023-09-29 09:45:37,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=325453.3333333333, ans=0.0 2023-09-29 09:45:39,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:39,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:45:43,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:45:46,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 09:45:50,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 09:45:50,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 09:45:51,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:45:53,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 09:45:54,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 09:45:54,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 09:45:57,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:45:57,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:02,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=325520.0, ans=0.0 2023-09-29 09:46:04,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.23 vs. limit=15.0 2023-09-29 09:46:06,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:06,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:46:08,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:08,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:08,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 09:46:09,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:11,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:46:11,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:46:11,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_ff3.min_abs, batch_count=325586.6666666667, ans=0.2 2023-09-29 09:46:12,855 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 09:46:14,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 09:46:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 09:46:18,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 09:46:18,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=325586.6666666667, ans=0.2 2023-09-29 09:46:21,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:46:27,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=325653.3333333333, ans=0.0 2023-09-29 09:46:28,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:28,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:46:29,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=325653.3333333333, ans=0.125 2023-09-29 09:46:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:30,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:46:33,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 09:46:34,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:46:35,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 09:46:35,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 09:46:36,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:46:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:46:38,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:46:41,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:46:42,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:46:44,475 INFO [train.py:1039] (3/4) Epoch 10, batch 1050, loss[loss=0.2231, simple_loss=0.2846, pruned_loss=0.08082, over 23414.00 frames. ], tot_loss[loss=0.2099, simple_loss=0.2773, pruned_loss=0.07128, over 4666879.42 frames. ], batch size: 106, lr: 1.06e-02, grad_scale: 16.0 2023-09-29 09:46:46,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:46:47,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:46:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:46:51,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:46:52,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:46:54,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 09:46:55,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 09:46:59,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:46:59,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:46:59,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:47:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:47:03,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 09:47:03,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:04,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 09:47:06,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:47:06,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 09:47:06,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:47:06,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.05 vs. limit=22.5 2023-09-29 09:47:12,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=325786.6666666667, ans=0.125 2023-09-29 09:47:15,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:47:15,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:47:15,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:47:18,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=325853.3333333333, ans=0.1 2023-09-29 09:47:20,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 09:47:20,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 09:47:20,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:47:22,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 09:47:24,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=325853.3333333333, ans=0.1 2023-09-29 09:47:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 09:47:25,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=325853.3333333333, ans=0.2 2023-09-29 09:47:26,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:47:31,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 09:47:32,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=325920.0, ans=0.1 2023-09-29 09:47:34,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:47:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:47:35,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:47:39,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.41 vs. limit=12.0 2023-09-29 09:47:40,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:47:43,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 09:47:43,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=325920.0, ans=0.125 2023-09-29 09:47:44,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 09:47:45,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 09:47:45,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:46,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:47:48,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 09:47:51,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:47:52,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 2.027e+02 2.343e+02 2.792e+02 3.800e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-29 09:47:54,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:47:54,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:47:56,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:47:56,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:01,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 09:48:04,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 09:48:04,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 09:48:04,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 09:48:05,647 INFO [train.py:1039] (3/4) Epoch 10, batch 1100, loss[loss=0.213, simple_loss=0.2743, pruned_loss=0.07588, over 23382.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2765, pruned_loss=0.07064, over 4677510.09 frames. ], batch size: 119, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:48:05,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:48:11,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:14,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:48:17,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 09:48:18,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 09:48:19,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:19,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 09:48:20,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:48:22,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 09:48:26,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:48:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:48:29,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 09:48:30,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 09:48:32,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:48:32,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:48:33,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.13 vs. limit=6.0 2023-09-29 09:48:35,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:48:36,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=326120.0, ans=0.125 2023-09-29 09:48:37,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=326186.6666666667, ans=0.1 2023-09-29 09:48:38,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 09:48:44,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:48:47,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 09:48:47,691 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 09:48:47,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:50,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=326186.6666666667, ans=0.125 2023-09-29 09:48:52,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:53,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:48:53,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:48:55,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 09:48:56,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:48:56,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 09:48:56,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:48:56,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:48:58,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 09:49:04,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 09:49:04,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 09:49:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:49:11,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:49:13,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 09:49:13,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 09:49:14,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:18,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:18,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:20,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 09:49:21,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:49:21,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:49:23,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 09:49:23,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:49:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 09:49:25,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:49:25,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:49:25,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=326320.0, ans=0.125 2023-09-29 09:49:26,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:49:28,283 INFO [train.py:1039] (3/4) Epoch 10, batch 1150, loss[loss=0.2262, simple_loss=0.3042, pruned_loss=0.07411, over 24380.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2779, pruned_loss=0.07114, over 4672457.97 frames. ], batch size: 77, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:49:31,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:34,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:49:36,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:49:36,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:49:38,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 09:49:38,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:49:41,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 09:49:43,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:43,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:49:49,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 09:49:51,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:49:54,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=326453.3333333333, ans=0.125 2023-09-29 09:49:55,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:49:55,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.71 vs. limit=15.0 2023-09-29 09:49:56,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:49:58,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 09:49:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:49:58,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:50:04,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 09:50:05,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:07,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:50:17,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:21,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=326586.6666666667, ans=0.125 2023-09-29 09:50:23,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:50:23,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 09:50:24,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=326586.6666666667, ans=0.125 2023-09-29 09:50:25,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:25,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:30,546 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 09:50:30,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:50:37,352 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.028e+02 2.366e+02 3.044e+02 5.235e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 09:50:39,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 09:50:42,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.46 vs. limit=10.0 2023-09-29 09:50:44,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:45,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:50:45,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 09:50:45,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:50:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:50:50,928 INFO [train.py:1039] (3/4) Epoch 10, batch 1200, loss[loss=0.1962, simple_loss=0.2642, pruned_loss=0.06412, over 24316.00 frames. ], tot_loss[loss=0.2104, simple_loss=0.2782, pruned_loss=0.07125, over 4673838.55 frames. ], batch size: 56, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:50:52,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=326720.0, ans=0.0 2023-09-29 09:50:54,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 09:50:54,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:50:57,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:50:57,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:50:57,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:51:00,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:51:02,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 09:51:04,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:04,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:07,437 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 09:51:10,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 09:51:15,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:51:17,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:51:19,087 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.677e-02 2023-09-29 09:51:20,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:20,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:51:20,423 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 09:51:23,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:28,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=326853.3333333333, ans=0.1 2023-09-29 09:51:31,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 09:51:31,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:51:31,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 09:51:33,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:51:34,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 09:51:40,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 09:51:40,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:51:42,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:51:43,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:44,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 09:51:45,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:51:45,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:51:46,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:51:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 09:51:47,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:51:49,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:51:49,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 09:51:50,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:51:50,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:51:55,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:51:57,768 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:51:58,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 09:52:02,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 09:52:05,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 09:52:08,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:11,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:52:13,020 INFO [train.py:1039] (3/4) Epoch 10, batch 1250, loss[loss=0.2189, simple_loss=0.2856, pruned_loss=0.07613, over 23484.00 frames. ], tot_loss[loss=0.2101, simple_loss=0.2785, pruned_loss=0.07087, over 4684059.94 frames. ], batch size: 93, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:52:13,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:52:15,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:52:16,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 09:52:21,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:52:22,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:23,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 09:52:26,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:52:26,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:52:27,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=327053.3333333333, ans=0.125 2023-09-29 09:52:29,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=327120.0, ans=0.2 2023-09-29 09:52:30,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=327120.0, ans=0.2 2023-09-29 09:52:31,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 09:52:33,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:33,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:52:33,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:36,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 09:52:36,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=327120.0, ans=0.125 2023-09-29 09:52:41,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 09:52:41,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:52:41,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:52:42,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:52:43,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:47,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:52:49,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:52:52,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 09:52:54,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 09:52:56,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=327186.6666666667, ans=0.0 2023-09-29 09:52:56,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=327186.6666666667, ans=0.0 2023-09-29 09:52:57,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:52:57,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 09:52:58,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:52:59,557 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 09:52:59,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:52:59,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:03,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.02 vs. limit=6.0 2023-09-29 09:53:04,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:06,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:53:07,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:53:09,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 09:53:09,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 09:53:09,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 09:53:11,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:12,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 09:53:12,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:16,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:53:16,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:53:17,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 09:53:18,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 09:53:19,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 09:53:19,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 09:53:20,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:53:21,761 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 2.035e+02 2.245e+02 2.590e+02 3.760e+02, threshold=4.489e+02, percent-clipped=0.0 2023-09-29 09:53:21,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 09:53:26,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:28,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:53:30,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 09:53:30,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=327320.0, ans=0.125 2023-09-29 09:53:35,034 INFO [train.py:1039] (3/4) Epoch 10, batch 1300, loss[loss=0.2168, simple_loss=0.2878, pruned_loss=0.07294, over 24058.00 frames. ], tot_loss[loss=0.2095, simple_loss=0.2783, pruned_loss=0.0703, over 4692068.70 frames. ], batch size: 80, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:53:35,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 09:53:37,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:53:38,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 09:53:43,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:53:45,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 09:53:47,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:53:49,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:53:49,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 09:53:51,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 09:53:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:53:56,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.14 vs. limit=22.5 2023-09-29 09:53:56,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:53:57,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 09:54:01,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:54:04,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:04,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:06,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:54:08,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:08,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 09:54:10,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 09:54:11,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 09:54:17,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:54:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 09:54:19,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 09:54:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 09:54:24,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:54:26,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:54:26,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 09:54:27,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:27,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 09:54:29,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:54:33,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:54:33,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:54:36,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 09:54:38,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 09:54:38,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=327653.3333333333, ans=0.125 2023-09-29 09:54:41,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 09:54:47,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:54:49,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 09:54:51,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:54:55,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=327720.0, ans=0.1 2023-09-29 09:54:56,846 INFO [train.py:1039] (3/4) Epoch 10, batch 1350, loss[loss=0.2004, simple_loss=0.2638, pruned_loss=0.06853, over 23787.00 frames. ], tot_loss[loss=0.2087, simple_loss=0.2771, pruned_loss=0.07015, over 4698158.24 frames. ], batch size: 164, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 09:54:58,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 09:55:03,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:05,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:05,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=327720.0, ans=0.125 2023-09-29 09:55:07,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:55:07,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:55:10,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:55:11,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:15,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 09:55:19,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 09:55:19,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:55:20,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 09:55:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 09:55:24,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:55:25,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:55:25,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 09:55:27,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 09:55:28,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 09:55:29,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=327853.3333333333, ans=0.125 2023-09-29 09:55:31,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:32,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 09:55:43,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:53,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:55:55,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:55,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 09:55:59,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:55:59,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.89 vs. limit=12.0 2023-09-29 09:56:00,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 09:56:00,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 09:56:00,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:56:03,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:56:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 09:56:06,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 09:56:09,399 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.130e+02 2.395e+02 2.953e+02 6.223e+02, threshold=4.790e+02, percent-clipped=2.0 2023-09-29 09:56:12,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 09:56:14,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 09:56:20,586 INFO [train.py:1039] (3/4) Epoch 10, batch 1400, loss[loss=0.1808, simple_loss=0.2516, pruned_loss=0.05497, over 24281.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2759, pruned_loss=0.06985, over 4694448.47 frames. ], batch size: 56, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:56:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 09:56:23,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:56:25,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:56:27,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:56:34,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 09:56:35,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 09:56:36,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=328120.0, ans=0.2 2023-09-29 09:56:36,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=328120.0, ans=0.0 2023-09-29 09:56:45,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 09:56:47,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:56:48,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:56:50,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 09:56:52,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=328186.6666666667, ans=0.1 2023-09-29 09:56:55,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 09:56:56,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 09:57:03,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=328186.6666666667, ans=0.2 2023-09-29 09:57:07,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:08,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:11,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 09:57:11,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:57:11,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 09:57:14,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 09:57:14,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:15,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 09:57:15,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:57:15,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:57:17,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 09:57:17,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:57:21,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:25,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:57:31,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 09:57:33,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 09:57:35,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 09:57:38,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 09:57:38,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:41,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:57:43,374 INFO [train.py:1039] (3/4) Epoch 10, batch 1450, loss[loss=0.203, simple_loss=0.2768, pruned_loss=0.06459, over 24493.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2754, pruned_loss=0.06899, over 4716081.55 frames. ], batch size: 63, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:57:45,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 09:57:47,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:57:47,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:57:47,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 09:57:47,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=328386.6666666667, ans=0.125 2023-09-29 09:57:50,824 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:57:52,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:57:53,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 09:57:53,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:57:55,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 09:57:56,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 09:57:57,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 09:57:58,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:00,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:00,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 09:58:01,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:01,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 09:58:01,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 09:58:01,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:03,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 09:58:06,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:10,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:11,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 09:58:11,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 09:58:14,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:58:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 09:58:19,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 09:58:19,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:58:19,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:22,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 09:58:24,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 09:58:29,217 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 09:58:30,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:32,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 09:58:33,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:35,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 09:58:38,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=328586.6666666667, ans=0.125 2023-09-29 09:58:38,747 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 09:58:41,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:42,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 09:58:43,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 09:58:45,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:58:49,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:58:49,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:58:50,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=328653.3333333333, ans=0.125 2023-09-29 09:58:53,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 09:58:54,752 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.987e+02 2.423e+02 2.995e+02 4.591e+02, threshold=4.846e+02, percent-clipped=0.0 2023-09-29 09:58:54,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 09:58:54,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 09:58:57,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:58:57,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 09:59:06,769 INFO [train.py:1039] (3/4) Epoch 10, batch 1500, loss[loss=0.2255, simple_loss=0.2834, pruned_loss=0.08382, over 23561.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2756, pruned_loss=0.06873, over 4719184.93 frames. ], batch size: 256, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 09:59:10,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 09:59:11,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 09:59:11,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 09:59:11,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:12,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:14,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 09:59:14,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 09:59:16,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 09:59:16,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 09:59:16,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 09:59:17,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 09:59:19,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 09:59:21,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 09:59:26,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 09:59:27,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 09:59:29,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 09:59:29,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:33,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 09:59:35,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.27 vs. limit=15.0 2023-09-29 09:59:36,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=328786.6666666667, ans=0.0 2023-09-29 09:59:39,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 09:59:39,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 09:59:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 09:59:42,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 09:59:44,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 09:59:47,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 09:59:47,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 09:59:48,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 09:59:50,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 09:59:50,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 09:59:50,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 09:59:50,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=328853.3333333333, ans=0.125 2023-09-29 09:59:57,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 09:59:57,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 10:00:03,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:00:05,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:00:07,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=328920.0, ans=0.125 2023-09-29 10:00:10,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=328986.6666666667, ans=0.125 2023-09-29 10:00:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 10:00:14,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:14,072 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 10:00:15,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:17,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:17,282 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 10:00:18,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:00:22,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 10:00:23,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:27,784 INFO [train.py:1039] (3/4) Epoch 10, batch 1550, loss[loss=0.1901, simple_loss=0.2741, pruned_loss=0.05303, over 24483.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2759, pruned_loss=0.06851, over 4729336.17 frames. ], batch size: 69, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:00:27,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:27,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:28,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:00:28,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:00:29,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:00:31,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 10:00:31,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 10:00:32,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:00:32,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 10:00:33,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 10:00:35,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:36,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:36,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:00:36,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:00:38,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:38,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=329053.3333333333, ans=0.5 2023-09-29 10:00:40,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:00:44,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 10:00:44,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:44,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:00:44,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:00:47,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:00:47,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 10:00:49,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:00:49,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 10:00:51,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 10:00:51,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 10:00:51,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:00:52,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:00:55,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:00:58,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 10:00:58,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 10:01:00,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=329186.6666666667, ans=0.0 2023-09-29 10:01:07,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=329186.6666666667, ans=0.125 2023-09-29 10:01:08,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:10,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=329186.6666666667, ans=0.125 2023-09-29 10:01:12,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:01:12,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:01:13,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:01:15,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 10:01:18,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=329253.3333333333, ans=0.1 2023-09-29 10:01:20,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:01:22,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:25,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:01:28,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:01:28,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:01:28,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 10:01:29,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:01:32,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:34,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:01:34,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 10:01:37,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:01:39,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.945e+02 2.181e+02 2.494e+02 3.678e+02, threshold=4.362e+02, percent-clipped=0.0 2023-09-29 10:01:43,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=329320.0, ans=0.0 2023-09-29 10:01:44,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 10:01:49,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:50,487 INFO [train.py:1039] (3/4) Epoch 10, batch 1600, loss[loss=0.2264, simple_loss=0.2976, pruned_loss=0.07758, over 23407.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.277, pruned_loss=0.06878, over 4736370.80 frames. ], batch size: 105, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:01:50,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:01:52,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 10:01:52,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=329386.6666666667, ans=0.125 2023-09-29 10:01:54,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:01:56,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:01:56,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:01:56,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:01:57,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:01:58,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=329386.6666666667, ans=0.125 2023-09-29 10:02:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:03,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 10:02:03,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 10:02:06,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 10:02:09,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:09,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 10:02:10,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:02:12,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:02:18,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=329453.3333333333, ans=0.1 2023-09-29 10:02:19,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:02:21,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=329453.3333333333, ans=0.0 2023-09-29 10:02:22,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 10:02:25,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:02:25,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 10:02:25,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:27,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 10:02:34,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 10:02:38,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.81 vs. limit=15.0 2023-09-29 10:02:42,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:42,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 10:02:43,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:02:44,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:02:44,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:02:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:02:47,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=329586.6666666667, ans=0.2 2023-09-29 10:02:47,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=329586.6666666667, ans=0.125 2023-09-29 10:02:50,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:02:53,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:02:53,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:53,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:02:55,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:02:56,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.29 vs. limit=15.0 2023-09-29 10:02:57,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:02:57,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:02:58,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:03:04,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:06,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:03:07,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 10:03:07,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:03:09,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 10:03:13,620 INFO [train.py:1039] (3/4) Epoch 10, batch 1650, loss[loss=0.2108, simple_loss=0.2655, pruned_loss=0.07803, over 23745.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2772, pruned_loss=0.06896, over 4731527.38 frames. ], batch size: 232, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:03:13,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:15,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:03:15,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:03:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 10:03:15,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 10:03:17,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 10:03:17,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 10:03:20,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:03:20,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=329720.0, ans=0.2 2023-09-29 10:03:22,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:22,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:03:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:03:24,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:03:25,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 10:03:30,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:03:30,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:03:30,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:03:30,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:03:31,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 10:03:31,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 10:03:40,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:03:43,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:03:43,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=329786.6666666667, ans=0.125 2023-09-29 10:03:49,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 10:03:51,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:03:54,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 10:03:55,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:00,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:04:00,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:04:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:02,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:04:03,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:06,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:08,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:08,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:09,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:09,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=329920.0, ans=0.125 2023-09-29 10:04:10,042 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.44 vs. limit=22.5 2023-09-29 10:04:11,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:11,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:04:13,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=329920.0, ans=0.125 2023-09-29 10:04:15,686 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.65 vs. limit=22.5 2023-09-29 10:04:16,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:04:17,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 10:04:20,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:04:20,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 10:04:21,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 10:04:21,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 10:04:22,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:04:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:04:22,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:23,879 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.669e+02 2.033e+02 2.438e+02 2.787e+02 4.126e+02, threshold=4.877e+02, percent-clipped=0.0 2023-09-29 10:04:24,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:04:24,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 10:04:27,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=329986.6666666667, ans=0.0 2023-09-29 10:04:28,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:04:30,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:04:30,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 10:04:35,219 INFO [train.py:1039] (3/4) Epoch 10, batch 1700, loss[loss=0.2042, simple_loss=0.2824, pruned_loss=0.063, over 24061.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2768, pruned_loss=0.06869, over 4737650.00 frames. ], batch size: 80, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:04:35,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=330053.3333333333, ans=0.0 2023-09-29 10:04:37,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:04:37,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:04:37,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 10:04:38,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:04:38,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:04:38,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:41,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:04:41,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:04:43,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 10:04:46,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:04:55,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:04:56,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:05:02,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:05:02,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:03,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:05:03,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:08,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 10:05:09,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:05:09,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:11,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:05:12,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:05:13,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.12 vs. limit=15.0 2023-09-29 10:05:16,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 10:05:16,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 10:05:18,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:18,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=330186.6666666667, ans=0.125 2023-09-29 10:05:19,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.81 vs. limit=15.0 2023-09-29 10:05:19,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 10:05:22,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:05:22,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=330186.6666666667, ans=0.1 2023-09-29 10:05:31,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:33,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:05:35,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:05:35,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 10:05:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:05:37,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:37,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 10:05:38,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:05:38,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:39,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:05:39,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:05:41,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:05:41,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:05:41,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:05:43,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:05:44,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:44,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=330320.0, ans=0.125 2023-09-29 10:05:48,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:50,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 10:05:51,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:05:53,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:05:54,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 10:05:57,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=330386.6666666667, ans=0.1 2023-09-29 10:05:58,419 INFO [train.py:1039] (3/4) Epoch 10, batch 1750, loss[loss=0.2141, simple_loss=0.2672, pruned_loss=0.08056, over 22758.00 frames. ], tot_loss[loss=0.2054, simple_loss=0.2745, pruned_loss=0.06811, over 4733678.39 frames. ], batch size: 322, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:06:01,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:04,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:04,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:06:06,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 10:06:06,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:06:09,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:06:09,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:14,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 10:06:16,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:06:17,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 10:06:17,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:18,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=330453.3333333333, ans=0.125 2023-09-29 10:06:19,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:06:19,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=330453.3333333333, ans=0.2 2023-09-29 10:06:23,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:06:26,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 10:06:28,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:06:28,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 10:06:35,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:06:38,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:06:38,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:41,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:41,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:06:44,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:06:47,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:06:49,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:50,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:06:53,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 10:06:54,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:06:57,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 10:06:59,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:00,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.07 vs. limit=6.0 2023-09-29 10:07:01,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:01,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:07:05,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:07:05,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 10:07:07,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:09,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=330653.3333333333, ans=0.125 2023-09-29 10:07:10,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.729e+02 2.086e+02 2.387e+02 2.992e+02 5.082e+02, threshold=4.774e+02, percent-clipped=1.0 2023-09-29 10:07:10,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:07:13,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:07:16,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:18,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:07:20,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 10:07:20,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:21,621 INFO [train.py:1039] (3/4) Epoch 10, batch 1800, loss[loss=0.2276, simple_loss=0.2882, pruned_loss=0.08343, over 23889.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2747, pruned_loss=0.06825, over 4730921.56 frames. ], batch size: 195, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:07:21,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:07:21,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:21,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:07:21,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:07:21,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:07:24,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:07:26,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:07:28,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:07:31,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:07:35,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:07:36,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:07:38,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=330786.6666666667, ans=0.1 2023-09-29 10:07:40,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:07:43,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:43,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:07:44,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:07:48,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:07:48,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 10:07:49,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:51,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=8.19 vs. limit=15.0 2023-09-29 10:07:52,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:07:56,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 10:07:59,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 10:08:00,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 10:08:01,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:01,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:08:01,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:02,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:08:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 10:08:11,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:08:11,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=330920.0, ans=0.0 2023-09-29 10:08:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:14,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 10:08:16,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 10:08:16,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:08:18,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:08:19,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:08:23,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 10:08:24,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=330920.0, ans=0.125 2023-09-29 10:08:31,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:08:31,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 10:08:32,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:08:32,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:32,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:08:34,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 10:08:37,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:08:37,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:08:40,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 10:08:40,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:08:42,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:42,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:08:42,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:44,181 INFO [train.py:1039] (3/4) Epoch 10, batch 1850, loss[loss=0.2315, simple_loss=0.2897, pruned_loss=0.08666, over 23391.00 frames. ], tot_loss[loss=0.2065, simple_loss=0.2755, pruned_loss=0.06878, over 4722708.85 frames. ], batch size: 285, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:08:44,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:08:45,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:08:47,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:08:48,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:08:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:08:52,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:08:58,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:09:00,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 10:09:02,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.06 vs. limit=15.0 2023-09-29 10:09:05,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 10:09:08,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 10:09:11,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:11,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 10:09:11,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 10:09:11,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=331120.0, ans=0.125 2023-09-29 10:09:21,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:09:22,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 10:09:26,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:09:26,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:09:31,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 10:09:31,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:31,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:09:34,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:09:34,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:09:37,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:09:38,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=8.06 vs. limit=15.0 2023-09-29 10:09:42,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:09:43,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:09:43,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:09:44,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:09:45,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:47,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:09:50,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 10:09:52,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:09:55,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.969e+02 2.199e+02 2.550e+02 3.875e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 10:09:57,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:09:59,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:09:59,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 10:09:59,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 10:10:01,256 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 10:10:02,713 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 10:10:04,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:10:04,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:10:04,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:05,657 INFO [train.py:1039] (3/4) Epoch 10, batch 1900, loss[loss=0.2868, simple_loss=0.3339, pruned_loss=0.1198, over 19427.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2765, pruned_loss=0.06911, over 4718870.71 frames. ], batch size: 388, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:10:05,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:05,854 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 10:10:05,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:10:07,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:10:07,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:10:09,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:10:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 10:10:12,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:12,809 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 10:10:12,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:10:12,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:17,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:10:21,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:10:21,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 10:10:23,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 10:10:23,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:10:24,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:10:24,908 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 10:10:24,976 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 10:10:28,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 10:10:31,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=331453.3333333333, ans=0.2 2023-09-29 10:10:32,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:10:35,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 10:10:35,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 10:10:41,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:42,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=331520.0, ans=0.125 2023-09-29 10:10:46,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 10:10:47,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=331520.0, ans=0.2 2023-09-29 10:10:49,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 10:10:49,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:10:51,250 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 10:10:51,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 10:10:52,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 10:10:52,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 10:10:52,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:10:57,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 10:11:00,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:11:04,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:04,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 10:11:06,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:11:11,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 10:11:11,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:17,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:11:17,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:11:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:11:18,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:11:19,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:11:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:11:19,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=331653.3333333333, ans=0.125 2023-09-29 10:11:21,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:11:21,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=331653.3333333333, ans=0.95 2023-09-29 10:11:22,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:22,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:11:27,156 INFO [train.py:1039] (3/4) Epoch 10, batch 1950, loss[loss=0.2928, simple_loss=0.3325, pruned_loss=0.1266, over 19091.00 frames. ], tot_loss[loss=0.209, simple_loss=0.2775, pruned_loss=0.07027, over 4708471.20 frames. ], batch size: 388, lr: 1.05e-02, grad_scale: 16.0 2023-09-29 10:11:27,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:11:27,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:11:27,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:11:28,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:11:31,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:35,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:11:35,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=331720.0, ans=0.125 2023-09-29 10:11:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:36,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:11:40,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 10:11:41,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:11:41,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:43,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:11:45,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:11:46,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:11:46,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:49,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:11:53,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:11:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:11:53,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:11:53,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:11:58,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:01,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:12:01,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:01,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:12:01,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 10:12:01,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:12:01,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:12:02,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:06,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:09,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:12:14,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:12:18,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:12:18,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:12:18,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 10:12:19,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:23,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=331920.0, ans=0.0 2023-09-29 10:12:24,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:12:25,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:12:27,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:34,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:35,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:37,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:12:38,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.745e+02 2.098e+02 2.334e+02 2.724e+02 3.808e+02, threshold=4.669e+02, percent-clipped=0.0 2023-09-29 10:12:40,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:42,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:12:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:12:45,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 10:12:45,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:12:45,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:12:47,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 10:12:48,794 INFO [train.py:1039] (3/4) Epoch 10, batch 2000, loss[loss=0.1725, simple_loss=0.2499, pruned_loss=0.04752, over 24579.00 frames. ], tot_loss[loss=0.2084, simple_loss=0.2774, pruned_loss=0.06968, over 4723851.73 frames. ], batch size: 60, lr: 1.05e-02, grad_scale: 32.0 2023-09-29 10:12:48,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:12:53,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:12:55,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:12:55,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:12:58,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:13:00,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:03,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 10:13:05,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:13:06,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:13:08,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 10:13:10,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:13:10,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:13:12,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=332120.0, ans=0.0 2023-09-29 10:13:13,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:13:14,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 10:13:14,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:16,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:18,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 10:13:18,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:13:20,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 10:13:21,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:25,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=332186.6666666667, ans=0.2 2023-09-29 10:13:26,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:13:27,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:13:27,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:27,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:28,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:29,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 10:13:33,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 10:13:34,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:13:34,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:39,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:39,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:13:39,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:13:41,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=332253.3333333333, ans=0.0 2023-09-29 10:13:42,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:13:42,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:44,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:13:44,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:13:45,031 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.65 vs. limit=10.0 2023-09-29 10:13:45,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:13:48,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:13:50,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 10:13:54,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:13:55,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:13:58,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:13:58,776 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-09-29 10:14:03,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=332320.0, ans=0.2 2023-09-29 10:14:04,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:07,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:07,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:09,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:14:09,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:14:10,536 INFO [train.py:1039] (3/4) Epoch 10, batch 2050, loss[loss=0.1804, simple_loss=0.2561, pruned_loss=0.05238, over 24435.00 frames. ], tot_loss[loss=0.2088, simple_loss=0.2773, pruned_loss=0.0702, over 4700755.27 frames. ], batch size: 58, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:14:12,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:12,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:15,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:14:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:21,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:14:24,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:14:25,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:14:25,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:14:25,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=332453.3333333333, ans=0.1 2023-09-29 10:14:28,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 10:14:28,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:14:31,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:14:31,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:14:40,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:40,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:42,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 10:14:45,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:14:47,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 10:14:47,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:14:50,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:51,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:14:53,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:14:53,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:14:55,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:14:57,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:14:57,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:15:00,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:02,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:15:04,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:15:06,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:10,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:16,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:15:16,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 10:15:22,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:22,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:15:24,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:15:26,038 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 1.999e+02 2.333e+02 2.741e+02 4.462e+02, threshold=4.667e+02, percent-clipped=0.0 2023-09-29 10:15:27,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 10:15:30,823 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 10:15:30,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:30,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:15:32,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:34,231 INFO [train.py:1039] (3/4) Epoch 10, batch 2100, loss[loss=0.2196, simple_loss=0.29, pruned_loss=0.07457, over 23692.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2759, pruned_loss=0.06956, over 4703859.46 frames. ], batch size: 85, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:15:34,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:15:34,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 10:15:34,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 10:15:36,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.40 vs. limit=15.0 2023-09-29 10:15:37,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:15:39,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:15:41,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:15:41,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:15:42,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:15:42,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 10:15:44,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:15:45,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 10:15:45,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 10:15:49,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:15:49,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:15:49,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 10:15:49,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 10:15:49,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=332786.6666666667, ans=0.125 2023-09-29 10:15:52,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=332786.6666666667, ans=0.09899494936611666 2023-09-29 10:15:55,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 10:15:55,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:15:58,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:15:59,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:16:03,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:16:03,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 10:16:05,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 10:16:06,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 10:16:06,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:06,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=332853.3333333333, ans=0.125 2023-09-29 10:16:08,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 10:16:09,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 10:16:09,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 10:16:12,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:16:14,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:16:17,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:17,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=332853.3333333333, ans=0.125 2023-09-29 10:16:20,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:16:22,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 10:16:24,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:24,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:24,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:24,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 10:16:27,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 10:16:28,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 10:16:33,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:16:34,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:16:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 10:16:41,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:45,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:16:45,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:16:45,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:16:45,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 10:16:46,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:16:47,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:16:47,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:16:47,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=332986.6666666667, ans=0.0 2023-09-29 10:16:48,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:16:48,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:16:50,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 10:16:51,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 10:16:51,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:16:54,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:16:54,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:16:54,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:16:56,150 INFO [train.py:1039] (3/4) Epoch 10, batch 2150, loss[loss=0.2122, simple_loss=0.2718, pruned_loss=0.07628, over 22915.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2751, pruned_loss=0.06931, over 4710568.49 frames. ], batch size: 322, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:16:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:17:03,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 10:17:04,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:04,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=333053.3333333333, ans=0.0 2023-09-29 10:17:06,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:07,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:17:07,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:07,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:17:09,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:11,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:17:11,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:17:14,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:14,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 10:17:21,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:21,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=333120.0, ans=0.2 2023-09-29 10:17:22,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:17:24,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:24,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:24,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:25,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:17:25,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:25,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:17:27,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:17:27,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=333186.6666666667, ans=0.2 2023-09-29 10:17:28,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 10:17:30,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:17:32,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:32,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:34,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:17:35,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:17:35,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=333186.6666666667, ans=0.0 2023-09-29 10:17:37,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:17:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:17:40,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:17:40,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 10:17:40,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:17:43,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:44,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:47,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:17:48,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:17:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:50,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:17:50,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 10:17:52,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 10:17:53,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:17:53,824 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 10:17:53,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:54,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:17:55,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 10:17:55,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:17:55,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 10:17:56,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 10:17:56,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 10:17:57,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 10:17:58,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:17:59,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:17:59,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:18:00,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=333320.0, ans=0.1 2023-09-29 10:18:01,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:04,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:18:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:04,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:10,135 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.840e+02 1.996e+02 2.223e+02 3.215e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-29 10:18:10,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=333320.0, ans=0.125 2023-09-29 10:18:13,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:18:13,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 10:18:18,075 INFO [train.py:1039] (3/4) Epoch 10, batch 2200, loss[loss=0.2518, simple_loss=0.2955, pruned_loss=0.1041, over 19348.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2749, pruned_loss=0.06893, over 4712605.95 frames. ], batch size: 388, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:18:18,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:18:18,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.64 vs. limit=15.0 2023-09-29 10:18:23,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:25,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:18:25,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:18:27,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:18:27,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.95 vs. limit=15.0 2023-09-29 10:18:30,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:18:31,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:18:31,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 10:18:31,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=333386.6666666667, ans=0.125 2023-09-29 10:18:37,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 10:18:40,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:18:45,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 10:18:48,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:49,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:18:50,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:18:52,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:18:52,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 10:18:52,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=333520.0, ans=0.0 2023-09-29 10:18:57,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:18:59,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:18:59,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 10:19:04,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:19:05,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:07,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:19:08,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:11,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 10:19:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:12,103 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:19:13,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 10:19:15,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:15,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:19:16,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:19:18,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:19:20,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:19:20,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:20,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:19:21,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:19:21,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:19:23,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:19:26,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 10:19:26,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:19:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:19:30,259 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 10:19:34,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:19:34,513 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 10:19:36,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:19:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 10:19:36,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=333653.3333333333, ans=0.125 2023-09-29 10:19:37,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:37,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=333653.3333333333, ans=0.2 2023-09-29 10:19:39,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:19:40,609 INFO [train.py:1039] (3/4) Epoch 10, batch 2250, loss[loss=0.2162, simple_loss=0.2942, pruned_loss=0.06912, over 24568.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2749, pruned_loss=0.06835, over 4720897.62 frames. ], batch size: 71, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:19:40,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:19:41,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=333720.0, ans=0.125 2023-09-29 10:19:42,267 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 10:19:43,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:19:45,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:19:52,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:19:53,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:19:58,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:19:58,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:19:59,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:20:01,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 10:20:01,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:03,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:20:07,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 10:20:07,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:20:07,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:09,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:20:12,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:14,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:20:14,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:20:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 10:20:18,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:20:19,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=333853.3333333333, ans=0.2 2023-09-29 10:20:20,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:20:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:27,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:20:28,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:20:28,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:20:31,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:20:32,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.89 vs. limit=15.0 2023-09-29 10:20:34,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:20:38,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:20:39,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=333920.0, ans=0.125 2023-09-29 10:20:40,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:20:45,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:20:45,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:20:46,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:20:51,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:20:54,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.009e+02 2.238e+02 2.511e+02 3.719e+02, threshold=4.476e+02, percent-clipped=0.0 2023-09-29 10:20:54,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:20:54,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 10:20:54,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:20:55,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:20:57,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=333986.6666666667, ans=0.2 2023-09-29 10:20:57,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=333986.6666666667, ans=0.05 2023-09-29 10:20:59,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 10:21:02,432 INFO [train.py:1039] (3/4) Epoch 10, batch 2300, loss[loss=0.1765, simple_loss=0.248, pruned_loss=0.05246, over 24431.00 frames. ], tot_loss[loss=0.2075, simple_loss=0.2763, pruned_loss=0.06929, over 4719050.46 frames. ], batch size: 58, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:21:02,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:21:02,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:04,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=334053.3333333333, ans=0.125 2023-09-29 10:21:08,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:08,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:21:11,718 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.64 vs. limit=22.5 2023-09-29 10:21:12,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 10:21:16,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:21:25,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:21:25,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:21:26,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:26,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 10:21:27,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=334120.0, ans=0.0 2023-09-29 10:21:28,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:21:31,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:21:34,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:21:37,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:21:39,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:21:46,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:21:46,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:21:50,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:21:53,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:21:54,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:21:56,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:21:56,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:21:56,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 10:22:00,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:22:00,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:01,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:02,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:22:02,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:04,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:22:04,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:22:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 10:22:05,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:22:05,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:05,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 10:22:06,657 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.49 vs. limit=12.0 2023-09-29 10:22:12,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:22:15,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:22:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:22:22,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:22:22,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:22:24,212 INFO [train.py:1039] (3/4) Epoch 10, batch 2350, loss[loss=0.1949, simple_loss=0.2712, pruned_loss=0.05935, over 24303.00 frames. ], tot_loss[loss=0.2078, simple_loss=0.2776, pruned_loss=0.06901, over 4724508.86 frames. ], batch size: 61, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:22:24,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:22:24,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:22:24,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:22:24,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=334386.6666666667, ans=0.025 2023-09-29 10:22:25,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 10:22:32,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:22:32,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 10:22:32,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=334386.6666666667, ans=0.125 2023-09-29 10:22:38,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 10:22:40,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=334453.3333333333, ans=0.1 2023-09-29 10:22:41,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:22:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:22:46,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:22:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:22:48,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 10:22:52,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:22:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 10:23:00,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:23:00,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=334520.0, ans=0.125 2023-09-29 10:23:01,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:23:01,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:23:04,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:23:06,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 10:23:06,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:23:08,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:23:08,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:08,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:23:12,334 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.01 vs. limit=15.0 2023-09-29 10:23:13,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:23:15,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 10:23:16,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:23:17,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:23:17,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:23:20,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 10:23:22,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:23:25,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 10:23:25,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:23:30,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 10:23:35,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 10:23:36,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:23:36,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 10:23:36,812 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 10:23:36,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 10:23:38,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 10:23:39,747 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.645e+02 2.129e+02 2.355e+02 2.700e+02 4.237e+02, threshold=4.711e+02, percent-clipped=0.0 2023-09-29 10:23:41,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:23:44,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:23:46,753 INFO [train.py:1039] (3/4) Epoch 10, batch 2400, loss[loss=0.2173, simple_loss=0.2786, pruned_loss=0.07798, over 23737.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2771, pruned_loss=0.06974, over 4709562.72 frames. ], batch size: 135, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:23:48,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=334720.0, ans=0.125 2023-09-29 10:23:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:23:50,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:23:50,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 10:23:52,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 10:23:59,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:23:59,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:01,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 10:24:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:24:04,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:04,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 10:24:06,026 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:24:08,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:09,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=334786.6666666667, ans=0.1 2023-09-29 10:24:11,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 10:24:12,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.75 vs. limit=22.5 2023-09-29 10:24:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:24:23,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 10:24:26,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:24:28,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:24:33,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:33,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 10:24:33,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:24:40,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=334920.0, ans=0.125 2023-09-29 10:24:41,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:24:43,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=334920.0, ans=0.125 2023-09-29 10:24:45,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=334920.0, ans=0.2 2023-09-29 10:24:46,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:24:47,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:24:47,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 10:24:47,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:24:49,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:49,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:24:49,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:24:52,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:24:52,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:24:54,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 10:24:56,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 10:24:57,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:24:57,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:24:57,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 10:24:57,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 10:24:57,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 10:24:57,899 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 10:25:00,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 10:25:00,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:25:00,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=334986.6666666667, ans=0.0 2023-09-29 10:25:04,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:04,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:06,441 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 10:25:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:07,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:25:08,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.81 vs. limit=6.0 2023-09-29 10:25:08,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.09 vs. limit=15.0 2023-09-29 10:25:09,559 INFO [train.py:1039] (3/4) Epoch 10, batch 2450, loss[loss=0.1905, simple_loss=0.2623, pruned_loss=0.05939, over 20609.00 frames. ], tot_loss[loss=0.2071, simple_loss=0.2748, pruned_loss=0.06966, over 4690561.56 frames. ], batch size: 45, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:25:11,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:25:11,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:25:15,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=335053.3333333333, ans=10.0 2023-09-29 10:25:15,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:15,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:17,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 10:25:23,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:25:23,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:27,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:25:27,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:25:27,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:25:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 10:25:31,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=335120.0, ans=0.2 2023-09-29 10:25:32,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:33,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:25:34,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:25:39,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:25:39,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:25:39,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:25:42,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 10:25:43,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:25:50,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=335186.6666666667, ans=0.1 2023-09-29 10:25:51,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:53,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:25:53,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:25:53,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:25:53,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:25:54,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:25:56,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 10:26:01,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:26:01,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:26:05,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:05,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:10,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:26:11,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 10:26:11,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:26:11,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-09-29 10:26:12,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:12,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 10:26:12,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:26:15,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:26:20,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:26:21,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:26:21,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:26:24,596 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.022e+02 2.367e+02 2.913e+02 5.353e+02, threshold=4.733e+02, percent-clipped=2.0 2023-09-29 10:26:26,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 10:26:27,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:26:30,834 INFO [train.py:1039] (3/4) Epoch 10, batch 2500, loss[loss=0.2076, simple_loss=0.2839, pruned_loss=0.06569, over 23926.00 frames. ], tot_loss[loss=0.2064, simple_loss=0.2747, pruned_loss=0.06908, over 4701653.62 frames. ], batch size: 80, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:26:31,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=335386.6666666667, ans=0.125 2023-09-29 10:26:32,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff3.min_abs, batch_count=335386.6666666667, ans=0.2 2023-09-29 10:26:34,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:43,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:26:43,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:26:44,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.34 vs. limit=10.0 2023-09-29 10:26:45,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:26:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 10:26:52,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:26:53,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:26:54,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:26:54,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:26:54,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 10:26:56,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:57,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:26:59,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 10:26:59,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:26:59,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 10:27:00,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:06,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:27:07,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:27:10,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=335520.0, ans=0.0 2023-09-29 10:27:11,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:27:11,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 10:27:13,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:15,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:19,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:25,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:27:28,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:33,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:27:34,794 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.98 vs. limit=15.0 2023-09-29 10:27:36,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 10:27:38,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:27:38,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:27:38,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=335653.3333333333, ans=0.95 2023-09-29 10:27:41,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:27:41,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:27:41,465 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 10:27:41,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 10:27:42,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 10:27:44,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:27:46,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 10:27:46,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 10:27:48,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:27:48,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 10:27:52,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 10:27:53,794 INFO [train.py:1039] (3/4) Epoch 10, batch 2550, loss[loss=0.2018, simple_loss=0.2657, pruned_loss=0.06895, over 22766.00 frames. ], tot_loss[loss=0.207, simple_loss=0.2757, pruned_loss=0.06912, over 4704801.00 frames. ], batch size: 322, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:27:56,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:27:57,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:27:59,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:28:00,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:28:02,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 10:28:03,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:28:06,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 10:28:08,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:28:08,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=335786.6666666667, ans=0.125 2023-09-29 10:28:09,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:12,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:28:12,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 10:28:12,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:14,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:14,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:16,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:28:16,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 10:28:17,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 10:28:18,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 10:28:33,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:28:38,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:38,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:40,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:28:40,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:28:43,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=335920.0, ans=0.125 2023-09-29 10:28:46,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:28:48,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:28:48,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:28:48,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:28:49,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:28:49,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:28:52,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.47 vs. limit=6.0 2023-09-29 10:28:53,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:28:53,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:28:58,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:28:58,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 10:28:58,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:29:00,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:00,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:29:01,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:29:03,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:09,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:29:10,862 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.915e+02 2.103e+02 2.425e+02 3.393e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-29 10:29:11,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:14,188 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 10:29:17,143 INFO [train.py:1039] (3/4) Epoch 10, batch 2600, loss[loss=0.2804, simple_loss=0.3229, pruned_loss=0.119, over 19014.00 frames. ], tot_loss[loss=0.2077, simple_loss=0.2764, pruned_loss=0.0695, over 4698785.73 frames. ], batch size: 388, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:29:17,233 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 10:29:17,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:29:17,326 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 10:29:18,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 10:29:18,871 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 10:29:20,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:29:22,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 10:29:23,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 10:29:25,460 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 10:29:26,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:29:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 10:29:30,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 10:29:32,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:29:32,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 10:29:35,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 10:29:35,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 10:29:43,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:43,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:43,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:29:43,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 10:29:44,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.52 vs. limit=10.0 2023-09-29 10:29:46,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:29:50,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=336186.6666666667, ans=0.0 2023-09-29 10:29:51,441 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 10:29:56,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:29:56,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:29:58,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 10:29:58,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:29:58,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:30:00,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 10:30:01,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=336186.6666666667, ans=0.0 2023-09-29 10:30:03,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:30:03,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:30:05,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,781 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 10:30:09,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:09,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:30:10,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=336253.3333333333, ans=0.125 2023-09-29 10:30:14,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.76 vs. limit=15.0 2023-09-29 10:30:17,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:30:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:30:19,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 10:30:19,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:30:19,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=336253.3333333333, ans=0.125 2023-09-29 10:30:22,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:30:23,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:30,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 10:30:32,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:33,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:30:38,151 INFO [train.py:1039] (3/4) Epoch 10, batch 2650, loss[loss=0.2054, simple_loss=0.2872, pruned_loss=0.06175, over 24344.00 frames. ], tot_loss[loss=0.2091, simple_loss=0.2775, pruned_loss=0.07039, over 4698738.32 frames. ], batch size: 74, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:30:40,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 10:30:40,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:40,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:30:40,690 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 10:30:42,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:30:45,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:30:48,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:30:50,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:30:52,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:30:55,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 10:30:55,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:30:55,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:30:58,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 10:30:58,807 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 10:31:01,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:05,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 10:31:05,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:07,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 10:31:10,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=336520.0, ans=10.0 2023-09-29 10:31:11,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:31:11,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:11,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:16,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 10:31:16,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 10:31:21,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=336520.0, ans=0.125 2023-09-29 10:31:22,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:31:24,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 10:31:24,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:31:26,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:26,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:26,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:26,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:28,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:31:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:30,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:31:30,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=336586.6666666667, ans=0.125 2023-09-29 10:31:31,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:31:33,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:31:34,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:36,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:31:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:41,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:31:41,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:31:44,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=336653.3333333333, ans=0.125 2023-09-29 10:31:45,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:45,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:31:45,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:31:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 10:31:50,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:31:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:54,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:31:56,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:31:57,509 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.958e+02 2.205e+02 2.606e+02 4.713e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-29 10:31:57,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:31:59,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:00,536 INFO [train.py:1039] (3/4) Epoch 10, batch 2700, loss[loss=0.2975, simple_loss=0.3435, pruned_loss=0.1257, over 19665.00 frames. ], tot_loss[loss=0.2096, simple_loss=0.2783, pruned_loss=0.07045, over 4707102.26 frames. ], batch size: 388, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:32:02,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:02,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 10:32:04,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:06,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:32:07,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:32:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:09,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=336720.0, ans=0.125 2023-09-29 10:32:10,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:32:10,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:32:12,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:32:12,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 10:32:12,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 10:32:12,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:32:14,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:32:14,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=336720.0, ans=0.2 2023-09-29 10:32:16,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:32:16,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:32:20,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:32:20,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 10:32:22,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:32:25,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:32:27,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:32:34,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:32:34,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:32:34,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:32:34,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:32:37,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:32:42,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:32:42,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:32:42,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:32:42,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=336853.3333333333, ans=0.035 2023-09-29 10:32:44,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=336853.3333333333, ans=0.125 2023-09-29 10:32:49,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:32:49,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:32:58,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:32:58,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:01,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:33:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:05,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:05,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=336986.6666666667, ans=0.0 2023-09-29 10:33:06,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:06,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:33:09,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:12,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:33:12,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:13,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:33:15,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:15,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:33:19,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=336986.6666666667, ans=0.0 2023-09-29 10:33:20,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 10:33:21,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:23,860 INFO [train.py:1039] (3/4) Epoch 10, batch 2750, loss[loss=0.2098, simple_loss=0.27, pruned_loss=0.0748, over 23513.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2783, pruned_loss=0.07057, over 4698423.73 frames. ], batch size: 134, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:33:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:33:24,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 10:33:25,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 10:33:25,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:28,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:28,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:32,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:32,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:33:33,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:36,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:33:36,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:33:38,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:33:38,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:38,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 10:33:38,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:33:38,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:33:43,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 10:33:46,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:33:48,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:48,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:33:48,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:33:49,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:33:51,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:33:51,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:53,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:33:56,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.88 vs. limit=22.5 2023-09-29 10:33:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 10:33:57,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:33:57,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:33:58,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:33:58,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:34:06,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:34:08,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:34:08,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:08,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=337186.6666666667, ans=15.0 2023-09-29 10:34:10,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.39 vs. limit=22.5 2023-09-29 10:34:13,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:34:13,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:34:14,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:34:21,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:34:22,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:34:22,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 10:34:27,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 10:34:34,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:34:37,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:34:37,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 10:34:38,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:34:40,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=337320.0, ans=0.125 2023-09-29 10:34:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:34:41,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 10:34:41,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:34:42,599 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.677e+02 2.063e+02 2.307e+02 2.543e+02 4.120e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 10:34:46,175 INFO [train.py:1039] (3/4) Epoch 10, batch 2800, loss[loss=0.2111, simple_loss=0.2652, pruned_loss=0.07852, over 23710.00 frames. ], tot_loss[loss=0.2076, simple_loss=0.2757, pruned_loss=0.0698, over 4683720.56 frames. ], batch size: 232, lr: 1.04e-02, grad_scale: 16.0 2023-09-29 10:34:46,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:34:46,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:34:48,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:34:50,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 10:34:50,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:50,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:51,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:34:51,994 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 10:34:51,995 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 10:34:56,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:34:59,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:34:59,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:35:03,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:35:04,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.05 vs. limit=12.0 2023-09-29 10:35:06,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 10:35:08,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:35:08,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 10:35:10,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:11,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:35:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:16,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:16,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:16,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:35:17,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:19,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=337520.0, ans=0.125 2023-09-29 10:35:26,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:35:26,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.min_positive, batch_count=337520.0, ans=0.05 2023-09-29 10:35:28,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:35:29,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:31,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:35:32,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:35:33,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=337520.0, ans=0.125 2023-09-29 10:35:39,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:39,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 10:35:40,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:40,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:35:40,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:35:46,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:35:47,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:51,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:35:52,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=337653.3333333333, ans=0.0 2023-09-29 10:35:53,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:35:53,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:35:53,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:35:54,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:35:54,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:35:56,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:35:56,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 10:35:56,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:35:58,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:35:58,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:00,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 10:36:02,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:02,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:36:03,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:36:05,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 10:36:05,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=337653.3333333333, ans=0.125 2023-09-29 10:36:10,393 INFO [train.py:1039] (3/4) Epoch 10, batch 2850, loss[loss=0.2083, simple_loss=0.27, pruned_loss=0.07326, over 23836.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2746, pruned_loss=0.06936, over 4673176.45 frames. ], batch size: 179, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:36:12,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:36:12,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:36:13,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:36:15,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:18,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:20,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:36:20,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:36:23,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:23,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:36:25,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:36:26,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 10:36:31,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 10:36:31,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:36:34,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 10:36:34,445 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 10:36:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:38,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 10:36:38,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 10:36:39,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=337786.6666666667, ans=0.02 2023-09-29 10:36:40,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:36:55,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:36:56,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:36:56,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:36:56,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 10:36:56,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:36:58,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:37:00,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:37:00,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 10:37:01,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:37:01,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:03,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:04,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:06,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:06,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:06,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=337920.0, ans=0.0 2023-09-29 10:37:09,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:11,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:37:14,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:37:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:16,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:37:24,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:37:25,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 10:37:25,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 10:37:25,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=337986.6666666667, ans=0.05 2023-09-29 10:37:27,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:37:28,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:28,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 10:37:28,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:37:30,083 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 2.039e+02 2.270e+02 2.590e+02 3.840e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-29 10:37:30,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:37:30,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:32,251 INFO [train.py:1039] (3/4) Epoch 10, batch 2900, loss[loss=0.2225, simple_loss=0.287, pruned_loss=0.07904, over 23684.00 frames. ], tot_loss[loss=0.2068, simple_loss=0.2749, pruned_loss=0.06939, over 4690348.12 frames. ], batch size: 232, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:37:32,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:37:32,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 10:37:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 10:37:32,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:32,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:37:34,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=338053.3333333333, ans=10.0 2023-09-29 10:37:35,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=338053.3333333333, ans=0.0 2023-09-29 10:37:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:37:37,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:37:37,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:37:38,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 10:37:43,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:43,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 10:37:45,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 10:37:47,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:37:47,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:37:49,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:37:51,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:37:55,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:37:55,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:37:57,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:37:58,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 10:37:58,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:38:00,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:00,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=338120.0, ans=0.2 2023-09-29 10:38:00,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=338120.0, ans=0.1 2023-09-29 10:38:03,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 10:38:05,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 10:38:07,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:38:07,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 10:38:07,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:38:08,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:38:08,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 10:38:11,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:38:13,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:15,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:38:19,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:20,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 10:38:20,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 10:38:20,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:38:25,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:38:27,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=338253.3333333333, ans=0.5 2023-09-29 10:38:28,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 10:38:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:38:34,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:38:34,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=338253.3333333333, ans=0.1 2023-09-29 10:38:43,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.62 vs. limit=22.5 2023-09-29 10:38:44,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:38:44,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:38:45,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 10:38:49,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:38:49,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 10:38:49,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:49,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.76 vs. limit=15.0 2023-09-29 10:38:50,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:38:54,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=338386.6666666667, ans=0.0 2023-09-29 10:38:55,135 INFO [train.py:1039] (3/4) Epoch 10, batch 2950, loss[loss=0.2127, simple_loss=0.2739, pruned_loss=0.07574, over 23422.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2756, pruned_loss=0.06958, over 4684299.78 frames. ], batch size: 285, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:38:55,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:38:58,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 10:38:58,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:38:58,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:01,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:02,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:39:04,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 10:39:04,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 10:39:04,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:39:04,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:39:11,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=338453.3333333333, ans=0.1 2023-09-29 10:39:13,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:14,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:16,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:39:16,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:19,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:39:19,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:39:21,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:22,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:39:24,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:39:25,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 10:39:31,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 10:39:31,355 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 10:39:32,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:39:33,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.68 vs. limit=15.0 2023-09-29 10:39:34,412 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 10:39:35,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 10:39:35,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:39:37,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:39:37,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 10:39:37,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:39:40,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 10:39:41,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:39:42,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:39:46,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:47,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:39:47,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:49,102 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 10:39:50,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:39:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 10:39:56,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:39:58,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:39:59,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 10:39:59,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:40:03,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 10:40:07,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:08,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:40:08,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:40:10,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:40:10,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:40:13,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:40:13,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:13,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:40:13,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:40:15,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.713e+02 2.095e+02 2.360e+02 2.809e+02 3.974e+02, threshold=4.720e+02, percent-clipped=0.0 2023-09-29 10:40:15,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:40:15,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=338720.0, ans=0.0 2023-09-29 10:40:16,513 INFO [train.py:1039] (3/4) Epoch 10, batch 3000, loss[loss=0.2456, simple_loss=0.2992, pruned_loss=0.09595, over 22830.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2766, pruned_loss=0.07002, over 4696605.75 frames. ], batch size: 323, lr: 1.04e-02, grad_scale: 8.0 2023-09-29 10:40:16,514 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 10:40:31,299 INFO [train.py:1071] (3/4) Epoch 10, validation: loss=0.2858, simple_loss=0.2843, pruned_loss=0.1436, over 1125622.00 frames. 2023-09-29 10:40:31,300 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 10:40:31,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:40:33,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:33,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 10:40:34,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:40:38,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:40:38,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:40:39,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=15.15 vs. limit=15.0 2023-09-29 10:40:43,339 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 10:40:43,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 10:40:46,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:40:46,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:40:47,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 10:40:49,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:40:51,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=338786.6666666667, ans=0.125 2023-09-29 10:40:55,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:41:06,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:41:06,731 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.36 vs. limit=10.0 2023-09-29 10:41:14,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 10:41:16,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:41:19,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:41:19,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:41:19,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:41:21,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:21,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 10:41:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 10:41:24,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:41:24,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:41:28,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:41:28,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:28,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:28,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:41:32,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=338920.0, ans=0.125 2023-09-29 10:41:35,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:41:35,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:41:35,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:41:36,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:41:39,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 10:41:39,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.95 vs. limit=10.0 2023-09-29 10:41:40,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:41:40,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:41:40,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:41:45,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:45,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:47,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 10:41:47,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 10:41:47,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:41:49,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 10:41:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:41:51,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=338986.6666666667, ans=0.0 2023-09-29 10:41:52,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 10:41:54,029 INFO [train.py:1039] (3/4) Epoch 10, batch 3050, loss[loss=0.3345, simple_loss=0.358, pruned_loss=0.1555, over 19698.00 frames. ], tot_loss[loss=0.2082, simple_loss=0.2766, pruned_loss=0.06991, over 4706345.11 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:41:54,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:41:54,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:41:54,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 10:41:55,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 10:41:55,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:41:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:41:57,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=339053.3333333333, ans=0.125 2023-09-29 10:41:58,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:41:58,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:41:58,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:00,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:42:01,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 10:42:05,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:07,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:07,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:42:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:15,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 10:42:22,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 10:42:22,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 10:42:22,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:27,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:42:30,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:30,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:31,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:33,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=339186.6666666667, ans=0.0 2023-09-29 10:42:34,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:42:34,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:42:34,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:37,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:42:37,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:37,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:38,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:42:40,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:42:41,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 10:42:43,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:42:43,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:42:46,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:42:46,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 10:42:48,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:42:48,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:42:54,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:42:56,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:02,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:02,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:02,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:03,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:05,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:43:05,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:43:07,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 10:43:08,086 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.17 vs. limit=15.0 2023-09-29 10:43:09,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:43:09,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:10,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 10:43:10,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=339320.0, ans=0.0 2023-09-29 10:43:12,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:15,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 1.990e+02 2.334e+02 2.687e+02 4.208e+02, threshold=4.668e+02, percent-clipped=0.0 2023-09-29 10:43:16,561 INFO [train.py:1039] (3/4) Epoch 10, batch 3100, loss[loss=0.238, simple_loss=0.2974, pruned_loss=0.08929, over 23862.00 frames. ], tot_loss[loss=0.2089, simple_loss=0.2774, pruned_loss=0.07014, over 4715718.46 frames. ], batch size: 164, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:43:18,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:43:19,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:43:21,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 10:43:25,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 10:43:28,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 10:43:28,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 10:43:29,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:43:30,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=339386.6666666667, ans=0.0 2023-09-29 10:43:31,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=339453.3333333333, ans=0.1 2023-09-29 10:43:34,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:43:34,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:36,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:43:37,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.12 vs. limit=15.0 2023-09-29 10:43:40,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=339453.3333333333, ans=0.2 2023-09-29 10:43:41,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:47,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 10:43:48,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=339520.0, ans=15.0 2023-09-29 10:43:50,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:43:50,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:43:52,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:43:52,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:43:53,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:43:55,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:43:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 10:43:55,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:43:56,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:43:58,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 10:44:00,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:04,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:44:05,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 10:44:07,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 10:44:08,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:08,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:44:10,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:10,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:10,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:44:12,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:44:12,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:44:14,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:44:14,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=339586.6666666667, ans=0.1 2023-09-29 10:44:15,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:44:15,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:15,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 10:44:15,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=339586.6666666667, ans=0.0 2023-09-29 10:44:18,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:44:20,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 10:44:23,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:44:23,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 10:44:25,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:25,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:25,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 10:44:28,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=339653.3333333333, ans=0.125 2023-09-29 10:44:32,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=339653.3333333333, ans=0.09899494936611666 2023-09-29 10:44:37,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 10:44:38,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.11 vs. limit=15.0 2023-09-29 10:44:38,896 INFO [train.py:1039] (3/4) Epoch 10, batch 3150, loss[loss=0.1803, simple_loss=0.2606, pruned_loss=0.05003, over 24357.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2759, pruned_loss=0.0694, over 4716216.87 frames. ], batch size: 61, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:44:41,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:42,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:44:43,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:44:43,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:44:45,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 10:44:45,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:44:47,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 10:44:48,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 10:44:48,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:49,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=339720.0, ans=0.025 2023-09-29 10:44:51,758 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 10:44:54,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 10:44:56,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:44:56,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 10:44:57,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 10:44:58,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=339786.6666666667, ans=0.125 2023-09-29 10:44:59,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 10:44:59,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 10:44:59,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 10:44:59,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:44:59,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:00,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:45:04,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 10:45:06,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:06,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:45:07,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:09,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 10:45:13,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 10:45:14,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:45:16,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 10:45:16,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:45:17,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 10:45:22,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 10:45:22,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:45:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:45:24,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 10:45:24,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:24,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:45:25,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:45:25,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 10:45:27,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 10:45:27,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 10:45:27,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:28,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:45:30,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:45:31,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 10:45:31,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:33,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 10:45:34,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:34,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 10:45:37,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 10:45:38,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:45:40,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:45:40,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 10:45:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 10:45:43,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:45:47,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:45:49,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:49,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:45:55,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:45:55,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:45:57,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 10:45:58,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.693e+02 1.980e+02 2.239e+02 2.536e+02 3.813e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 10:46:00,315 INFO [train.py:1039] (3/4) Epoch 10, batch 3200, loss[loss=0.1731, simple_loss=0.2512, pruned_loss=0.04748, over 24326.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2741, pruned_loss=0.06886, over 4691876.85 frames. ], batch size: 56, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:46:00,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:04,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:46:04,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 10:46:08,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:46:10,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 10:46:10,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:11,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:46:12,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=340053.3333333333, ans=0.125 2023-09-29 10:46:16,801 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=7.92 vs. limit=15.0 2023-09-29 10:46:17,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:46:17,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=340120.0, ans=0.2 2023-09-29 10:46:21,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:46:26,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=340120.0, ans=0.125 2023-09-29 10:46:27,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=340120.0, ans=0.125 2023-09-29 10:46:31,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:46:31,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=340120.0, ans=0.0 2023-09-29 10:46:33,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.60 vs. limit=15.0 2023-09-29 10:46:36,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=340186.6666666667, ans=0.125 2023-09-29 10:46:42,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 10:46:42,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:46:44,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 10:46:44,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=340186.6666666667, ans=0.0 2023-09-29 10:46:45,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:46:50,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:46:50,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:46:51,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:46:56,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 10:46:58,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 10:46:59,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 10:47:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 10:47:04,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:47:05,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=340253.3333333333, ans=0.0 2023-09-29 10:47:09,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:10,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 10:47:10,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:12,476 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 10:47:12,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:47:15,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:18,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 10:47:18,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 10:47:20,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 10:47:21,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 10:47:23,701 INFO [train.py:1039] (3/4) Epoch 10, batch 3250, loss[loss=0.1996, simple_loss=0.2833, pruned_loss=0.0579, over 24285.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2747, pruned_loss=0.06873, over 4702793.56 frames. ], batch size: 74, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:47:23,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:47:29,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:47:29,633 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 10:47:29,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:47:29,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:32,554 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 10:47:37,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:47:40,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:47:44,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=340453.3333333333, ans=0.0 2023-09-29 10:47:47,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:47:47,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 10:47:48,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:47:48,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:47:48,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:47:50,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:47:50,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:47:51,073 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.32 vs. limit=10.0 2023-09-29 10:47:53,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:54,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:47:56,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:47:56,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:47:56,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:47:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:00,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:48:03,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:03,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:48:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:48:05,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:48:05,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:11,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 10:48:11,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:48:11,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:48:12,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:14,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:48:19,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:48:22,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.31 vs. limit=15.0 2023-09-29 10:48:28,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:29,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:29,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 10:48:29,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:48:29,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:48:29,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:32,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 10:48:32,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 10:48:33,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:48:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:35,704 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=6.55 vs. limit=12.0 2023-09-29 10:48:36,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:36,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 10:48:37,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:48:42,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:48:42,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:44,368 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 2.128e+02 2.401e+02 2.998e+02 4.766e+02, threshold=4.802e+02, percent-clipped=1.0 2023-09-29 10:48:44,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 10:48:44,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:48:46,485 INFO [train.py:1039] (3/4) Epoch 10, batch 3300, loss[loss=0.2325, simple_loss=0.3024, pruned_loss=0.08127, over 23447.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2759, pruned_loss=0.06934, over 4715682.65 frames. ], batch size: 106, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:48:46,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:48:46,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 10:48:49,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:48:51,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 10:48:52,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 10:48:52,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 10:48:53,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:48:56,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:48:57,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:48:57,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:48:58,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=340720.0, ans=0.0 2023-09-29 10:48:59,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 10:49:00,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 10:49:01,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=340786.6666666667, ans=0.1 2023-09-29 10:49:03,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:05,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:05,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=340786.6666666667, ans=0.0 2023-09-29 10:49:10,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 10:49:12,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:12,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:14,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:15,715 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 10:49:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:49:17,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:49:19,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 10:49:19,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:49:20,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 10:49:22,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:22,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 10:49:24,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:24,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 10:49:25,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 10:49:25,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:27,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:49:30,403 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 10:49:33,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 10:49:33,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:49:35,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 10:49:38,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:49:40,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 10:49:41,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:49:42,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.94 vs. limit=15.0 2023-09-29 10:49:45,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:49:45,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:45,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:49:45,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:49:48,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:49:48,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:49:49,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:49:50,675 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 10:49:52,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 10:49:55,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 10:49:57,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:49:57,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:49:57,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=340986.6666666667, ans=0.125 2023-09-29 10:49:58,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:49:58,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:00,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:50:00,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:00,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:50:00,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:02,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=340986.6666666667, ans=0.0 2023-09-29 10:50:03,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:50:05,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 10:50:05,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:06,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:08,280 INFO [train.py:1039] (3/4) Epoch 10, batch 3350, loss[loss=0.1922, simple_loss=0.2682, pruned_loss=0.0581, over 24661.00 frames. ], tot_loss[loss=0.208, simple_loss=0.2771, pruned_loss=0.0694, over 4716745.27 frames. ], batch size: 68, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:50:08,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 10:50:08,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:50:09,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:11,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:50:11,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:15,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:50:15,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=341053.3333333333, ans=0.05 2023-09-29 10:50:17,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:19,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:50:23,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:23,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:50:26,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:27,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:50:28,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 10:50:30,656 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 10:50:30,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:50:35,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 10:50:35,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 10:50:35,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 10:50:35,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:50:36,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=341120.0, ans=0.1 2023-09-29 10:50:38,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:50:38,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 10:50:38,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:38,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:50:41,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:42,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:44,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:50:45,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:50:49,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:52,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:50:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:50:56,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:50:56,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:00,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:00,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:01,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:05,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 10:51:05,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 10:51:05,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 10:51:05,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:51:06,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 10:51:08,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:09,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:51:16,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:17,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 10:51:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:20,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:51:22,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:51:28,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:30,303 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.041e+02 2.250e+02 2.635e+02 4.628e+02, threshold=4.499e+02, percent-clipped=0.0 2023-09-29 10:51:30,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 10:51:31,823 INFO [train.py:1039] (3/4) Epoch 10, batch 3400, loss[loss=0.2446, simple_loss=0.3052, pruned_loss=0.09196, over 23596.00 frames. ], tot_loss[loss=0.2092, simple_loss=0.2786, pruned_loss=0.06995, over 4717316.08 frames. ], batch size: 94, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:51:31,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 10:51:31,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:51:33,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:51:35,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 10:51:37,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:51:37,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 10:51:38,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:39,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=341386.6666666667, ans=0.0 2023-09-29 10:51:40,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:51:40,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 10:51:41,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:51:41,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 10:51:43,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 10:51:43,916 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 10:51:45,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:51:50,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:51:50,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 10:51:51,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:51:52,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 10:51:59,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:51:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 10:51:59,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=341453.3333333333, ans=0.125 2023-09-29 10:52:04,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:52:08,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:09,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:11,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 10:52:16,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:52:19,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 10:52:24,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:52:25,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 10:52:26,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:52:27,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:52:27,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:52:28,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:52:32,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:52:35,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 10:52:35,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:52:42,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:52:44,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 10:52:51,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:52:51,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=341653.3333333333, ans=0.125 2023-09-29 10:52:53,962 INFO [train.py:1039] (3/4) Epoch 10, batch 3450, loss[loss=0.1709, simple_loss=0.253, pruned_loss=0.04438, over 24668.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2776, pruned_loss=0.06946, over 4721318.68 frames. ], batch size: 65, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:52:55,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 10:52:58,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=341720.0, ans=0.125 2023-09-29 10:53:00,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 10:53:00,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:01,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:53:01,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 10:53:03,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:53:06,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:53:10,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:53:12,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:13,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:53:13,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:16,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:24,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 10:53:28,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 10:53:28,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 10:53:28,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:53:30,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:53:36,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 10:53:38,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:53:41,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:53:41,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:53:43,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 10:53:45,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:53:46,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 10:53:46,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:53:50,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:53:52,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:53:55,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 10:53:59,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:54:04,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:54:06,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:09,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:12,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:12,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:54:13,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:54:13,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:54:16,393 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.481e+02 2.010e+02 2.253e+02 2.520e+02 3.608e+02, threshold=4.507e+02, percent-clipped=0.0 2023-09-29 10:54:16,438 INFO [train.py:1039] (3/4) Epoch 10, batch 3500, loss[loss=0.2094, simple_loss=0.2445, pruned_loss=0.08714, over 19294.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2755, pruned_loss=0.06847, over 4717010.90 frames. ], batch size: 388, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:54:17,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.87 vs. limit=15.0 2023-09-29 10:54:18,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:21,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:54:23,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 10:54:25,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 10:54:25,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=342053.3333333333, ans=0.125 2023-09-29 10:54:28,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 10:54:31,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:54:31,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 10:54:35,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 10:54:36,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:54:38,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:54:38,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:54:38,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 10:54:39,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:39,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:39,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 10:54:42,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:44,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 10:54:46,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:49,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=342186.6666666667, ans=0.0 2023-09-29 10:54:51,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:51,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 10:54:51,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:54:51,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=342186.6666666667, ans=0.1 2023-09-29 10:54:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:54:55,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 10:54:58,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:54:59,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 10:54:59,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:00,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=342186.6666666667, ans=0.025 2023-09-29 10:55:01,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 10:55:01,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 10:55:02,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 10:55:04,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:55:05,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:05,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:07,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 10:55:10,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 10:55:10,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 10:55:15,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:16,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 10:55:16,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 10:55:16,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:55:20,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:20,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:22,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:25,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 10:55:26,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:55:26,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:55:29,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 10:55:30,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 10:55:33,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:55:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:55:35,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:35,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:38,073 INFO [train.py:1039] (3/4) Epoch 10, batch 3550, loss[loss=0.2103, simple_loss=0.2746, pruned_loss=0.07294, over 23262.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2748, pruned_loss=0.06819, over 4725698.53 frames. ], batch size: 119, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:55:39,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 10:55:44,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:55:46,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=342386.6666666667, ans=0.1 2023-09-29 10:55:47,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.62 vs. limit=10.0 2023-09-29 10:55:48,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 10:55:50,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:55:52,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 10:55:55,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:55:55,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:55:55,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=342453.3333333333, ans=0.0 2023-09-29 10:55:56,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 10:56:00,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:00,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:56:00,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:00,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 10:56:02,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 10:56:07,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=342453.3333333333, ans=0.5 2023-09-29 10:56:07,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=342453.3333333333, ans=0.0 2023-09-29 10:56:08,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 10:56:08,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 10:56:10,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:10,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:56:11,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:56:11,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 10:56:11,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:13,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:14,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 10:56:19,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:19,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:56:21,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:56:24,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 10:56:24,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:56:26,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.62 vs. limit=15.0 2023-09-29 10:56:26,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 10:56:27,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 10:56:30,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 10:56:30,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 10:56:33,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 10:56:33,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=342586.6666666667, ans=0.0 2023-09-29 10:56:35,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:40,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:56:42,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 10:56:42,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:47,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:56:48,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 10:56:49,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=342653.3333333333, ans=0.125 2023-09-29 10:56:53,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=342653.3333333333, ans=0.0 2023-09-29 10:56:55,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 10:56:55,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:56:55,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 10:56:57,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:56:58,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 10:57:02,293 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 2.051e+02 2.303e+02 2.669e+02 4.347e+02, threshold=4.606e+02, percent-clipped=0.0 2023-09-29 10:57:02,337 INFO [train.py:1039] (3/4) Epoch 10, batch 3600, loss[loss=0.191, simple_loss=0.2595, pruned_loss=0.06123, over 23310.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2742, pruned_loss=0.06757, over 4739184.58 frames. ], batch size: 119, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 10:57:03,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:05,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:07,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 10:57:07,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 10:57:09,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:09,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 10:57:14,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:57:15,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:20,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:23,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:24,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 10:57:24,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:57:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 10:57:26,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 10:57:28,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:57:29,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 10:57:31,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:33,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:57:33,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:57:35,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=342853.3333333333, ans=0.125 2023-09-29 10:57:36,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 10:57:40,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=342853.3333333333, ans=0.05 2023-09-29 10:57:43,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:57:45,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:57:45,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 10:57:49,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:57:53,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:57:56,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:01,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 10:58:01,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 10:58:01,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 10:58:03,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=342920.0, ans=0.125 2023-09-29 10:58:05,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 10:58:05,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 10:58:07,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=342986.6666666667, ans=0.125 2023-09-29 10:58:08,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 10:58:08,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 10:58:10,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 10:58:11,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:11,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 10:58:11,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:13,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 10:58:14,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 10:58:18,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:58:18,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 10:58:20,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=342986.6666666667, ans=0.025 2023-09-29 10:58:25,478 INFO [train.py:1039] (3/4) Epoch 10, batch 3650, loss[loss=0.2009, simple_loss=0.2614, pruned_loss=0.07021, over 23870.00 frames. ], tot_loss[loss=0.2043, simple_loss=0.2741, pruned_loss=0.06731, over 4746171.18 frames. ], batch size: 212, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:58:25,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 10:58:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 10:58:27,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=343053.3333333333, ans=0.035 2023-09-29 10:58:30,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 10:58:33,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 10:58:36,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:58:36,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 10:58:36,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 10:58:40,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343120.0, ans=0.1 2023-09-29 10:58:41,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 10:58:41,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 10:58:42,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 10:58:44,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 10:58:45,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:58:45,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 10:58:46,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 10:58:48,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:58:48,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:58:49,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 10:58:50,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=343120.0, ans=0.125 2023-09-29 10:58:50,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=343120.0, ans=0.125 2023-09-29 10:58:52,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 10:58:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 10:58:54,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:58:56,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 10:58:59,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:58:59,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 10:59:02,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 10:59:04,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:04,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 10:59:06,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 10:59:06,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 10:59:07,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 10:59:09,989 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.70 vs. limit=6.0 2023-09-29 10:59:12,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:14,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:14,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 10:59:16,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=343253.3333333333, ans=0.125 2023-09-29 10:59:17,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 10:59:19,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 10:59:19,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:23,309 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.46 vs. limit=12.0 2023-09-29 10:59:25,649 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 10:59:29,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:29,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 10:59:31,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 10:59:31,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:33,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 10:59:33,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=343320.0, ans=0.125 2023-09-29 10:59:35,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:37,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 10:59:37,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:40,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 10:59:43,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 10:59:43,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 10:59:47,275 INFO [train.py:1039] (3/4) Epoch 10, batch 3700, loss[loss=0.2109, simple_loss=0.2768, pruned_loss=0.07254, over 23463.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.2746, pruned_loss=0.06755, over 4730213.74 frames. ], batch size: 106, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 10:59:47,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:47,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 10:59:47,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 10:59:48,842 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.980e+02 2.218e+02 2.465e+02 4.377e+02, threshold=4.435e+02, percent-clipped=0.0 2023-09-29 10:59:48,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 10:59:49,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 10:59:53,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 10:59:57,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 10:59:58,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 10:59:58,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 10:59:58,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 10:59:59,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=343386.6666666667, ans=0.125 2023-09-29 11:00:00,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:00:02,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:04,373 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 11:00:13,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:00:15,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:00:17,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:00:17,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 11:00:17,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:19,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=343520.0, ans=0.0 2023-09-29 11:00:20,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:20,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 11:00:22,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:23,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:00:24,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343520.0, ans=0.1 2023-09-29 11:00:25,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:00:25,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:00:28,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:00:30,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343520.0, ans=0.1 2023-09-29 11:00:31,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=343520.0, ans=0.125 2023-09-29 11:00:32,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:00:32,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 11:00:34,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:00:34,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 11:00:40,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:00:40,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:00:40,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=343586.6666666667, ans=0.0 2023-09-29 11:00:44,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:44,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 11:00:45,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:00:45,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:00:47,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:47,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:00:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:00:52,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 11:00:53,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 11:00:55,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:00:55,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:00:57,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:00:57,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.93 vs. limit=12.0 2023-09-29 11:00:58,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:00:58,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=343653.3333333333, ans=0.125 2023-09-29 11:01:01,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:01:03,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:01:04,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:07,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 11:01:08,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:01:10,105 INFO [train.py:1039] (3/4) Epoch 10, batch 3750, loss[loss=0.1742, simple_loss=0.2442, pruned_loss=0.05209, over 24438.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2749, pruned_loss=0.06759, over 4740115.81 frames. ], batch size: 58, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:01:11,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:01:11,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 11:01:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:01:15,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:01:17,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:01:22,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:26,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:01:28,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:01:30,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:01:31,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:32,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=15.0 2023-09-29 11:01:33,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 11:01:34,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:35,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:35,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:01:38,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 11:01:43,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 11:01:44,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=343853.3333333333, ans=0.1 2023-09-29 11:01:45,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:01:45,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:01:47,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:01:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:01:55,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:01:56,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=343853.3333333333, ans=0.0 2023-09-29 11:01:58,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 11:02:02,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:07,921 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-09-29 11:02:08,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:02:08,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:02:11,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:02:16,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 11:02:17,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:02:18,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.whiten.whitening_limit, batch_count=343986.6666666667, ans=12.0 2023-09-29 11:02:21,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:02:23,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:02:25,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:02:33,615 INFO [train.py:1039] (3/4) Epoch 10, batch 3800, loss[loss=0.2072, simple_loss=0.2862, pruned_loss=0.06408, over 24289.00 frames. ], tot_loss[loss=0.205, simple_loss=0.2751, pruned_loss=0.06743, over 4740379.52 frames. ], batch size: 74, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:02:35,061 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.030e+02 2.447e+02 3.016e+02 6.033e+02, threshold=4.894e+02, percent-clipped=2.0 2023-09-29 11:02:35,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:02:38,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:38,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:02:40,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 11:02:40,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:43,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:02:44,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:02:45,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:02:45,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:02:46,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:02:48,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:02:48,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:02:50,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:02:50,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 11:02:53,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:02:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:02:55,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=344120.0, ans=0.0 2023-09-29 11:02:58,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:02,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:03:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:03:05,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:03:05,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:08,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:11,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:03:14,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=344186.6666666667, ans=0.2 2023-09-29 11:03:16,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:03:16,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 11:03:17,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:23,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:28,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=344253.3333333333, ans=0.0 2023-09-29 11:03:29,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:03:33,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 11:03:34,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 11:03:36,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:03:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:03:39,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:41,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 11:03:44,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 11:03:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 11:03:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:03:45,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:03:50,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:03:52,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:03:55,615 INFO [train.py:1039] (3/4) Epoch 10, batch 3850, loss[loss=0.1998, simple_loss=0.2626, pruned_loss=0.06855, over 23648.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.2741, pruned_loss=0.06772, over 4719729.63 frames. ], batch size: 149, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:03:59,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:04:00,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 11:04:02,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:04:03,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:07,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:04:10,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:12,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:04:12,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 11:04:17,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:19,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:04:21,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:21,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:04:21,972 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.66 vs. limit=15.0 2023-09-29 11:04:25,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:25,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:04:25,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:04:25,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:04:28,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:30,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=344520.0, ans=0.1 2023-09-29 11:04:31,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:04:31,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:31,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:04:33,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 11:04:33,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 11:04:35,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:04:35,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:38,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=344520.0, ans=0.1 2023-09-29 11:04:40,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:40,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:40,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 11:04:43,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 11:04:44,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:47,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 11:04:49,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 11:04:54,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:04:55,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:04:59,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:01,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 11:05:04,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 11:05:07,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:08,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:11,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:05:11,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:05:12,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:05:14,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 11:05:15,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:05:17,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 11:05:17,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:19,082 INFO [train.py:1039] (3/4) Epoch 10, batch 3900, loss[loss=0.2345, simple_loss=0.3047, pruned_loss=0.08215, over 23881.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2733, pruned_loss=0.0677, over 4709528.81 frames. ], batch size: 86, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:05:19,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:19,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:05:20,644 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.918e+02 2.148e+02 2.537e+02 4.144e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-29 11:05:20,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:22,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:05:22,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:05:22,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:05:23,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:23,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 11:05:23,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:29,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:29,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:30,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:05:32,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:05:34,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:05:34,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:35,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:05:35,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=344786.6666666667, ans=0.0 2023-09-29 11:05:36,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 11:05:36,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:05:39,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 11:05:39,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:05:39,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 11:05:42,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 11:05:46,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:48,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:05:48,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:05:48,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:05:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:05:55,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:05:56,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:05:56,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:05:58,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:06:01,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=344853.3333333333, ans=0.0 2023-09-29 11:06:03,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:03,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:06:07,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=344853.3333333333, ans=0.05 2023-09-29 11:06:12,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:06:14,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:06:24,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:06:26,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.71 vs. limit=12.0 2023-09-29 11:06:28,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:30,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 11:06:30,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 11:06:30,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:06:33,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 11:06:34,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:06:34,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 11:06:35,553 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.15 vs. limit=6.0 2023-09-29 11:06:42,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:06:42,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 11:06:43,927 INFO [train.py:1039] (3/4) Epoch 10, batch 3950, loss[loss=0.1949, simple_loss=0.2768, pruned_loss=0.0565, over 24304.00 frames. ], tot_loss[loss=0.2047, simple_loss=0.2733, pruned_loss=0.06805, over 4700870.08 frames. ], batch size: 74, lr: 1.03e-02, grad_scale: 8.0 2023-09-29 11:06:44,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:06:44,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=345053.3333333333, ans=0.0 2023-09-29 11:06:47,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:06:48,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:06:57,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=345053.3333333333, ans=0.0 2023-09-29 11:06:58,397 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 11:06:58,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:06:58,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 11:06:58,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=345120.0, ans=0.1 2023-09-29 11:07:00,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 11:07:00,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:03,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:03,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:07:03,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:07:07,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 11:07:08,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:07:10,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:07:10,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:07:10,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:07:10,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:07:22,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:07:22,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:07:29,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 11:07:34,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 11:07:34,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 11:07:36,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:07:37,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:07:38,490 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=23.61 vs. limit=22.5 2023-09-29 11:07:47,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:07:47,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:07:47,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:07:47,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:07:47,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 11:07:54,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=345320.0, ans=0.1 2023-09-29 11:07:55,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:07:57,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:07:59,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 11:08:07,682 INFO [train.py:1039] (3/4) Epoch 10, batch 4000, loss[loss=0.2187, simple_loss=0.2892, pruned_loss=0.0741, over 24054.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.2743, pruned_loss=0.06834, over 4698855.71 frames. ], batch size: 86, lr: 1.03e-02, grad_scale: 16.0 2023-09-29 11:08:09,113 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 2.046e+02 2.407e+02 2.777e+02 6.014e+02, threshold=4.814e+02, percent-clipped=1.0 2023-09-29 11:08:10,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:17,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:22,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:23,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:08:23,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:08:23,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 11:08:25,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:08:26,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 11:08:26,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:08:26,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 11:08:31,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:08:34,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:08:34,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:08:34,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:08:34,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:34,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:08:36,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff2.min_abs, batch_count=345453.3333333333, ans=0.1 2023-09-29 11:08:37,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:08:39,360 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 11:08:40,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:08:41,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:08:43,978 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 11:08:44,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:08:44,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:08:52,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 11:08:52,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=345520.0, ans=0.125 2023-09-29 11:08:53,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:08:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:08:56,855 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 11:08:58,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:08:58,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 11:08:58,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:00,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:01,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:09:01,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=345586.6666666667, ans=0.0 2023-09-29 11:09:03,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:09:03,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:09:05,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:09:08,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 11:09:08,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:09:10,464 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 11:09:10,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=345586.6666666667, ans=0.125 2023-09-29 11:09:16,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:09:18,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 11:09:20,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:09:21,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:23,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:09:24,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:29,483 INFO [train.py:1039] (3/4) Epoch 10, batch 4050, loss[loss=0.1999, simple_loss=0.286, pruned_loss=0.05685, over 24423.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2749, pruned_loss=0.06877, over 4693420.57 frames. ], batch size: 77, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:09:29,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:09:32,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:09:34,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 11:09:37,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:09:37,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:09:38,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:09:38,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:09:40,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:09:49,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:09:49,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 11:09:50,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:09:50,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:09:55,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:09:57,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:10:00,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 11:10:02,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 11:10:02,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=345853.3333333333, ans=0.0 2023-09-29 11:10:03,781 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 11:10:05,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:10:05,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=345853.3333333333, ans=0.125 2023-09-29 11:10:11,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 11:10:11,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:17,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:20,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:10:21,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=345920.0, ans=0.125 2023-09-29 11:10:22,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:10:22,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:10:25,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:10:28,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 11:10:28,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:10:29,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:32,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 11:10:36,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:10:44,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 11:10:45,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:10:45,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:10:47,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 11:10:47,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 11:10:47,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:10:50,227 INFO [train.py:1039] (3/4) Epoch 10, batch 4100, loss[loss=0.2135, simple_loss=0.2917, pruned_loss=0.06761, over 24280.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2761, pruned_loss=0.06928, over 4689981.14 frames. ], batch size: 74, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:10:50,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:10:52,478 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.996e+02 2.168e+02 2.450e+02 3.987e+02, threshold=4.335e+02, percent-clipped=0.0 2023-09-29 11:10:52,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:10:52,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:11:01,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 11:11:04,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 11:11:05,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 11:11:08,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 11:11:08,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:09,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:09,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:11:11,339 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 11:11:14,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:14,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:11:15,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:11:17,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:11:20,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:11:21,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:11:23,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:11:23,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 11:11:23,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:23,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:11:23,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:23,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:11:23,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346186.6666666667, ans=0.1 2023-09-29 11:11:25,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 11:11:27,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:28,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 11:11:29,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=346186.6666666667, ans=0.2 2023-09-29 11:11:30,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:11:32,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:11:32,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 11:11:33,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.69 vs. limit=8.0 2023-09-29 11:11:35,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:11:35,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:11:37,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:11:37,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=346186.6666666667, ans=0.125 2023-09-29 11:11:38,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 11:11:40,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:11:40,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:11:44,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 11:11:44,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:11:45,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:11:48,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:11:49,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.56 vs. limit=6.0 2023-09-29 11:11:53,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:11:55,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=346320.0, ans=0.125 2023-09-29 11:11:56,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:11:58,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:12:06,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:06,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:12:09,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:12:12,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:12:13,821 INFO [train.py:1039] (3/4) Epoch 10, batch 4150, loss[loss=0.2125, simple_loss=0.2667, pruned_loss=0.07913, over 23653.00 frames. ], tot_loss[loss=0.2083, simple_loss=0.2769, pruned_loss=0.06986, over 4693673.63 frames. ], batch size: 232, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:12:17,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:12:19,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:12:19,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:12:19,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:19,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=346386.6666666667, ans=0.1 2023-09-29 11:12:23,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 11:12:23,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:23,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 11:12:23,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 11:12:25,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 11:12:26,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:12:31,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:12:31,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:36,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:12:37,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=346453.3333333333, ans=0.0 2023-09-29 11:12:38,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:12:38,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:12:39,372 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.18 vs. limit=10.0 2023-09-29 11:12:40,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:12:40,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:12:41,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=346453.3333333333, ans=0.125 2023-09-29 11:12:42,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:12:46,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:12:52,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:12:52,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 11:12:53,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=9.44 vs. limit=15.0 2023-09-29 11:12:56,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 11:12:56,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:12:56,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 11:12:56,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:12:56,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:00,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=346520.0, ans=0.125 2023-09-29 11:13:01,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:01,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:04,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 11:13:07,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:08,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:10,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 11:13:11,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:13:13,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 11:13:13,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=346586.6666666667, ans=0.0 2023-09-29 11:13:15,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:13:18,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:13:18,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:18,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=346653.3333333333, ans=0.1 2023-09-29 11:13:19,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 11:13:19,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:19,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:13:21,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:13:29,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 11:13:29,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:29,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:13:29,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:13:30,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 11:13:31,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:13:31,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 11:13:32,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:13:33,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:13:33,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 11:13:35,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:13:39,475 INFO [train.py:1039] (3/4) Epoch 10, batch 4200, loss[loss=0.2257, simple_loss=0.2955, pruned_loss=0.07794, over 23379.00 frames. ], tot_loss[loss=0.207, simple_loss=0.2754, pruned_loss=0.06927, over 4692974.27 frames. ], batch size: 93, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:13:39,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:13:39,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=346720.0, ans=0.125 2023-09-29 11:13:41,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 11:13:41,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=346720.0, ans=0.04949747468305833 2023-09-29 11:13:42,748 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 2.147e+02 2.478e+02 3.007e+02 3.865e+02, threshold=4.955e+02, percent-clipped=0.0 2023-09-29 11:13:43,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:13:45,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=346720.0, ans=0.125 2023-09-29 11:13:46,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:13:48,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:13:49,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:49,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:13:50,578 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.66 vs. limit=12.0 2023-09-29 11:13:51,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 11:13:54,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 11:13:54,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:13:56,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:13:57,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=346786.6666666667, ans=0.0 2023-09-29 11:13:58,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=346786.6666666667, ans=0.09899494936611666 2023-09-29 11:13:59,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:14:00,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=346786.6666666667, ans=0.125 2023-09-29 11:14:03,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:14:06,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:06,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:07,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 11:14:07,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:14:07,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=346786.6666666667, ans=0.125 2023-09-29 11:14:09,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:10,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:14:10,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:14:12,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:14:15,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 11:14:15,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:14:20,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:14:20,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:14:22,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:14:23,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:14:25,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:14:26,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 11:14:27,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:27,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:14:32,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:14:33,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=346920.0, ans=0.125 2023-09-29 11:14:35,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:14:42,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:14:44,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 11:14:47,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:14:48,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.51 vs. limit=10.0 2023-09-29 11:14:53,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:14:54,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:14:55,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 11:15:01,584 INFO [train.py:1039] (3/4) Epoch 10, batch 4250, loss[loss=0.1812, simple_loss=0.2537, pruned_loss=0.05431, over 24659.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2744, pruned_loss=0.06864, over 4688196.37 frames. ], batch size: 60, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:15:03,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:15:06,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:15:07,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:15:09,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:14,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:15:14,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 11:15:14,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:15:17,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:22,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:25,472 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.86 vs. limit=15.0 2023-09-29 11:15:26,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:28,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:29,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:15:29,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:15:31,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:33,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:35,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:38,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:15:39,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:41,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 11:15:43,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 11:15:43,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:15:45,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:15:46,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:15:46,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:15:48,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:15:51,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:15:51,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:15:56,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:15:58,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:15:59,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 11:15:59,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:16:01,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 11:16:03,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:16:05,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:16:06,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:06,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:16:10,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 11:16:11,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:16:11,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:16:15,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:16:18,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:21,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:16:22,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:16:23,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=347386.6666666667, ans=0.07 2023-09-29 11:16:24,809 INFO [train.py:1039] (3/4) Epoch 10, batch 4300, loss[loss=0.2085, simple_loss=0.2867, pruned_loss=0.0652, over 23999.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2733, pruned_loss=0.06791, over 4700313.79 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:16:24,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:26,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:16:27,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.974e+02 2.213e+02 2.611e+02 3.799e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 11:16:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:16:27,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 11:16:29,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:31,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=347386.6666666667, ans=0.125 2023-09-29 11:16:33,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:16:33,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=347386.6666666667, ans=0.2 2023-09-29 11:16:34,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:16:38,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:16:45,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:16:45,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 11:16:48,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:16:49,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:16:49,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:16:49,900 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 11:16:54,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:16:56,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:16:58,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=347520.0, ans=0.125 2023-09-29 11:16:59,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 11:16:59,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:16:59,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 11:17:02,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:17:04,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:17:07,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:17:07,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:17:09,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:17:11,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:11,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:17:13,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 11:17:13,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 11:17:15,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:17:18,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:17:18,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:18,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:17:18,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 11:17:18,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 11:17:20,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 11:17:20,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:20,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 11:17:21,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 11:17:25,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:27,075 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 11:17:28,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:17:28,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:28,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:17:31,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 11:17:33,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:17:33,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:33,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:17:33,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:33,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:17:35,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:17:38,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:40,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:17:40,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:17:47,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 11:17:48,399 INFO [train.py:1039] (3/4) Epoch 10, batch 4350, loss[loss=0.2012, simple_loss=0.2856, pruned_loss=0.05844, over 24332.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2747, pruned_loss=0.06885, over 4698621.45 frames. ], batch size: 74, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:17:48,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:17:52,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=347720.0, ans=0.125 2023-09-29 11:17:53,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:17:57,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:17:59,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:17:59,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:18:05,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:18:08,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:18:11,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:18:11,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:16,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:18:19,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:18:21,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:18:27,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 11:18:27,873 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.98 vs. limit=6.0 2023-09-29 11:18:28,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:28,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:30,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=347853.3333333333, ans=0.125 2023-09-29 11:18:35,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:36,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 11:18:39,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:40,868 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.45 vs. limit=15.0 2023-09-29 11:18:41,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:18:45,958 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 11:18:46,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:46,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=347920.0, ans=0.0 2023-09-29 11:18:47,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:18:47,675 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 11:18:49,097 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 11:18:49,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:50,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:18:50,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:18:50,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:18:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:18:52,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:18:55,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 11:18:55,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:55,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:18:56,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:18:56,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 11:18:57,981 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 11:18:59,926 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 11:18:59,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 11:19:03,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:19:03,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=347986.6666666667, ans=0.0 2023-09-29 11:19:04,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:19:04,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:05,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.89 vs. limit=12.0 2023-09-29 11:19:06,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:19:08,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 11:19:09,564 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 11:19:09,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:10,994 INFO [train.py:1039] (3/4) Epoch 10, batch 4400, loss[loss=0.195, simple_loss=0.2788, pruned_loss=0.05561, over 24696.00 frames. ], tot_loss[loss=0.2067, simple_loss=0.2757, pruned_loss=0.06883, over 4714085.02 frames. ], batch size: 73, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:19:14,005 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 2.070e+02 2.289e+02 2.714e+02 5.548e+02, threshold=4.577e+02, percent-clipped=2.0 2023-09-29 11:19:14,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:14,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:17,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:19:18,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 11:19:18,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 11:19:18,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=348053.3333333333, ans=0.125 2023-09-29 11:19:20,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 11:19:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 11:19:21,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:19:21,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:19:22,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=348053.3333333333, ans=0.125 2023-09-29 11:19:24,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 11:19:26,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:28,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:28,850 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 11:19:30,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:30,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 11:19:30,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 11:19:35,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 11:19:35,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 11:19:37,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 11:19:37,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:38,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:19:39,066 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:19:40,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:19:42,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 11:19:42,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 11:19:43,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:46,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:19:46,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:19:48,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:48,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:19:48,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 11:19:48,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 11:19:50,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=348186.6666666667, ans=0.1 2023-09-29 11:19:51,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:19:58,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:20:01,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 11:20:04,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:20:06,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:06,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=348253.3333333333, ans=0.0 2023-09-29 11:20:10,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:20:10,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 11:20:10,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:20:10,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:20:10,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:20:11,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:20:16,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 11:20:17,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=348320.0, ans=0.125 2023-09-29 11:20:18,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 11:20:19,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 11:20:19,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:19,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 11:20:21,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:20:23,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=348320.0, ans=0.125 2023-09-29 11:20:24,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=348320.0, ans=0.125 2023-09-29 11:20:25,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:20:28,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 11:20:32,549 INFO [train.py:1039] (3/4) Epoch 10, batch 4450, loss[loss=0.2325, simple_loss=0.3026, pruned_loss=0.08118, over 23409.00 frames. ], tot_loss[loss=0.2073, simple_loss=0.2766, pruned_loss=0.06899, over 4705907.22 frames. ], batch size: 93, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:20:32,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:20:35,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:35,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.73 vs. limit=6.0 2023-09-29 11:20:36,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:20:44,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:20:44,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:20:49,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:49,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=348453.3333333333, ans=0.125 2023-09-29 11:20:52,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:20:55,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:20:55,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:20:56,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 11:20:57,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:20:59,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:20:59,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:20:59,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:21:00,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=348453.3333333333, ans=0.125 2023-09-29 11:21:02,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:21:03,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=348520.0, ans=0.0 2023-09-29 11:21:05,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=348520.0, ans=0.2 2023-09-29 11:21:06,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:06,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:09,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:21:09,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:21:09,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=348520.0, ans=0.1 2023-09-29 11:21:11,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:21:15,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:21:17,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 11:21:17,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 11:21:17,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:21:22,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:23,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 11:21:26,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:21:32,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:33,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 11:21:33,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:33,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:21:33,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:21:35,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:21:38,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:21:39,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 11:21:41,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:21:43,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:21:43,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:21:47,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:21:47,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:21:48,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:21:51,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 11:21:53,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:21:54,283 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=21.27 vs. limit=22.5 2023-09-29 11:21:54,828 INFO [train.py:1039] (3/4) Epoch 10, batch 4500, loss[loss=0.2035, simple_loss=0.2583, pruned_loss=0.07431, over 23843.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.2768, pruned_loss=0.06847, over 4717400.31 frames. ], batch size: 195, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:21:58,628 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.956e+02 2.459e+02 2.945e+02 4.663e+02, threshold=4.917e+02, percent-clipped=1.0 2023-09-29 11:21:58,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:00,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 11:22:00,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 11:22:01,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:08,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:22:08,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:22:09,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:22:10,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:22:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:11,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:22,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:22:23,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:22:25,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:27,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:22:28,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:22:35,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:22:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:22:44,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=348920.0, ans=0.09899494936611666 2023-09-29 11:22:45,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:22:49,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:22:50,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 11:22:52,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:52,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:53,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:22:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:22:57,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:22:57,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 11:22:57,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:22:57,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:22:58,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.56 vs. limit=12.0 2023-09-29 11:23:02,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=348986.6666666667, ans=0.2 2023-09-29 11:23:03,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:23:03,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:23:07,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:10,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:23:10,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:23:12,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 11:23:12,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 11:23:12,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 11:23:17,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 11:23:18,494 INFO [train.py:1039] (3/4) Epoch 10, batch 4550, loss[loss=0.1987, simple_loss=0.2808, pruned_loss=0.05831, over 24307.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2759, pruned_loss=0.06794, over 4730887.25 frames. ], batch size: 74, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:23:20,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 11:23:20,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:23,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:25,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:23:28,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:23:33,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.03 vs. limit=15.0 2023-09-29 11:23:35,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:23:38,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:23:38,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:23:38,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:23:41,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:23:42,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.61 vs. limit=15.0 2023-09-29 11:23:43,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:23:45,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:23:49,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 11:23:50,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 11:23:52,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:23:53,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 11:23:56,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 11:23:59,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:02,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 11:24:05,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:24:08,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:08,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:24:10,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 11:24:13,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:15,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:15,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:24:16,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:16,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 11:24:17,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 11:24:17,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=349253.3333333333, ans=0.125 2023-09-29 11:24:19,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:24:19,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 11:24:20,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 11:24:22,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:24:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:23,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:25,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:25,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:24:26,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:24:28,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 11:24:30,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:24:30,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:24:32,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 11:24:32,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:24:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 11:24:32,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=349320.0, ans=0.0 2023-09-29 11:24:35,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:24:35,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:24:37,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:24:37,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=349320.0, ans=0.125 2023-09-29 11:24:38,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:24:38,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:24:41,251 INFO [train.py:1039] (3/4) Epoch 10, batch 4600, loss[loss=0.2069, simple_loss=0.2808, pruned_loss=0.06644, over 24638.00 frames. ], tot_loss[loss=0.2055, simple_loss=0.275, pruned_loss=0.06805, over 4724881.10 frames. ], batch size: 65, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:24:42,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:24:44,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:24:46,324 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.915e+02 2.143e+02 2.405e+02 4.065e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-29 11:24:48,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:24:48,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:24:50,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=349386.6666666667, ans=0.1 2023-09-29 11:24:51,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:24:51,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:24:51,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:24:53,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 11:24:55,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:24:56,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=349453.3333333333, ans=0.125 2023-09-29 11:24:59,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:24:59,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:01,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:01,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=349453.3333333333, ans=0.2 2023-09-29 11:25:09,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 11:25:12,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:15,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:18,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:25:18,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:21,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.65 vs. limit=15.0 2023-09-29 11:25:24,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 11:25:24,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:25:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:25:29,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:29,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:25:31,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:25:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 11:25:38,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:25:39,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=349586.6666666667, ans=0.1 2023-09-29 11:25:42,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:45,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:25:48,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:48,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 11:25:48,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:25:48,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 11:25:50,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:50,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:51,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:25:52,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:25:53,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:25:53,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 11:25:55,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 11:25:55,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 11:25:55,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:56,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:25:58,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:25:58,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:26:03,463 INFO [train.py:1039] (3/4) Epoch 10, batch 4650, loss[loss=0.2291, simple_loss=0.289, pruned_loss=0.08458, over 23793.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2752, pruned_loss=0.06759, over 4733352.32 frames. ], batch size: 164, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:26:03,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=349720.0, ans=0.07 2023-09-29 11:26:09,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:26:12,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:12,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:26:14,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:26:14,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:14,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:26:18,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 11:26:20,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=349786.6666666667, ans=0.95 2023-09-29 11:26:21,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:26:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 11:26:24,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:26:24,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 11:26:24,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:26:26,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 11:26:26,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 11:26:26,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:27,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:26:28,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=349786.6666666667, ans=0.1 2023-09-29 11:26:31,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:26:32,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:33,014 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 11:26:35,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:26:36,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 11:26:39,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:40,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:26:41,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 11:26:42,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:26:45,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:26:49,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=349853.3333333333, ans=0.05 2023-09-29 11:26:52,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:26:55,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:26:59,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:00,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:02,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:27:02,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 11:27:03,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 11:27:03,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 11:27:03,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 11:27:04,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=349920.0, ans=0.125 2023-09-29 11:27:06,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:12,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:27:12,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:14,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 11:27:14,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:16,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:16,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:27:17,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:27:20,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:27:20,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:27:22,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:27:25,767 INFO [train.py:1039] (3/4) Epoch 10, batch 4700, loss[loss=0.1695, simple_loss=0.2463, pruned_loss=0.0464, over 24292.00 frames. ], tot_loss[loss=0.2053, simple_loss=0.2748, pruned_loss=0.06789, over 4719645.86 frames. ], batch size: 56, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:27:25,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:26,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:27:26,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:27:27,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 11:27:27,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:27:27,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=350053.3333333333, ans=0.0 2023-09-29 11:27:29,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 11:27:30,693 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.991e+02 2.178e+02 2.659e+02 4.780e+02, threshold=4.356e+02, percent-clipped=1.0 2023-09-29 11:27:39,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:39,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=350053.3333333333, ans=0.125 2023-09-29 11:27:40,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:27:40,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:27:43,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:27:44,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 11:27:48,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=350120.0, ans=0.2 2023-09-29 11:27:49,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 11:27:49,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 11:27:52,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:27:54,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:27:54,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:27:58,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:06,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:28:09,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 11:28:10,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:28:11,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=350186.6666666667, ans=15.0 2023-09-29 11:28:17,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 11:28:18,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:28:20,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:24,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 11:28:27,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:28:32,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:28:33,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 11:28:33,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:33,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:28:36,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:28:36,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 11:28:38,974 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 11:28:39,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:28:40,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:40,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 11:28:42,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:28:43,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=350320.0, ans=0.125 2023-09-29 11:28:46,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=350320.0, ans=0.1 2023-09-29 11:28:47,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 11:28:49,342 INFO [train.py:1039] (3/4) Epoch 10, batch 4750, loss[loss=0.2216, simple_loss=0.2995, pruned_loss=0.07181, over 23925.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2756, pruned_loss=0.06804, over 4726011.87 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:28:51,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:28:52,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:28:56,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:28:59,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 11:28:59,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 11:29:05,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:29:05,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:29:05,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:09,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.43 vs. limit=15.0 2023-09-29 11:29:12,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 11:29:18,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:29:18,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_positive, batch_count=350453.3333333333, ans=0.05 2023-09-29 11:29:19,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 11:29:19,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:29:22,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:22,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:29:24,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:25,122 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 11:29:25,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 11:29:30,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 11:29:33,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=350520.0, ans=0.125 2023-09-29 11:29:34,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:36,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:29:36,808 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.87 vs. limit=10.0 2023-09-29 11:29:39,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:29:39,151 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 11:29:39,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:29:42,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:29:42,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=350586.6666666667, ans=0.1 2023-09-29 11:29:45,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:29:47,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 11:29:47,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 11:29:47,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:29:48,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:29:48,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:29:50,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:29:50,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 11:29:55,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 11:29:57,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:01,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:30:01,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 11:30:02,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:04,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:04,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:30:06,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:06,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 11:30:09,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:09,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 11:30:09,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 11:30:11,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 11:30:12,439 INFO [train.py:1039] (3/4) Epoch 10, batch 4800, loss[loss=0.231, simple_loss=0.2915, pruned_loss=0.08528, over 23968.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2755, pruned_loss=0.06786, over 4732380.47 frames. ], batch size: 196, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:30:15,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:30:15,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:30:16,715 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.911e+02 2.173e+02 2.490e+02 3.366e+02, threshold=4.345e+02, percent-clipped=0.0 2023-09-29 11:30:16,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 11:30:22,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:23,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:28,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:30:28,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:30,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:30,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 11:30:30,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=350786.6666666667, ans=0.125 2023-09-29 11:30:31,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:30:31,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:30:33,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=350786.6666666667, ans=0.02 2023-09-29 11:30:35,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:30:39,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:30:40,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:42,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:30:43,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:43,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 11:30:44,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:45,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:30:49,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:30:51,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:52,436 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.40 vs. limit=15.0 2023-09-29 11:30:53,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:30:53,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:30:55,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 11:30:56,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:58,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 11:30:58,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 11:30:59,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:30:59,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:31:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:31:00,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:02,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:31:02,384 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:31:05,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:31:05,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:31:09,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:11,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:13,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:18,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 11:31:20,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:21,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:21,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:31:23,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:26,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:31:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:31:28,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:28,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:31:29,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:31:30,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:31:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:33,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:31:34,449 INFO [train.py:1039] (3/4) Epoch 10, batch 4850, loss[loss=0.2075, simple_loss=0.2732, pruned_loss=0.07094, over 23669.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.276, pruned_loss=0.06825, over 4732149.46 frames. ], batch size: 149, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:31:34,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 11:31:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 11:31:37,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:31:37,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:31:37,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:31:41,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:31:44,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=351053.3333333333, ans=0.125 2023-09-29 11:31:51,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 11:31:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:31:56,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:31:57,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:31:57,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:01,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:32:02,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:32:04,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:32:04,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 11:32:07,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:32:10,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:32:11,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 11:32:11,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:32:11,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 11:32:14,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:32:15,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:18,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 11:32:21,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 11:32:22,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:32:29,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:32:30,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 11:32:32,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:32:32,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:32:32,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=351253.3333333333, ans=0.125 2023-09-29 11:32:34,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:32:36,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 11:32:36,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 11:32:36,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=351253.3333333333, ans=0.125 2023-09-29 11:32:37,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:32:37,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:32:39,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 11:32:44,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=351320.0, ans=0.09899494936611666 2023-09-29 11:32:49,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:32:54,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:32:54,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:32:57,372 INFO [train.py:1039] (3/4) Epoch 10, batch 4900, loss[loss=0.1992, simple_loss=0.2798, pruned_loss=0.05932, over 24394.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2756, pruned_loss=0.06833, over 4728908.92 frames. ], batch size: 77, lr: 1.02e-02, grad_scale: 16.0 2023-09-29 11:33:00,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 11:33:00,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:33:01,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=351386.6666666667, ans=0.0 2023-09-29 11:33:02,124 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.635e+02 2.045e+02 2.293e+02 2.550e+02 3.770e+02, threshold=4.586e+02, percent-clipped=0.0 2023-09-29 11:33:06,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:08,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:09,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:33:12,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 11:33:17,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 11:33:22,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 11:33:23,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 11:33:23,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:23,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:33:25,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:33:25,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:25,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:33:25,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 11:33:30,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 11:33:30,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:33:32,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:33:34,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:33:35,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:33:35,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:37,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:37,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 11:33:38,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:33:42,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:33:42,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 11:33:42,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 11:33:45,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 11:33:47,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:33:50,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:33:50,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:33:51,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=351586.6666666667, ans=0.2 2023-09-29 11:33:52,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:33:52,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 11:33:52,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:33:52,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 11:33:55,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:33:56,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:33:58,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:34:01,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 11:34:03,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:34:03,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 11:34:04,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 11:34:10,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=351653.3333333333, ans=0.5 2023-09-29 11:34:11,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:13,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:14,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 11:34:15,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:15,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:34:17,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:17,943 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.47 vs. limit=22.5 2023-09-29 11:34:18,036 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-09-29 11:34:20,045 INFO [train.py:1039] (3/4) Epoch 10, batch 4950, loss[loss=0.1993, simple_loss=0.2582, pruned_loss=0.07019, over 23689.00 frames. ], tot_loss[loss=0.206, simple_loss=0.2747, pruned_loss=0.06865, over 4719413.71 frames. ], batch size: 232, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:34:20,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:20,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:34:22,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:34:22,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 11:34:23,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:34:24,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=351720.0, ans=0.125 2023-09-29 11:34:25,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:25,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 11:34:28,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 11:34:28,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=351720.0, ans=0.1 2023-09-29 11:34:30,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 11:34:30,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:34:31,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 11:34:31,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:31,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:34:31,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:34:33,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:34,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:34:36,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:34:37,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:34:37,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:34:39,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:41,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:34:43,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=351786.6666666667, ans=0.07 2023-09-29 11:34:44,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:34:48,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:49,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:34:49,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=351786.6666666667, ans=0.125 2023-09-29 11:34:52,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:34:53,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:34:54,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:34:56,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 11:34:57,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 11:34:59,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:00,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:35:00,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:35:03,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:03,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:35:05,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 11:35:08,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:10,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:35:11,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:35:13,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:13,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:15,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 11:35:15,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:35:17,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:35:21,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:35:23,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:35:23,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:35:23,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:24,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:35:26,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:35:29,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:35:29,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:35:29,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=351986.6666666667, ans=0.125 2023-09-29 11:35:30,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:35:31,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 11:35:31,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=351986.6666666667, ans=0.125 2023-09-29 11:35:36,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:35:38,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=351986.6666666667, ans=0.05 2023-09-29 11:35:41,005 INFO [train.py:1039] (3/4) Epoch 10, batch 5000, loss[loss=0.209, simple_loss=0.2667, pruned_loss=0.07562, over 23750.00 frames. ], tot_loss[loss=0.2058, simple_loss=0.2747, pruned_loss=0.06845, over 4723902.41 frames. ], batch size: 164, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:35:41,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 11:35:41,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:35:46,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:35:47,806 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 2.017e+02 2.302e+02 2.737e+02 4.823e+02, threshold=4.603e+02, percent-clipped=1.0 2023-09-29 11:35:47,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:35:49,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 11:35:51,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 11:35:53,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:35:54,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 11:35:54,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:35:54,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:35:55,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=352053.3333333333, ans=0.125 2023-09-29 11:35:55,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-09-29 11:35:56,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 11:35:57,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:35:59,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:00,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=352120.0, ans=0.09899494936611666 2023-09-29 11:36:01,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 11:36:01,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:01,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:02,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.47 vs. limit=15.0 2023-09-29 11:36:02,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.39 vs. limit=12.0 2023-09-29 11:36:02,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 11:36:02,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 11:36:03,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:36:03,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 11:36:03,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:36:03,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:04,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:36:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 11:36:04,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 11:36:06,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 11:36:06,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:07,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:08,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=352120.0, ans=0.5 2023-09-29 11:36:09,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 11:36:09,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:36:11,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:12,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:36:12,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:36:14,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 11:36:15,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:36:15,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:36:21,608 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 11:36:24,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:36:26,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:36:26,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:32,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 11:36:32,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:36:32,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:36:32,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:36:34,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 11:36:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:37,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:36:38,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:36:44,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 11:36:49,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:36:58,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:00,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:00,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:37:00,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:01,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:37:01,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:37:02,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=352386.6666666667, ans=0.125 2023-09-29 11:37:03,144 INFO [train.py:1039] (3/4) Epoch 10, batch 5050, loss[loss=0.2157, simple_loss=0.2909, pruned_loss=0.07028, over 24029.00 frames. ], tot_loss[loss=0.2057, simple_loss=0.2747, pruned_loss=0.06839, over 4722630.27 frames. ], batch size: 80, lr: 1.02e-02, grad_scale: 8.0 2023-09-29 11:37:03,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:07,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=352386.6666666667, ans=0.125 2023-09-29 11:37:08,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:37:08,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 11:37:10,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:37:13,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:37:16,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:37:16,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 11:37:17,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:17,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:37:20,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:37:22,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:37:22,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:37:34,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 11:37:36,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:37:36,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:37:36,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 11:37:36,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:37:39,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:39,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:37:41,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:37:41,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 11:37:42,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 11:37:44,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:44,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=352520.0, ans=0.125 2023-09-29 11:37:46,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:37:49,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:37:51,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 11:37:52,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:37:55,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 11:37:57,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:37:57,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:37:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:37:59,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:38:00,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:02,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:38:03,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:03,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:38:03,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:38:05,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 11:38:06,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=352586.6666666667, ans=0.125 2023-09-29 11:38:07,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:38:09,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:38:12,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:38:12,600 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 11:38:12,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:38:14,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:14,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:14,296 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 11:38:17,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:17,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 11:38:17,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:20,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=352653.3333333333, ans=0.0 2023-09-29 11:38:21,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:23,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:38:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 11:38:24,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 11:38:26,301 INFO [train.py:1039] (3/4) Epoch 10, batch 5100, loss[loss=0.2149, simple_loss=0.2791, pruned_loss=0.07534, over 23693.00 frames. ], tot_loss[loss=0.2062, simple_loss=0.2754, pruned_loss=0.06849, over 4717684.14 frames. ], batch size: 149, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:38:26,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:27,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:38:28,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:38:31,003 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 11:38:32,348 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.632e+02 1.939e+02 2.293e+02 2.682e+02 4.893e+02, threshold=4.586e+02, percent-clipped=1.0 2023-09-29 11:38:33,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:38:37,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 11:38:39,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 11:38:41,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:42,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:38:44,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:38:44,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 11:38:44,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 11:38:49,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=352786.6666666667, ans=0.125 2023-09-29 11:38:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:38:50,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:38:57,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:38:59,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 11:38:59,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:01,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:39:02,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 11:39:05,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:06,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 11:39:07,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 11:39:09,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:09,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 11:39:09,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 11:39:13,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=352853.3333333333, ans=0.025 2023-09-29 11:39:15,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:39:21,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:21,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=352920.0, ans=0.04949747468305833 2023-09-29 11:39:22,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 11:39:22,999 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 11:39:23,011 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 11:39:24,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 11:39:24,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:39:26,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=352920.0, ans=0.1 2023-09-29 11:39:29,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 11:39:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 11:39:36,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=352986.6666666667, ans=0.0 2023-09-29 11:39:37,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 11:39:39,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:39:41,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 11:39:44,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:39:44,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 11:39:49,569 INFO [train.py:1039] (3/4) Epoch 10, batch 5150, loss[loss=0.1935, simple_loss=0.274, pruned_loss=0.05652, over 24414.00 frames. ], tot_loss[loss=0.2069, simple_loss=0.276, pruned_loss=0.06887, over 4711610.32 frames. ], batch size: 69, lr: 1.01e-02, grad_scale: 8.0 2023-09-29 11:39:49,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:39:51,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:39:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:39:51,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:39:51,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:39:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:39:52,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 11:39:52,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 11:39:54,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 11:39:54,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:39:54,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 11:39:54,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=353053.3333333333, ans=0.125 2023-09-29 11:39:55,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:39:55,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:39:57,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:39:57,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=353053.3333333333, ans=0.1 2023-09-29 11:39:58,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:05,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:40:06,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 11:40:07,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:07,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:40:07,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 11:40:07,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:09,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:09,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:40:09,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:40:10,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 11:40:11,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:40:12,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:12,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:40:12,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=353120.0, ans=10.0 2023-09-29 11:40:14,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=353120.0, ans=0.125 2023-09-29 11:40:14,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=353120.0, ans=0.125 2023-09-29 11:40:15,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 11:40:17,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:40:24,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:40:24,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 11:40:26,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=353186.6666666667, ans=0.2 2023-09-29 11:40:28,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:40:34,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:40:35,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:40:39,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:41,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:44,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 11:40:49,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:40:51,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:40:51,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 11:40:54,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:40:55,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:40:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 11:41:00,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:02,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:41:04,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:41:04,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:41:05,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:41:05,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:41:05,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:41:05,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:41:09,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:41:11,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:41:13,144 INFO [train.py:1039] (3/4) Epoch 10, batch 5200, loss[loss=0.2726, simple_loss=0.309, pruned_loss=0.1181, over 19706.00 frames. ], tot_loss[loss=0.2074, simple_loss=0.2767, pruned_loss=0.06902, over 4714859.60 frames. ], batch size: 388, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:41:14,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:16,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=353386.6666666667, ans=0.125 2023-09-29 11:41:19,172 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.032e+02 2.395e+02 2.917e+02 4.034e+02, threshold=4.790e+02, percent-clipped=0.0 2023-09-29 11:41:19,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 11:41:19,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:41:21,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:21,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=353386.6666666667, ans=0.1 2023-09-29 11:41:24,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:26,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:41:26,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:27,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 11:41:30,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:41:31,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=353453.3333333333, ans=0.0 2023-09-29 11:41:32,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:35,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 11:41:38,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:41:39,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:41:41,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 11:41:41,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 11:41:44,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 11:41:46,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:41:46,376 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 11:41:46,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:41:47,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:41:48,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:41:48,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 11:41:49,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:41:53,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:41:56,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 11:41:56,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 11:41:56,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 11:42:03,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 11:42:04,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:42:09,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:42:09,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:10,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 11:42:11,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:42:12,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 11:42:12,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:12,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:14,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=353586.6666666667, ans=0.5 2023-09-29 11:42:15,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:17,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:42:20,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:42:22,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:22,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:24,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=353653.3333333333, ans=0.0 2023-09-29 11:42:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 11:42:32,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:42:32,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:42:34,460 INFO [train.py:1039] (3/4) Epoch 10, batch 5250, loss[loss=0.179, simple_loss=0.2362, pruned_loss=0.06096, over 23776.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2756, pruned_loss=0.06826, over 4732919.53 frames. ], batch size: 212, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:42:34,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:42:36,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:42:36,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:42:39,806 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.25 vs. limit=15.0 2023-09-29 11:42:40,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:42:43,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:45,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:42:45,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:42:47,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=353720.0, ans=0.125 2023-09-29 11:42:50,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:42:51,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:42:56,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:42:58,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:42:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 11:42:58,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:42:59,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:43:00,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=353786.6666666667, ans=0.1 2023-09-29 11:43:21,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.23 vs. limit=12.0 2023-09-29 11:43:48,680 INFO [train.py:1039] (3/4) Epoch 10, batch 5300, loss[loss=0.1924, simple_loss=0.2271, pruned_loss=0.0788, over 19404.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.274, pruned_loss=0.06792, over 4702669.50 frames. ], batch size: 388, lr: 1.01e-02, grad_scale: 16.0 2023-09-29 11:43:54,371 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.989e+02 2.153e+02 2.436e+02 4.114e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 11:44:05,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:44:05,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 11:44:05,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 11:44:05,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:06,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:06,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:06,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:06,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:06,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:06,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:06,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:44:06,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:44:07,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 11:44:07,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 11:44:07,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 11:44:07,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 11:44:07,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 11:44:07,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 11:44:07,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:08,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:08,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:08,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:08,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:44:09,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:09,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:44:09,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:09,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:44:09,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:44:09,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:44:09,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:09,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:44:10,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 11:44:10,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:44:11,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:44:11,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 11:44:11,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 11:44:11,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:44:11,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:11,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 11:44:11,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 11:44:12,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:12,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:44:13,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:44:13,603 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 11:44:13,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 11:44:13,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:44:13,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:44:14,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 11:44:14,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 11:44:14,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 11:44:14,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:44:22,142 INFO [train.py:1039] (3/4) Epoch 11, batch 0, loss[loss=0.2113, simple_loss=0.2856, pruned_loss=0.06847, over 24069.00 frames. ], tot_loss[loss=0.2113, simple_loss=0.2856, pruned_loss=0.06847, over 24069.00 frames. ], batch size: 80, lr: 9.67e-03, grad_scale: 32.0 2023-09-29 11:44:22,142 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 11:44:36,240 INFO [train.py:1071] (3/4) Epoch 11, validation: loss=0.3103, simple_loss=0.2886, pruned_loss=0.166, over 1125622.00 frames. 2023-09-29 11:44:36,241 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 11:44:39,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 11:44:39,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:44:41,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.16 vs. limit=15.0 2023-09-29 11:44:42,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:44:48,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:48,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:44:48,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:48,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 11:44:50,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 11:44:53,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:54,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:44:57,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:44:59,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:44:59,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:00,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 11:45:02,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:45:11,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:45:11,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:12,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=354273.3333333333, ans=0.2 2023-09-29 11:45:13,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 11:45:18,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:45:18,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:45:20,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:26,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:45:33,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:45:35,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=354340.0, ans=0.0 2023-09-29 11:45:36,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 11:45:39,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 11:45:41,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:45:41,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:41,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=354406.6666666667, ans=0.05 2023-09-29 11:45:42,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:45:44,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:45:45,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 11:45:49,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:45:54,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:45:57,373 INFO [train.py:1039] (3/4) Epoch 11, batch 50, loss[loss=0.1805, simple_loss=0.2595, pruned_loss=0.05074, over 24586.00 frames. ], tot_loss[loss=0.2041, simple_loss=0.2757, pruned_loss=0.06622, over 1071613.72 frames. ], batch size: 60, lr: 9.67e-03, grad_scale: 16.0 2023-09-29 11:45:57,488 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 11:46:00,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:46:02,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:05,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:05,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=354473.3333333333, ans=0.125 2023-09-29 11:46:06,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 11:46:07,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 11:46:08,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:46:09,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:11,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:46:14,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:46:17,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 11:46:17,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:24,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:46:28,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 11:46:30,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 11:46:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:46:33,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:46:33,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:46:35,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:46:35,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 11:46:35,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:46:43,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:46:45,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:46:45,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:46:46,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 11:46:47,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=354673.3333333333, ans=0.1 2023-09-29 11:46:48,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 11:46:49,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:46:49,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 11:46:50,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:46:51,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 11:46:55,337 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.88 vs. limit=15.0 2023-09-29 11:47:01,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:01,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:47:03,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:04,549 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.921e+02 2.105e+02 2.466e+02 3.711e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-29 11:47:04,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:04,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:05,035 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=354740.0, ans=0.1 2023-09-29 11:47:07,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 11:47:07,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 11:47:09,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:09,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:47:10,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.76 vs. limit=10.0 2023-09-29 11:47:11,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:47:12,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:47:12,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 11:47:14,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 11:47:14,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 11:47:15,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:17,317 INFO [train.py:1039] (3/4) Epoch 11, batch 100, loss[loss=0.2156, simple_loss=0.2986, pruned_loss=0.06626, over 23988.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2739, pruned_loss=0.06592, over 1885077.22 frames. ], batch size: 80, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:47:17,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:47:18,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 11:47:18,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 11:47:19,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:20,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:22,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:47:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:47:26,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:47:28,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:47:32,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:34,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 11:47:34,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:47:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 11:47:37,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:38,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:47:38,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:47:38,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:47:40,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 11:47:42,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:47:42,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:42,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:42,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:47:47,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 11:47:47,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:47:49,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:47:49,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:47:49,841 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:47:49,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=354940.0, ans=0.05 2023-09-29 11:47:51,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:47:54,446 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 11:47:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 11:47:56,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:47:56,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:48:00,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 11:48:03,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:48:05,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:09,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:11,200 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 11:48:12,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 11:48:17,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:17,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:48:19,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:23,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=355073.3333333333, ans=0.0 2023-09-29 11:48:25,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:27,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:48:30,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:31,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:33,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:33,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:48:33,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:48:33,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=355073.3333333333, ans=0.0 2023-09-29 11:48:34,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 11:48:34,652 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 11:48:34,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:36,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:48:37,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:37,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:37,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 11:48:37,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:48:37,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 11:48:37,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:39,665 INFO [train.py:1039] (3/4) Epoch 11, batch 150, loss[loss=0.1985, simple_loss=0.2752, pruned_loss=0.0609, over 24703.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2734, pruned_loss=0.0661, over 2528286.58 frames. ], batch size: 73, lr: 9.66e-03, grad_scale: 16.0 2023-09-29 11:48:39,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:48:41,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:43,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:48:43,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:48:45,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:48:48,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:48:48,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:48:48,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:52,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:48:53,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:48:58,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:48:59,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:02,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 11:49:02,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 11:49:02,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 11:49:07,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:49:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:49:07,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:49:08,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:49:08,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:10,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:10,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:49:11,831 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 11:49:13,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:49:20,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:25,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 11:49:26,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 11:49:28,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=355340.0, ans=0.125 2023-09-29 11:49:29,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:49:29,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:49:29,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:33,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:49:35,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:49:35,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=355340.0, ans=0.2 2023-09-29 11:49:36,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:49:36,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:36,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 11:49:38,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.54 vs. limit=15.0 2023-09-29 11:49:41,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:41,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=355340.0, ans=0.125 2023-09-29 11:49:42,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:49:42,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:49:42,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:49:44,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:49:44,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=355406.6666666667, ans=0.125 2023-09-29 11:49:46,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 11:49:47,845 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.912e+02 2.159e+02 2.654e+02 4.388e+02, threshold=4.317e+02, percent-clipped=1.0 2023-09-29 11:49:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 11:49:50,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:49:54,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:49:55,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:49:55,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 11:49:57,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:49:57,091 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 11:50:01,515 INFO [train.py:1039] (3/4) Epoch 11, batch 200, loss[loss=0.207, simple_loss=0.2886, pruned_loss=0.06268, over 24441.00 frames. ], tot_loss[loss=0.2037, simple_loss=0.2749, pruned_loss=0.06627, over 3028227.01 frames. ], batch size: 69, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:50:01,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:03,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=355473.3333333333, ans=0.2 2023-09-29 11:50:05,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:50:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:50:08,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 11:50:09,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:09,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:13,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 11:50:14,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 11:50:16,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:17,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:20,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:50:22,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:50:22,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:43,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:50:43,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:50:44,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 11:50:44,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:50:46,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 11:50:46,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:50:47,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:50:48,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=355606.6666666667, ans=0.025 2023-09-29 11:50:49,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:50:50,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:50:50,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:50:52,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 11:50:53,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 11:50:53,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:50:58,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=355673.3333333333, ans=0.1 2023-09-29 11:51:00,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:51:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:51:09,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=355740.0, ans=0.07 2023-09-29 11:51:11,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=355740.0, ans=0.125 2023-09-29 11:51:12,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:14,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:51:22,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:23,777 INFO [train.py:1039] (3/4) Epoch 11, batch 250, loss[loss=0.1765, simple_loss=0.2449, pruned_loss=0.05408, over 24449.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2744, pruned_loss=0.06605, over 3411795.18 frames. ], batch size: 58, lr: 9.65e-03, grad_scale: 16.0 2023-09-29 11:51:25,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 11:51:25,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:25,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:51:25,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:26,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 11:51:28,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 11:51:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:51:30,019 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 11:51:31,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:33,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:51:33,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:35,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:51:37,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:51:37,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:51:40,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:51:43,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=355873.3333333333, ans=0.1 2023-09-29 11:51:44,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:51:53,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:51:57,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:51:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:52:03,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 11:52:05,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 11:52:05,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=355940.0, ans=0.125 2023-09-29 11:52:07,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:52:07,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:07,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 11:52:07,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:52:07,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:52:10,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:52:12,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 11:52:12,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:52:12,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=356006.6666666667, ans=0.1 2023-09-29 11:52:15,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:52:15,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 11:52:15,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:52:16,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=356006.6666666667, ans=0.125 2023-09-29 11:52:17,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:19,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:52:19,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 11:52:20,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 11:52:23,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:27,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 11:52:30,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:32,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:52:33,628 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.944e+02 2.181e+02 2.498e+02 3.489e+02, threshold=4.363e+02, percent-clipped=0.0 2023-09-29 11:52:38,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:52:41,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:52:45,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 11:52:46,484 INFO [train.py:1039] (3/4) Epoch 11, batch 300, loss[loss=0.1831, simple_loss=0.2558, pruned_loss=0.05521, over 24546.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2725, pruned_loss=0.0657, over 3693613.19 frames. ], batch size: 60, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:52:46,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:52:46,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 11:52:48,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 11:52:48,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 11:52:49,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:52:49,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 11:52:53,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:52:55,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:52:55,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=356140.0, ans=0.2 2023-09-29 11:52:55,758 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.31 vs. limit=15.0 2023-09-29 11:53:00,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 11:53:00,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 11:53:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:53:03,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 11:53:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 11:53:03,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:08,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 11:53:11,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:53:11,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 11:53:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 11:53:15,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:18,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:53:20,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:20,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 11:53:20,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 11:53:23,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:53:25,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:53:26,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:53:33,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 11:53:33,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 11:53:33,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:53:33,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.15 vs. limit=15.0 2023-09-29 11:53:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:37,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 11:53:39,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:53:42,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:53:47,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:53:47,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 11:53:51,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:51,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 11:53:54,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:53:56,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:53:56,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 11:53:57,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 11:53:59,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:00,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 11:54:02,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:54:03,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:05,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:05,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:06,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:09,945 INFO [train.py:1039] (3/4) Epoch 11, batch 350, loss[loss=0.1897, simple_loss=0.2618, pruned_loss=0.05882, over 23158.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2719, pruned_loss=0.065, over 3921591.88 frames. ], batch size: 105, lr: 9.64e-03, grad_scale: 16.0 2023-09-29 11:54:11,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:11,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 11:54:14,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:19,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:54:21,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:22,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:27,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 11:54:29,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:54:29,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 11:54:33,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:33,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 11:54:35,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:37,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 11:54:39,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:54:41,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 11:54:42,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:54:44,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:54:44,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:54:44,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:54:46,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 11:54:47,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:54:47,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:54:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:54:57,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 11:54:57,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:54:57,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:02,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 11:55:02,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:55:08,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:08,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:08,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:55:10,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 11:55:12,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:12,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=356673.3333333333, ans=0.125 2023-09-29 11:55:14,074 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 11:55:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 11:55:16,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:18,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=356740.0, ans=0.0 2023-09-29 11:55:19,164 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.719e+02 1.993e+02 2.217e+02 2.521e+02 3.405e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 11:55:19,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 11:55:19,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 11:55:22,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:25,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 11:55:25,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:25,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=356740.0, ans=0.04949747468305833 2023-09-29 11:55:26,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:27,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:30,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:55:32,060 INFO [train.py:1039] (3/4) Epoch 11, batch 400, loss[loss=0.2053, simple_loss=0.2881, pruned_loss=0.06124, over 24673.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.2708, pruned_loss=0.06509, over 4102052.94 frames. ], batch size: 73, lr: 9.64e-03, grad_scale: 32.0 2023-09-29 11:55:33,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:55:37,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 11:55:38,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 11:55:38,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:55:38,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:40,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=356806.6666666667, ans=0.125 2023-09-29 11:55:41,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:55:42,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:45,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:55:47,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:49,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 11:55:51,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 11:55:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:55:51,413 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 11:55:52,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 11:55:54,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:56,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-09-29 11:55:57,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:55:57,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:57,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 11:55:57,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:55:58,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:55:58,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:55:58,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=356873.3333333333, ans=0.1 2023-09-29 11:56:00,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:56:01,803 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 11:56:01,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 11:56:02,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=356873.3333333333, ans=0.125 2023-09-29 11:56:07,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:56:10,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:10,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 11:56:11,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 11:56:13,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 11:56:15,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:24,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 11:56:27,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 11:56:28,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 11:56:30,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:56:31,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 11:56:31,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 11:56:36,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 11:56:41,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 11:56:42,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:56:45,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:56:45,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=357073.3333333333, ans=0.125 2023-09-29 11:56:46,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 11:56:46,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=357073.3333333333, ans=0.0 2023-09-29 11:56:48,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 11:56:49,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 11:56:52,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 11:56:52,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:56:54,757 INFO [train.py:1039] (3/4) Epoch 11, batch 450, loss[loss=0.196, simple_loss=0.2831, pruned_loss=0.0544, over 24440.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2717, pruned_loss=0.06537, over 4243693.02 frames. ], batch size: 69, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:56:54,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 11:56:56,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 11:56:58,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 11:56:59,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:56:59,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 11:57:01,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 11:57:01,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:57:01,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:01,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 11:57:03,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 11:57:03,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 11:57:06,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 11:57:13,907 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.73 vs. limit=15.0 2023-09-29 11:57:16,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:16,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:19,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 11:57:21,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 11:57:24,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 11:57:26,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:28,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:32,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:34,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:57:36,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 11:57:37,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 11:57:40,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 11:57:40,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:57:40,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:57:41,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=357273.3333333333, ans=0.125 2023-09-29 11:57:42,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 11:57:44,214 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 11:57:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 11:57:44,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:57:44,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:57:46,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 11:57:48,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 11:57:49,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 11:57:51,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 11:57:52,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 11:57:54,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:57:57,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 11:57:57,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 11:58:00,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 11:58:04,477 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.885e+02 2.166e+02 2.451e+02 4.204e+02, threshold=4.332e+02, percent-clipped=0.0 2023-09-29 11:58:04,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 11:58:06,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 11:58:07,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 11:58:09,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 11:58:16,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 11:58:17,676 INFO [train.py:1039] (3/4) Epoch 11, batch 500, loss[loss=0.2192, simple_loss=0.2902, pruned_loss=0.07406, over 23761.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2731, pruned_loss=0.06605, over 4357179.78 frames. ], batch size: 85, lr: 9.63e-03, grad_scale: 32.0 2023-09-29 11:58:17,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:19,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 11:58:19,408 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 11:58:24,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:24,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 11:58:26,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:26,156 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 11:58:27,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 11:58:27,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:58:30,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 11:58:32,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 11:58:32,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=357540.0, ans=0.1 2023-09-29 11:58:35,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 11:58:37,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:58:37,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 11:58:39,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:58:49,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:50,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 11:58:50,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 11:58:50,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:52,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 11:58:52,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 11:58:55,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:58:55,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 11:58:55,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=357606.6666666667, ans=0.125 2023-09-29 11:58:57,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 11:58:57,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:58:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 11:59:02,577 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 11:59:06,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:07,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:09,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 11:59:11,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 11:59:12,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.71 vs. limit=22.5 2023-09-29 11:59:15,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 11:59:17,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:20,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:24,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 11:59:31,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:35,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 11:59:35,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:35,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 11:59:38,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 11:59:39,842 INFO [train.py:1039] (3/4) Epoch 11, batch 550, loss[loss=0.2116, simple_loss=0.2768, pruned_loss=0.07317, over 23222.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2748, pruned_loss=0.06697, over 4445813.37 frames. ], batch size: 105, lr: 9.62e-03, grad_scale: 32.0 2023-09-29 11:59:39,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 11:59:41,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:46,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 11:59:47,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=357806.6666666667, ans=0.0 2023-09-29 11:59:48,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 11:59:48,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:50,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 11:59:50,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 11:59:50,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 11:59:52,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 11:59:52,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 11:59:53,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 11:59:55,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=357873.3333333333, ans=0.125 2023-09-29 11:59:56,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 11:59:56,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 11:59:57,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.40 vs. limit=6.0 2023-09-29 11:59:58,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:00:00,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=357873.3333333333, ans=0.125 2023-09-29 12:00:03,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:04,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:06,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:06,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:10,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 12:00:10,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 12:00:13,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:00:16,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:00:16,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:19,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:00:22,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=357940.0, ans=0.2 2023-09-29 12:00:23,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:23,365 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 12:00:23,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:00:25,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:00:25,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=357940.0, ans=0.0 2023-09-29 12:00:28,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:00:28,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:00:28,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:00:30,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:31,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 12:00:33,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 12:00:34,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:34,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:00:34,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:00:34,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:00:38,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:00:39,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.94 vs. limit=10.0 2023-09-29 12:00:40,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:00:43,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:00:43,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:43,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 12:00:46,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:00:48,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:00:49,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:00:49,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:00:50,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=358073.3333333333, ans=0.2 2023-09-29 12:00:51,044 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.720e+02 2.069e+02 2.330e+02 2.802e+02 5.186e+02, threshold=4.661e+02, percent-clipped=1.0 2023-09-29 12:00:53,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:00:53,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:00:59,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 12:01:02,381 INFO [train.py:1039] (3/4) Epoch 11, batch 600, loss[loss=0.1894, simple_loss=0.2681, pruned_loss=0.05538, over 24461.00 frames. ], tot_loss[loss=0.2056, simple_loss=0.2757, pruned_loss=0.0678, over 4504959.66 frames. ], batch size: 66, lr: 9.62e-03, grad_scale: 16.0 2023-09-29 12:01:02,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 12:01:04,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:01:04,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:01:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:14,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:01:14,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:01:16,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=358140.0, ans=0.0 2023-09-29 12:01:17,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 12:01:20,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:01:20,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:22,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 12:01:25,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:01:31,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 12:01:32,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=12.67 vs. limit=22.5 2023-09-29 12:01:34,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:01:34,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:01:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:01:39,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:01:39,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:01:41,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:44,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=358273.3333333333, ans=0.0 2023-09-29 12:01:49,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:01:53,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:01:53,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:01:54,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:02:02,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 12:02:07,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:02:07,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:13,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 12:02:13,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:02:17,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 12:02:17,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:02:17,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:02:24,981 INFO [train.py:1039] (3/4) Epoch 11, batch 650, loss[loss=0.2031, simple_loss=0.2716, pruned_loss=0.06725, over 23609.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2741, pruned_loss=0.06737, over 4535582.23 frames. ], batch size: 134, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:02:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:02:26,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:02:30,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:02:31,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:02:33,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:02:35,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 12:02:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:02:43,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:02:43,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:47,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:02:51,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 12:02:53,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:02:55,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:02:58,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:02:58,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:03:01,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:01,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:03,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:03:03,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:05,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:03:09,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:03:09,495 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 12:03:09,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:09,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:09,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=358606.6666666667, ans=0.1 2023-09-29 12:03:09,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=358606.6666666667, ans=0.125 2023-09-29 12:03:13,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:14,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:14,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:14,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:03:16,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 12:03:18,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:03:18,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:03:18,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:03:18,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:03:21,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:03:23,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 12:03:24,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 12:03:24,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:24,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:03:24,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:03:26,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:03:28,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:03:34,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:34,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:03:35,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:03:37,065 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.924e+02 2.251e+02 2.757e+02 4.294e+02, threshold=4.503e+02, percent-clipped=0.0 2023-09-29 12:03:37,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:37,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:03:37,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:03:42,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=358740.0, ans=0.125 2023-09-29 12:03:45,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.06 vs. limit=15.0 2023-09-29 12:03:46,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:03:46,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:46,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:03:46,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:03:48,380 INFO [train.py:1039] (3/4) Epoch 11, batch 700, loss[loss=0.1926, simple_loss=0.2752, pruned_loss=0.05502, over 24670.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.272, pruned_loss=0.06736, over 4562802.90 frames. ], batch size: 73, lr: 9.61e-03, grad_scale: 16.0 2023-09-29 12:03:48,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=358806.6666666667, ans=0.2 2023-09-29 12:03:52,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 12:03:52,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 12:03:55,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 12:03:56,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:03:58,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:04:00,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 12:04:03,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:09,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:04:10,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:12,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:04:12,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:04:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:04:17,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:04:17,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:04:20,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 12:04:23,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 12:04:27,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:04:28,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:04:30,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:04:35,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:04:35,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 12:04:36,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=358940.0, ans=10.0 2023-09-29 12:04:41,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:41,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:04:42,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 12:04:43,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=359006.6666666667, ans=0.125 2023-09-29 12:04:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:04:47,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:04:51,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:04:57,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:04:57,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 12:04:58,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=359073.3333333333, ans=0.125 2023-09-29 12:04:59,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 12:05:00,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 12:05:05,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:06,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:08,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:10,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:10,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 12:05:12,379 INFO [train.py:1039] (3/4) Epoch 11, batch 750, loss[loss=0.2028, simple_loss=0.2449, pruned_loss=0.0804, over 19089.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2714, pruned_loss=0.06714, over 4591121.25 frames. ], batch size: 388, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:05:12,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=359140.0, ans=0.125 2023-09-29 12:05:15,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 12:05:15,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 12:05:15,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 12:05:17,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 12:05:17,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 12:05:17,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:05:18,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 12:05:20,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:05:20,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:23,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:26,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:05:26,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:05:28,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:05:29,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:05:31,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:05:32,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:05:34,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:05:36,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 12:05:36,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:05:39,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:39,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:05:41,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:05:43,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 12:05:43,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:05:45,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 12:05:45,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 12:05:47,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 12:05:47,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:05:47,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:05:50,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:05:57,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:05:57,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:05:57,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:06:00,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:06:02,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:04,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 12:06:05,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:06:06,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 12:06:07,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:06:12,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:06:14,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 12:06:14,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:18,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:19,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.78 vs. limit=15.0 2023-09-29 12:06:20,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:06:22,360 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.014e+02 2.278e+02 2.730e+02 4.361e+02, threshold=4.557e+02, percent-clipped=0.0 2023-09-29 12:06:23,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:25,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:06:29,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 12:06:29,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:30,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:06:32,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=359473.3333333333, ans=0.125 2023-09-29 12:06:33,828 INFO [train.py:1039] (3/4) Epoch 11, batch 800, loss[loss=0.2168, simple_loss=0.2848, pruned_loss=0.07443, over 23964.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2719, pruned_loss=0.06706, over 4613017.97 frames. ], batch size: 86, lr: 9.60e-03, grad_scale: 32.0 2023-09-29 12:06:33,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:35,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:35,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:06:37,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=359473.3333333333, ans=0.0 2023-09-29 12:06:45,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:06:45,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:47,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:06:47,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:06:50,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:50,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:06:52,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:06:55,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:06:57,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:06:59,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=359540.0, ans=0.1 2023-09-29 12:07:01,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 12:07:01,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=359540.0, ans=0.125 2023-09-29 12:07:01,896 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.73 vs. limit=6.0 2023-09-29 12:07:02,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:02,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:07:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:04,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:04,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 12:07:04,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:04,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 12:07:08,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:11,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:07:13,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:07:13,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:07:16,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:16,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:20,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:07:20,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:07:22,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 12:07:22,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=359673.3333333333, ans=0.125 2023-09-29 12:07:23,730 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 12:07:23,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 12:07:23,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:07:23,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:07:25,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=359673.3333333333, ans=0.125 2023-09-29 12:07:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:07:27,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:07:32,510 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 12:07:33,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 12:07:35,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:07:37,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:07:41,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:07:44,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:07:46,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 12:07:47,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:07:50,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 12:07:55,866 INFO [train.py:1039] (3/4) Epoch 11, batch 850, loss[loss=0.1877, simple_loss=0.2657, pruned_loss=0.0548, over 24465.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2729, pruned_loss=0.06717, over 4636415.74 frames. ], batch size: 66, lr: 9.60e-03, grad_scale: 16.0 2023-09-29 12:07:56,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:07:57,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:07:59,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 12:08:00,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:08:00,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:02,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 12:08:02,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:05,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:08:05,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=359806.6666666667, ans=0.0 2023-09-29 12:08:07,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:08,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:08:10,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:08:10,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 12:08:11,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 12:08:11,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 12:08:13,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:08:13,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:08:15,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:15,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:08:16,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:08:21,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:21,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:21,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 12:08:24,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 12:08:29,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:08:31,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 12:08:34,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 12:08:36,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 12:08:38,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=359940.0, ans=0.0 2023-09-29 12:08:40,051 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 12:08:40,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:40,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:08:40,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:08:43,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:45,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:08:47,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 12:08:48,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:08:50,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:08:50,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:08:50,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:08:51,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:08:53,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:08:54,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 12:08:58,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:08:58,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:08:59,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:08:59,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:01,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:03,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:09:04,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:09:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:09:07,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:07,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:09:09,284 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.651e+02 2.084e+02 2.353e+02 2.728e+02 3.950e+02, threshold=4.707e+02, percent-clipped=0.0 2023-09-29 12:09:16,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:09:17,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=360073.3333333333, ans=0.125 2023-09-29 12:09:18,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:09:19,684 INFO [train.py:1039] (3/4) Epoch 11, batch 900, loss[loss=0.2091, simple_loss=0.2697, pruned_loss=0.07432, over 23818.00 frames. ], tot_loss[loss=0.2061, simple_loss=0.2749, pruned_loss=0.06864, over 4648020.76 frames. ], batch size: 164, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:09:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 12:09:19,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:19,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:09:22,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 12:09:23,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=360140.0, ans=0.2 2023-09-29 12:09:27,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:09:30,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:32,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 12:09:35,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:09:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 12:09:38,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 12:09:38,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:09:38,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:09:40,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:09:40,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:09:40,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=360206.6666666667, ans=0.1 2023-09-29 12:09:50,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:09:51,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=360273.3333333333, ans=0.125 2023-09-29 12:09:52,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:09:52,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:09:56,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:02,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 12:10:04,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:10:07,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:10:07,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:10:09,115 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 12:10:09,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=360340.0, ans=0.1 2023-09-29 12:10:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 12:10:15,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:10:15,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:10:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:10:24,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:24,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:10:27,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 12:10:27,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:10:28,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 12:10:30,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:10:31,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:31,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:10:31,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:10:37,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 12:10:37,957 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 12:10:39,421 INFO [train.py:1039] (3/4) Epoch 11, batch 950, loss[loss=0.2043, simple_loss=0.2876, pruned_loss=0.06053, over 24646.00 frames. ], tot_loss[loss=0.2059, simple_loss=0.2746, pruned_loss=0.06856, over 4668569.68 frames. ], batch size: 73, lr: 9.59e-03, grad_scale: 16.0 2023-09-29 12:10:40,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:10:40,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 12:10:43,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:10:46,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 12:10:50,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:10:52,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:53,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=360473.3333333333, ans=0.1 2023-09-29 12:10:54,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:10:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:10:55,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=360540.0, ans=0.0 2023-09-29 12:10:56,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=360540.0, ans=0.2 2023-09-29 12:10:58,477 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 12:11:01,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:01,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:03,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:03,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:11:03,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 12:11:04,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:11:06,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:08,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 12:11:08,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:12,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:11:12,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:11:14,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 12:11:16,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:11:17,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:11:19,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:11:24,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:11:24,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:11:31,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 12:11:31,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:11:31,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:11:31,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:31,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:31,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:11:37,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 12:11:37,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:11:42,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:11:42,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:11:42,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 12:11:43,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:43,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:11:43,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 12:11:45,927 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.39 vs. limit=15.0 2023-09-29 12:11:48,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:11:50,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:11:50,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=360740.0, ans=0.0 2023-09-29 12:11:51,685 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 1.923e+02 2.189e+02 2.546e+02 4.043e+02, threshold=4.378e+02, percent-clipped=0.0 2023-09-29 12:11:52,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.49 vs. limit=22.5 2023-09-29 12:11:53,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:11:53,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 12:11:55,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 12:12:00,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:12:02,432 INFO [train.py:1039] (3/4) Epoch 11, batch 1000, loss[loss=0.208, simple_loss=0.2582, pruned_loss=0.07891, over 23424.00 frames. ], tot_loss[loss=0.2044, simple_loss=0.2731, pruned_loss=0.06787, over 4678931.35 frames. ], batch size: 285, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:12:05,009 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.12 vs. limit=15.0 2023-09-29 12:12:05,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 12:12:05,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:10,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:12:11,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 12:12:11,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 12:12:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:16,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:12:19,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 12:12:24,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 12:12:26,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 12:12:26,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:29,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 12:12:31,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 12:12:31,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 12:12:33,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:35,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:44,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:44,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:12:46,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:12:47,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:12:47,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 12:12:47,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:12:47,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=360940.0, ans=0.125 2023-09-29 12:12:48,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:12:49,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:12:49,158 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 12:12:55,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 12:12:55,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 12:12:56,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 12:12:59,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:13:09,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:09,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:13:10,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:10,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:13:12,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 12:13:13,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:13:13,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 12:13:14,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=361073.3333333333, ans=0.0 2023-09-29 12:13:15,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 12:13:16,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:16,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:13:18,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:13:18,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=361073.3333333333, ans=0.125 2023-09-29 12:13:20,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:13:21,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:13:23,132 INFO [train.py:1039] (3/4) Epoch 11, batch 1050, loss[loss=0.1856, simple_loss=0.2566, pruned_loss=0.05727, over 24304.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2713, pruned_loss=0.06678, over 4694887.71 frames. ], batch size: 56, lr: 9.58e-03, grad_scale: 16.0 2023-09-29 12:13:24,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:13:26,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:13:27,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:13:29,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:32,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:13:35,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:13:36,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:13:40,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:13:41,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.06 vs. limit=15.0 2023-09-29 12:13:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:13:42,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:13:43,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:13:44,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 12:13:45,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:13:46,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 12:13:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:13:49,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 12:13:49,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:13:55,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:13:56,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=361273.3333333333, ans=0.1 2023-09-29 12:13:57,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:13:57,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:14:00,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 12:14:00,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 12:14:00,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:14:03,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 12:14:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 12:14:08,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:12,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:14:12,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:14:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:14:16,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:14:20,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=361340.0, ans=0.0 2023-09-29 12:14:21,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:14:24,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 12:14:26,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 12:14:26,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 12:14:26,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:26,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=361340.0, ans=0.125 2023-09-29 12:14:27,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:14:27,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 12:14:32,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:14:33,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:14:33,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:14:35,258 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.858e+02 2.245e+02 2.592e+02 4.386e+02, threshold=4.489e+02, percent-clipped=1.0 2023-09-29 12:14:35,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:35,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:37,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=361406.6666666667, ans=0.125 2023-09-29 12:14:39,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:14:39,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 12:14:40,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=361406.6666666667, ans=0.04949747468305833 2023-09-29 12:14:42,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:14:42,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 12:14:42,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 12:14:43,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:14:44,797 INFO [train.py:1039] (3/4) Epoch 11, batch 1100, loss[loss=0.2177, simple_loss=0.2944, pruned_loss=0.07053, over 23398.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2701, pruned_loss=0.06641, over 4688098.56 frames. ], batch size: 93, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:14:47,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:14:52,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:14:56,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=361473.3333333333, ans=0.04949747468305833 2023-09-29 12:14:57,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:14:59,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:14:59,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:00,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 12:15:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:01,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=361540.0, ans=0.125 2023-09-29 12:15:02,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:15:05,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:15:08,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:15:08,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 12:15:10,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:15:11,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:11,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:15:13,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:15:16,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:15:19,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:15:21,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=361606.6666666667, ans=0.125 2023-09-29 12:15:22,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 12:15:24,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 12:15:24,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:28,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:29,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:15:29,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:15:31,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 12:15:31,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:15:31,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:15:31,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:15:32,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:32,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 12:15:39,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:15:39,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 12:15:40,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:15:44,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:15:50,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 12:15:50,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:15:51,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:15:53,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=361740.0, ans=0.1 2023-09-29 12:15:54,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:15:56,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:15:57,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 12:15:59,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:15:59,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:16:00,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 12:16:02,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:16:02,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 12:16:04,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:16:04,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:16:05,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:16:08,845 INFO [train.py:1039] (3/4) Epoch 11, batch 1150, loss[loss=0.29, simple_loss=0.3295, pruned_loss=0.1253, over 19527.00 frames. ], tot_loss[loss=0.2021, simple_loss=0.2705, pruned_loss=0.06681, over 4666887.90 frames. ], batch size: 388, lr: 9.57e-03, grad_scale: 16.0 2023-09-29 12:16:09,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:10,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:16:12,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=361806.6666666667, ans=0.0 2023-09-29 12:16:13,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:13,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:16:13,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 12:16:15,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:18,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 12:16:19,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:19,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:16:25,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 12:16:27,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:30,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:16:32,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:32,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=361873.3333333333, ans=0.1 2023-09-29 12:16:33,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 12:16:33,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:16:33,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:16:37,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=361873.3333333333, ans=0.125 2023-09-29 12:16:40,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 12:16:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:16:41,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:16:41,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=361940.0, ans=0.125 2023-09-29 12:16:43,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=361940.0, ans=0.125 2023-09-29 12:16:51,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:58,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:16:59,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=10.29 vs. limit=22.5 2023-09-29 12:17:00,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 12:17:00,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:00,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:08,478 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 12:17:10,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:19,613 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 12:17:21,042 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.993e+02 2.217e+02 2.557e+02 3.633e+02, threshold=4.434e+02, percent-clipped=0.0 2023-09-29 12:17:24,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:25,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.52 vs. limit=6.0 2023-09-29 12:17:25,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:17:25,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:17:25,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:17:30,721 INFO [train.py:1039] (3/4) Epoch 11, batch 1200, loss[loss=0.202, simple_loss=0.2709, pruned_loss=0.06657, over 23755.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2723, pruned_loss=0.06717, over 4680132.72 frames. ], batch size: 135, lr: 9.57e-03, grad_scale: 32.0 2023-09-29 12:17:30,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:36,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:17:36,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:17:41,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:17:41,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:17:42,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:17:44,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:17:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:17:47,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:17:47,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:17:48,938 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 12:17:51,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 12:17:55,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:17:58,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:18:00,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:01,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:01,934 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 12:18:03,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:12,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:18:12,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:18:12,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 12:18:12,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:18:17,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 12:18:20,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 12:18:20,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:18:20,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:18:22,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:22,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:18:23,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.37 vs. limit=15.0 2023-09-29 12:18:25,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:18:25,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:18:25,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:18:27,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 12:18:27,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=362340.0, ans=0.0 2023-09-29 12:18:28,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:18:28,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:29,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:18:30,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:30,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:18:35,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:18:36,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:18:40,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 12:18:46,480 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 12:18:48,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:18:48,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=362406.6666666667, ans=0.0 2023-09-29 12:18:51,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:18:51,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=362406.6666666667, ans=0.0 2023-09-29 12:18:52,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:18:54,001 INFO [train.py:1039] (3/4) Epoch 11, batch 1250, loss[loss=0.182, simple_loss=0.2589, pruned_loss=0.05251, over 24439.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2738, pruned_loss=0.06768, over 4698691.92 frames. ], batch size: 63, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:18:54,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:18:57,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 12:18:57,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=362473.3333333333, ans=0.07 2023-09-29 12:18:59,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=362473.3333333333, ans=0.125 2023-09-29 12:19:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:19:03,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:05,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 12:19:06,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:19:08,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:19:10,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=362540.0, ans=0.125 2023-09-29 12:19:13,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:19:13,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:14,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:19:14,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:18,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:19:22,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=362540.0, ans=0.125 2023-09-29 12:19:23,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:19:23,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:19:24,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:19:26,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:19:27,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:28,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=362606.6666666667, ans=0.125 2023-09-29 12:19:30,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:32,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:19:37,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 12:19:37,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:19:37,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.58 vs. limit=22.5 2023-09-29 12:19:38,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:19:39,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 12:19:41,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:19:41,022 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 12:19:41,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:19:41,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=362606.6666666667, ans=0.95 2023-09-29 12:19:47,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:48,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:19:50,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:19:51,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 12:19:51,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 12:19:54,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 12:19:57,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:19:59,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 12:19:59,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:01,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=362740.0, ans=0.025 2023-09-29 12:20:02,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:20:02,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:20:05,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 12:20:05,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:20:05,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:20:06,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:20:06,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:08,087 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.898e+02 2.110e+02 2.286e+02 3.124e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-29 12:20:08,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 12:20:11,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:12,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:20:14,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:20:16,416 INFO [train.py:1039] (3/4) Epoch 11, batch 1300, loss[loss=0.1902, simple_loss=0.2615, pruned_loss=0.05948, over 24340.00 frames. ], tot_loss[loss=0.2046, simple_loss=0.2741, pruned_loss=0.06761, over 4702066.35 frames. ], batch size: 61, lr: 9.56e-03, grad_scale: 16.0 2023-09-29 12:20:16,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:20:21,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:20:21,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 12:20:27,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:29,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:20:30,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:20:31,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:20:31,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:20:32,131 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:20:33,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 12:20:39,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:20:40,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:20:41,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 12:20:44,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:20:46,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=362873.3333333333, ans=0.0 2023-09-29 12:20:47,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:48,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:20:50,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:20:52,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:20:53,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:20:55,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 12:20:55,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 12:21:01,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:21:01,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:21:02,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=362940.0, ans=0.1 2023-09-29 12:21:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 12:21:04,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 12:21:06,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:21:08,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:21:09,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 12:21:09,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:09,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 12:21:12,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:21:15,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:21:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:21:18,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 12:21:20,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 12:21:20,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=363073.3333333333, ans=0.0 2023-09-29 12:21:21,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 12:21:25,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:21:27,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=363073.3333333333, ans=0.125 2023-09-29 12:21:28,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 12:21:29,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:34,020 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.29 vs. limit=12.0 2023-09-29 12:21:39,068 INFO [train.py:1039] (3/4) Epoch 11, batch 1350, loss[loss=0.2124, simple_loss=0.2687, pruned_loss=0.07806, over 23808.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2729, pruned_loss=0.06683, over 4711828.85 frames. ], batch size: 150, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:21:39,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 12:21:42,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:45,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:21:47,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-09-29 12:21:48,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:21:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:21:50,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:21:50,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:50,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=363140.0, ans=0.0 2023-09-29 12:21:54,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:21:56,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 12:21:58,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:21:58,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:22:01,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 12:22:03,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:22:03,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:22:03,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 12:22:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 12:22:07,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=363206.6666666667, ans=0.125 2023-09-29 12:22:09,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 12:22:09,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=363206.6666666667, ans=0.125 2023-09-29 12:22:10,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:11,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 12:22:12,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=363273.3333333333, ans=0.125 2023-09-29 12:22:16,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=363273.3333333333, ans=0.2 2023-09-29 12:22:22,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:32,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:22:32,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:32,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 12:22:36,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:22:37,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 12:22:37,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:22:39,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:22:41,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:22:44,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 12:22:46,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:22:52,344 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.973e+02 2.286e+02 2.663e+02 4.619e+02, threshold=4.571e+02, percent-clipped=1.0 2023-09-29 12:22:52,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 12:22:54,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 12:22:59,929 INFO [train.py:1039] (3/4) Epoch 11, batch 1400, loss[loss=0.1989, simple_loss=0.2626, pruned_loss=0.06757, over 23903.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2711, pruned_loss=0.06628, over 4710448.86 frames. ], batch size: 212, lr: 9.55e-03, grad_scale: 16.0 2023-09-29 12:23:00,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=363473.3333333333, ans=0.1 2023-09-29 12:23:01,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 12:23:03,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:23:04,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:23:04,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:23:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 12:23:14,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 12:23:18,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=15.0 2023-09-29 12:23:26,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=363540.0, ans=0.1 2023-09-29 12:23:28,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:23:29,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:31,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:23:31,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:23:35,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:23:36,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 12:23:40,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=363606.6666666667, ans=0.0 2023-09-29 12:23:47,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:49,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:23:53,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 12:23:55,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:23:56,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:23:56,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:23:56,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:23:58,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:23:58,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:23:58,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:24:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 12:24:01,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:24:05,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=363740.0, ans=0.09899494936611666 2023-09-29 12:24:06,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:06,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=363740.0, ans=0.125 2023-09-29 12:24:06,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=363740.0, ans=0.125 2023-09-29 12:24:07,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.00 vs. limit=22.5 2023-09-29 12:24:09,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:24:14,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 12:24:15,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 12:24:16,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.47 vs. limit=15.0 2023-09-29 12:24:17,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:24:21,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 12:24:21,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:22,654 INFO [train.py:1039] (3/4) Epoch 11, batch 1450, loss[loss=0.2015, simple_loss=0.2814, pruned_loss=0.06077, over 24080.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2691, pruned_loss=0.06559, over 4702032.22 frames. ], batch size: 80, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:24:24,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:24:28,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.53 vs. limit=22.5 2023-09-29 12:24:28,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:24:31,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:24:31,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:31,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 12:24:36,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:24:37,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:24:39,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:24:40,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 12:24:42,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:24:42,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 12:24:43,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:43,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:43,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 12:24:45,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:24:45,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:24:47,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 12:24:47,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:48,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:24:50,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:24:53,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:24:57,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:24:57,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:25:01,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:25:01,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:03,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:25:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:25:03,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:25:04,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 12:25:11,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:25:15,916 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 12:25:16,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:17,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:25:19,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:19,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=364006.6666666667, ans=0.125 2023-09-29 12:25:20,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 12:25:23,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:25,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 12:25:27,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 12:25:27,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:33,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:25:34,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:25:36,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 12:25:38,105 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.910e+02 2.158e+02 2.591e+02 3.926e+02, threshold=4.316e+02, percent-clipped=0.0 2023-09-29 12:25:38,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 12:25:39,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 12:25:41,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:25:42,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:25:43,873 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.99 vs. limit=22.5 2023-09-29 12:25:45,914 INFO [train.py:1039] (3/4) Epoch 11, batch 1500, loss[loss=0.1908, simple_loss=0.2677, pruned_loss=0.05691, over 24665.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2703, pruned_loss=0.0656, over 4702196.65 frames. ], batch size: 65, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:25:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 12:25:54,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:25:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:25:55,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:25:56,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:25:56,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:25:57,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=364140.0, ans=0.125 2023-09-29 12:25:58,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 12:26:01,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:26:01,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:26:01,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:26:02,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:26:04,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:06,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:08,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=364206.6666666667, ans=0.125 2023-09-29 12:26:12,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:26:12,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 12:26:12,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:26:14,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:26:14,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:17,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 12:26:21,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 12:26:23,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:26:24,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 12:26:26,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:26:28,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=364273.3333333333, ans=0.125 2023-09-29 12:26:29,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:26:30,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:26:30,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:26:33,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 12:26:34,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:26:34,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:36,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 12:26:36,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:26:42,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:26:42,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 12:26:48,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=364340.0, ans=0.125 2023-09-29 12:26:50,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:26:51,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:26:54,828 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 12:26:54,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:26:54,932 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 12:26:56,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:26:57,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:26:59,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 12:27:00,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:27:02,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 12:27:05,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:06,853 INFO [train.py:1039] (3/4) Epoch 11, batch 1550, loss[loss=0.2221, simple_loss=0.2784, pruned_loss=0.08286, over 22766.00 frames. ], tot_loss[loss=0.2018, simple_loss=0.2712, pruned_loss=0.06622, over 4712351.84 frames. ], batch size: 322, lr: 9.54e-03, grad_scale: 16.0 2023-09-29 12:27:07,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:27:08,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:27:08,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:27:10,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 12:27:12,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 12:27:12,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:27:13,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 12:27:13,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 12:27:17,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:20,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:20,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:27:21,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:22,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:27:25,995 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 12:27:26,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:27,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:27:27,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:27:30,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:27:30,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 12:27:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:27:32,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 12:27:33,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 12:27:33,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 12:27:33,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:27:35,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:39,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:27:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 12:27:42,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 12:27:46,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=364606.6666666667, ans=0.1 2023-09-29 12:27:48,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=364606.6666666667, ans=0.05 2023-09-29 12:27:51,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:27:57,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:27:57,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:27:57,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:27:58,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 12:28:01,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:28:03,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:06,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:28:08,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:28:08,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:08,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 12:28:09,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:11,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:28:11,929 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.96 vs. limit=10.0 2023-09-29 12:28:12,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:14,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:28:14,049 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 12:28:15,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:15,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=364740.0, ans=0.1 2023-09-29 12:28:22,361 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.938e+02 2.255e+02 2.720e+02 4.386e+02, threshold=4.510e+02, percent-clipped=1.0 2023-09-29 12:28:22,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 12:28:22,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=364740.0, ans=0.125 2023-09-29 12:28:28,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:29,649 INFO [train.py:1039] (3/4) Epoch 11, batch 1600, loss[loss=0.2137, simple_loss=0.2773, pruned_loss=0.07506, over 23676.00 frames. ], tot_loss[loss=0.2032, simple_loss=0.2723, pruned_loss=0.06708, over 4697079.49 frames. ], batch size: 232, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:28:29,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:28:31,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 12:28:32,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:28:32,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=364806.6666666667, ans=0.125 2023-09-29 12:28:34,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:28:34,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:28:34,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:28:34,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:28:37,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:28:37,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 12:28:39,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 12:28:40,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 12:28:41,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.10 vs. limit=15.0 2023-09-29 12:28:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:28:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 12:28:45,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:28:48,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:28:53,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:28:53,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=364873.3333333333, ans=0.95 2023-09-29 12:28:55,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 12:28:59,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:29:00,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 12:29:01,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 12:29:08,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 12:29:09,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=364940.0, ans=0.125 2023-09-29 12:29:15,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 12:29:15,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:29:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:29:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:29:18,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 12:29:23,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 12:29:26,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:29:26,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:28,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:30,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:29:32,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:29:34,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:29:35,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:29:41,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:43,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:29:46,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 12:29:46,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:29:46,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 12:29:50,778 INFO [train.py:1039] (3/4) Epoch 11, batch 1650, loss[loss=0.2939, simple_loss=0.3369, pruned_loss=0.1254, over 19771.00 frames. ], tot_loss[loss=0.2048, simple_loss=0.2737, pruned_loss=0.068, over 4686114.53 frames. ], batch size: 388, lr: 9.53e-03, grad_scale: 16.0 2023-09-29 12:29:50,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:29:52,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:29:53,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:29:53,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 12:29:53,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 12:29:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 12:29:55,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 12:29:57,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=365140.0, ans=0.5 2023-09-29 12:29:58,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:29:58,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:00,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:00,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:30:01,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:02,983 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=365140.0, ans=0.2 2023-09-29 12:30:04,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 12:30:07,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:30:07,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:30:07,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:30:07,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:30:08,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 12:30:08,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 12:30:16,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:30:18,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:30:21,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=365273.3333333333, ans=0.07 2023-09-29 12:30:27,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 12:30:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:29,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 12:30:34,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:36,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:30:36,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:30:36,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:30:37,083 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.47 vs. limit=15.0 2023-09-29 12:30:39,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:30:39,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:43,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:30:45,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:45,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:45,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:45,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:48,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:30:51,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:30:51,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 12:30:52,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:30:54,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 12:30:55,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 12:30:57,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 12:30:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:30:57,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:30:58,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:30:58,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:30:58,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 12:31:02,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:31:04,948 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.934e+02 2.140e+02 2.424e+02 3.897e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 12:31:05,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:05,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:09,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 12:31:10,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=365473.3333333333, ans=0.0 2023-09-29 12:31:11,712 INFO [train.py:1039] (3/4) Epoch 11, batch 1700, loss[loss=0.1931, simple_loss=0.2554, pruned_loss=0.0654, over 23393.00 frames. ], tot_loss[loss=0.204, simple_loss=0.273, pruned_loss=0.0675, over 4692072.02 frames. ], batch size: 106, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:31:11,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:31:11,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:31:14,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 12:31:14,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:15,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:31:15,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:17,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:31:17,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:31:17,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 12:31:20,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:31:22,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-09-29 12:31:25,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=365473.3333333333, ans=0.125 2023-09-29 12:31:25,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=365473.3333333333, ans=0.0 2023-09-29 12:31:30,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:31:33,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:31:37,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:31:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:31:38,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:31:38,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:31:42,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 12:31:45,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:31:46,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:48,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:31:49,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:31:51,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 12:31:52,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 12:31:54,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:31:54,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 12:31:56,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:31:57,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=365606.6666666667, ans=0.125 2023-09-29 12:32:01,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.51 vs. limit=6.0 2023-09-29 12:32:03,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:05,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:06,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:32:08,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:32:08,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 12:32:08,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:32:11,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:11,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 12:32:11,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:32:11,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:13,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:13,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:17,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:17,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:32:17,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:19,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:32:19,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:21,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.05 vs. limit=10.0 2023-09-29 12:32:24,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:24,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 12:32:26,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:32:27,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:32:29,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 12:32:34,316 INFO [train.py:1039] (3/4) Epoch 11, batch 1750, loss[loss=0.1746, simple_loss=0.2535, pruned_loss=0.0479, over 24336.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2701, pruned_loss=0.0668, over 4671626.79 frames. ], batch size: 61, lr: 9.52e-03, grad_scale: 16.0 2023-09-29 12:32:34,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:35,158 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.15 vs. limit=15.0 2023-09-29 12:32:37,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:37,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:32:38,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 12:32:39,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:32:42,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:32:42,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:32:47,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 12:32:50,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:32:54,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 12:32:54,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:32:54,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=365873.3333333333, ans=0.0 2023-09-29 12:32:56,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:32:59,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:33:00,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 12:33:02,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:33:02,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 12:33:05,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=365940.0, ans=0.125 2023-09-29 12:33:11,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:33:14,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:14,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:15,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.65 vs. limit=15.0 2023-09-29 12:33:19,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:19,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:33:21,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:33:21,841 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.92 vs. limit=12.0 2023-09-29 12:33:25,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:27,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:29,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:33:29,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 12:33:32,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:33:35,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 12:33:35,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:38,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:39,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:33:39,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=366073.3333333333, ans=0.1 2023-09-29 12:33:43,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:33:43,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 12:33:45,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:33:45,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=366073.3333333333, ans=0.125 2023-09-29 12:33:46,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:33:49,674 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.016e+02 2.265e+02 2.513e+02 4.125e+02, threshold=4.530e+02, percent-clipped=0.0 2023-09-29 12:33:51,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:33:54,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:33:55,903 INFO [train.py:1039] (3/4) Epoch 11, batch 1800, loss[loss=0.1844, simple_loss=0.2587, pruned_loss=0.05509, over 24551.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2703, pruned_loss=0.06572, over 4694518.78 frames. ], batch size: 60, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:33:56,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:33:56,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 12:33:56,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:33:58,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:33:58,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:33:58,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:33:59,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:33:59,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:34:02,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=366140.0, ans=0.0 2023-09-29 12:34:04,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:34:04,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:34:05,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:34:09,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:34:10,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.46 vs. limit=12.0 2023-09-29 12:34:10,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:34:14,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:34:17,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:18,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:20,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:34:23,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:34:23,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 12:34:23,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:26,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:31,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 12:34:33,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 12:34:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 12:34:35,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:34:36,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:34:36,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:34:37,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:34:44,461 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 12:34:45,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:34:47,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:34:50,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 12:34:50,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 12:34:52,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:34:53,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:34:55,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:35:00,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 12:35:04,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:06,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 12:35:07,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:35:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:07,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:35:09,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 12:35:12,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:35:12,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:16,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 12:35:16,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:35:16,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=366406.6666666667, ans=0.125 2023-09-29 12:35:19,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:19,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:35:19,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,025 INFO [train.py:1039] (3/4) Epoch 11, batch 1850, loss[loss=0.1993, simple_loss=0.2621, pruned_loss=0.06823, over 23402.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2709, pruned_loss=0.06617, over 4679207.99 frames. ], batch size: 134, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:35:21,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:35:21,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:35:24,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:35:24,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:35:27,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:35:28,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:35:29,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=366473.3333333333, ans=0.125 2023-09-29 12:35:35,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:35:35,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 12:35:40,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 12:35:40,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=366540.0, ans=0.1 2023-09-29 12:35:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 12:35:48,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:35:48,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 12:35:48,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 12:35:53,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.24 vs. limit=10.0 2023-09-29 12:35:57,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:35:59,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 12:36:03,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:03,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:05,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 12:36:05,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:05,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:36:07,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:36:09,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=366673.3333333333, ans=0.1 2023-09-29 12:36:09,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.62 vs. limit=6.0 2023-09-29 12:36:10,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:36:12,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:36:15,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=366673.3333333333, ans=0.025 2023-09-29 12:36:17,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:36:17,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:17,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:36:17,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:19,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=366673.3333333333, ans=0.035 2023-09-29 12:36:20,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:22,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:36:26,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 12:36:27,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:36:29,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=366740.0, ans=0.125 2023-09-29 12:36:29,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.99 vs. limit=15.0 2023-09-29 12:36:32,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:36:32,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:36:32,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 12:36:32,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 12:36:33,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 12:36:35,303 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 12:36:36,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 2.114e+02 2.440e+02 2.839e+02 4.239e+02, threshold=4.880e+02, percent-clipped=0.0 2023-09-29 12:36:36,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:36:36,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:36:36,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:36:38,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:38,426 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 12:36:38,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:36:39,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:39,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:36:41,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:36:42,843 INFO [train.py:1039] (3/4) Epoch 11, batch 1900, loss[loss=0.1846, simple_loss=0.2678, pruned_loss=0.05072, over 24427.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2722, pruned_loss=0.06624, over 4686480.21 frames. ], batch size: 69, lr: 9.51e-03, grad_scale: 16.0 2023-09-29 12:36:43,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:36:43,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 12:36:46,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:36:46,195 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 12:36:46,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:36:47,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:55,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:36:56,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:36:58,095 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 12:37:00,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 12:37:00,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:37:01,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:37:01,877 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 12:37:01,918 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 12:37:07,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 12:37:09,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:37:13,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 12:37:15,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 12:37:23,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 12:37:25,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 12:37:25,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:37:26,806 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 12:37:26,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 12:37:26,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 12:37:28,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 12:37:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:37:33,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 12:37:35,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:37:37,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=367006.6666666667, ans=0.125 2023-09-29 12:37:40,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:37:40,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 12:37:43,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:37:47,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 12:37:47,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:37:52,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=367073.3333333333, ans=0.2 2023-09-29 12:37:56,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:37:56,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:37:56,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:37:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:37:59,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 12:37:59,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:38:01,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:38:04,372 INFO [train.py:1039] (3/4) Epoch 11, batch 1950, loss[loss=0.2127, simple_loss=0.2826, pruned_loss=0.07139, over 23421.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2726, pruned_loss=0.0668, over 4697806.39 frames. ], batch size: 93, lr: 9.50e-03, grad_scale: 16.0 2023-09-29 12:38:04,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:04,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:07,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:38:07,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:07,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:38:09,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:38:13,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:16,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:38:16,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:16,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:38:18,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 12:38:19,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:38:20,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:21,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:22,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.03 vs. limit=15.0 2023-09-29 12:38:24,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:38:24,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:27,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:38:32,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:38:32,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:38:32,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 12:38:32,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:35,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:39,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:38:39,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:38:39,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:38:39,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 12:38:39,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:38:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:38:41,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:38:41,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=24.57 vs. limit=22.5 2023-09-29 12:38:44,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:38:46,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:38:53,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:38:54,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:38:56,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:38:56,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 12:38:56,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:00,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:39:01,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:39:02,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:11,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:14,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:17,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:19,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:39:19,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:39:21,453 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.994e+02 2.316e+02 2.639e+02 3.669e+02, threshold=4.632e+02, percent-clipped=0.0 2023-09-29 12:39:21,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 12:39:21,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:39:21,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:39:23,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 12:39:24,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:25,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=367406.6666666667, ans=0.02 2023-09-29 12:39:27,719 INFO [train.py:1039] (3/4) Epoch 11, batch 2000, loss[loss=0.2073, simple_loss=0.2901, pruned_loss=0.06223, over 24646.00 frames. ], tot_loss[loss=0.2033, simple_loss=0.2727, pruned_loss=0.06701, over 4691972.53 frames. ], batch size: 68, lr: 9.50e-03, grad_scale: 32.0 2023-09-29 12:39:29,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:39:30,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:39:32,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:39:33,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:39:35,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:39:37,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 12:39:39,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:39:42,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:39:43,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=367540.0, ans=0.125 2023-09-29 12:39:44,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 12:39:46,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 12:39:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:39:50,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:39:51,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 12:39:54,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:39:59,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 12:39:59,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:40:02,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 12:40:02,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:03,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:03,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 12:40:03,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:05,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:06,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:07,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 12:40:11,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 12:40:11,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:40:11,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:15,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:18,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:40:18,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:18,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:40:19,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:40:20,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=367673.3333333333, ans=0.125 2023-09-29 12:40:21,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:22,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:40:22,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:40:24,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:25,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.39 vs. limit=15.0 2023-09-29 12:40:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:40:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 12:40:35,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:40:35,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:39,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:40:41,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:44,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:44,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:40:44,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:40:48,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:40:50,395 INFO [train.py:1039] (3/4) Epoch 11, batch 2050, loss[loss=0.1949, simple_loss=0.2754, pruned_loss=0.05716, over 24628.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2725, pruned_loss=0.06724, over 4695656.27 frames. ], batch size: 68, lr: 9.49e-03, grad_scale: 32.0 2023-09-29 12:40:50,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:40:54,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:40:55,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:01,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:41:03,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:41:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:41:05,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:07,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 12:41:07,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:41:08,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:10,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:41:10,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=367873.3333333333, ans=0.0 2023-09-29 12:41:18,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:18,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:21,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 12:41:24,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:41:25,146 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:41:26,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 12:41:26,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:41:30,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:32,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:34,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:41:34,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:41:35,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:41:35,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:41:35,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:41:40,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:41:41,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:41:43,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:41:44,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:41:47,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:41:55,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:41:57,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 12:42:01,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:02,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:42:06,928 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.639e+02 2.014e+02 2.317e+02 2.683e+02 4.007e+02, threshold=4.634e+02, percent-clipped=0.0 2023-09-29 12:42:07,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:42:08,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 12:42:11,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=368140.0, ans=0.125 2023-09-29 12:42:12,418 INFO [train.py:1039] (3/4) Epoch 11, batch 2100, loss[loss=0.1827, simple_loss=0.2621, pruned_loss=0.05161, over 24653.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.272, pruned_loss=0.06665, over 4698218.35 frames. ], batch size: 65, lr: 9.49e-03, grad_scale: 16.0 2023-09-29 12:42:14,037 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 12:42:14,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:14,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:14,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=368140.0, ans=0.2 2023-09-29 12:42:15,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:15,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:42:15,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 12:42:17,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 12:42:18,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:42:19,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=368140.0, ans=0.025 2023-09-29 12:42:21,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:42:21,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:42:24,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:25,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:42:25,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 12:42:26,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:42:28,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 12:42:28,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 12:42:31,169 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:42:32,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:42:32,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:42:32,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 12:42:32,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 12:42:34,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=368206.6666666667, ans=0.2 2023-09-29 12:42:37,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.15 vs. limit=15.0 2023-09-29 12:42:38,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 12:42:38,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:42:38,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=368206.6666666667, ans=0.1 2023-09-29 12:42:39,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:42:40,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:42:40,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368206.6666666667, ans=0.1 2023-09-29 12:42:45,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:42:46,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 12:42:46,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:42:46,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 12:42:48,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 12:42:48,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:42:48,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 12:42:50,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 12:42:50,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 12:42:51,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:42:53,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:42:55,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.32 vs. limit=6.0 2023-09-29 12:42:56,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:58,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 12:42:59,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:01,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:01,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 12:43:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:03,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:03,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 12:43:04,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 12:43:06,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 12:43:09,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:43:12,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:43:13,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=368340.0, ans=0.0 2023-09-29 12:43:14,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 12:43:14,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=368340.0, ans=0.0 2023-09-29 12:43:14,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-09-29 12:43:16,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=368406.6666666667, ans=0.0 2023-09-29 12:43:19,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:21,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:43:22,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:43:22,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:43:22,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 12:43:23,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:43:25,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:43:25,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:43:26,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:43:27,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:29,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 12:43:30,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 12:43:30,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:33,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:43:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:43:34,236 INFO [train.py:1039] (3/4) Epoch 11, batch 2150, loss[loss=0.2135, simple_loss=0.2492, pruned_loss=0.08896, over 19001.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2711, pruned_loss=0.06681, over 4691166.49 frames. ], batch size: 389, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:43:34,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:43:34,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:43:41,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 12:43:42,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:43:44,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:44,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:43:44,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:45,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:43:51,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:43:51,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:43:51,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:43:56,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:43:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 12:44:01,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=368540.0, ans=0.07 2023-09-29 12:44:01,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=368540.0, ans=0.0 2023-09-29 12:44:02,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:02,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=368540.0, ans=0.05 2023-09-29 12:44:04,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:44:05,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:05,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:05,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:44:07,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:07,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:44:07,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:44:09,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 12:44:12,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:44:12,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:14,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:14,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:44:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:44:17,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:44:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:44:19,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:44:19,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 12:44:19,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 12:44:22,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:24,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:25,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:44:27,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:44:29,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:29,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:29,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 12:44:32,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 12:44:32,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:44:32,960 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 12:44:33,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:44:34,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 12:44:34,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:44:34,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 12:44:34,758 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 12:44:34,759 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 12:44:34,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 12:44:35,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=368673.3333333333, ans=0.1 2023-09-29 12:44:37,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:37,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=368673.3333333333, ans=0.125 2023-09-29 12:44:39,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:44:39,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:44:41,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:42,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 12:44:42,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=368740.0, ans=0.125 2023-09-29 12:44:45,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:44:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:44:47,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=368740.0, ans=0.0 2023-09-29 12:44:51,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=368740.0, ans=0.035 2023-09-29 12:44:51,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=368740.0, ans=0.125 2023-09-29 12:44:52,249 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.938e+02 2.164e+02 2.545e+02 3.667e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-29 12:44:54,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:44:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 12:44:56,985 INFO [train.py:1039] (3/4) Epoch 11, batch 2200, loss[loss=0.1734, simple_loss=0.259, pruned_loss=0.04387, over 24321.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2711, pruned_loss=0.0665, over 4697695.08 frames. ], batch size: 74, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:44:57,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:45:01,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:01,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:45:02,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:05,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 12:45:07,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:45:08,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:45:08,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 12:45:13,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 12:45:15,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 12:45:20,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 12:45:20,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=368873.3333333333, ans=0.125 2023-09-29 12:45:24,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:24,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:26,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:45:26,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=368873.3333333333, ans=0.125 2023-09-29 12:45:29,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:45:29,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 12:45:30,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.39 vs. limit=12.0 2023-09-29 12:45:33,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:45:36,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:45:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 12:45:37,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=368940.0, ans=22.5 2023-09-29 12:45:38,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=368940.0, ans=0.025 2023-09-29 12:45:40,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:45:41,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:43,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:45:45,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:49,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 12:45:50,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:45:51,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 12:45:55,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:55,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 12:45:55,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:45:57,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:45:58,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:45:58,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:00,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:46:01,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:46:01,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:46:04,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:46:07,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 12:46:07,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:11,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:46:12,645 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 12:46:14,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:46:14,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 12:46:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 12:46:17,072 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 12:46:19,826 INFO [train.py:1039] (3/4) Epoch 11, batch 2250, loss[loss=0.2347, simple_loss=0.2853, pruned_loss=0.09205, over 23788.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2725, pruned_loss=0.06685, over 4707392.29 frames. ], batch size: 212, lr: 9.48e-03, grad_scale: 16.0 2023-09-29 12:46:19,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:20,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:46:22,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:46:25,104 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 12:46:25,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:46:26,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:33,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:46:33,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:46:36,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:38,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:38,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 12:46:39,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.58 vs. limit=15.0 2023-09-29 12:46:41,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 12:46:42,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:46:42,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:46:44,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 12:46:46,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:46:46,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:48,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 12:46:52,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:46:53,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 12:46:53,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:46:55,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 12:46:57,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:46:59,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:47:05,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:06,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:47:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:47:07,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:47:10,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:47:12,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:47:15,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=369340.0, ans=0.125 2023-09-29 12:47:16,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:47:19,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=369340.0, ans=0.125 2023-09-29 12:47:20,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 12:47:25,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:47:25,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:47:25,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:47:26,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=369406.6666666667, ans=0.1 2023-09-29 12:47:32,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:47:32,710 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:47:35,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:47:35,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 12:47:35,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:36,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:47:38,355 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.418e+02 2.043e+02 2.273e+02 2.723e+02 4.405e+02, threshold=4.547e+02, percent-clipped=1.0 2023-09-29 12:47:38,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 12:47:42,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:47:43,474 INFO [train.py:1039] (3/4) Epoch 11, batch 2300, loss[loss=0.2257, simple_loss=0.2832, pruned_loss=0.08405, over 23656.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2729, pruned_loss=0.06704, over 4702123.81 frames. ], batch size: 232, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:47:43,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:48,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:47:49,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:47:51,445 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 12:47:52,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:47:53,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=369473.3333333333, ans=0.04949747468305833 2023-09-29 12:48:00,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:48:00,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:48:01,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:03,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 12:48:03,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:48:05,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:05,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:48:05,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=369540.0, ans=0.125 2023-09-29 12:48:08,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.58 vs. limit=15.0 2023-09-29 12:48:12,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:48:12,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.60 vs. limit=15.0 2023-09-29 12:48:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:48:16,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:21,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:48:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:48:24,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:48:26,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:48:26,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=369606.6666666667, ans=0.125 2023-09-29 12:48:29,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:48:29,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:48:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:48:31,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 12:48:38,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 12:48:38,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:38,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:48:38,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:48:38,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:40,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 12:48:40,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 12:48:41,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 12:48:41,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:48:41,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:48:41,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 12:48:49,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:48:52,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:48:55,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:48:56,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:48:57,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 12:49:00,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:49:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:49:04,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 12:49:05,505 INFO [train.py:1039] (3/4) Epoch 11, batch 2350, loss[loss=0.2075, simple_loss=0.2831, pruned_loss=0.06598, over 23231.00 frames. ], tot_loss[loss=0.2049, simple_loss=0.2739, pruned_loss=0.06791, over 4703380.75 frames. ], batch size: 105, lr: 9.47e-03, grad_scale: 16.0 2023-09-29 12:49:11,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:49:12,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 12:49:17,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 12:49:22,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:49:25,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:49:27,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:27,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:28,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 12:49:32,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:49:36,070 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=14.03 vs. limit=15.0 2023-09-29 12:49:36,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 12:49:38,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:49:42,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:49:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:49:46,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:49:47,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 12:49:47,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=369940.0, ans=0.1 2023-09-29 12:49:48,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:49:50,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:49:50,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:49:50,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:49:55,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:49:57,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 12:49:57,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:50:00,024 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.29 vs. limit=15.0 2023-09-29 12:50:01,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:50:01,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:50:04,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 12:50:05,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:50:05,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=370006.6666666667, ans=0.04949747468305833 2023-09-29 12:50:07,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 12:50:08,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:50:11,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 12:50:15,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 12:50:16,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=370073.3333333333, ans=0.125 2023-09-29 12:50:17,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:50:17,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 12:50:18,000 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 12:50:18,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 12:50:19,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 12:50:22,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:50:23,820 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 2.084e+02 2.454e+02 3.278e+02 4.890e+02, threshold=4.908e+02, percent-clipped=1.0 2023-09-29 12:50:25,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:50:29,135 INFO [train.py:1039] (3/4) Epoch 11, batch 2400, loss[loss=0.2044, simple_loss=0.2807, pruned_loss=0.06407, over 24666.00 frames. ], tot_loss[loss=0.2051, simple_loss=0.2741, pruned_loss=0.06807, over 4700105.16 frames. ], batch size: 65, lr: 9.46e-03, grad_scale: 32.0 2023-09-29 12:50:30,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:50:31,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.72 vs. limit=15.0 2023-09-29 12:50:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:50:33,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 12:50:34,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 12:50:40,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:50:40,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:50:43,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 12:50:43,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:50:45,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:47,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 12:50:54,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:50:56,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 12:50:58,216 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 12:51:01,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=370273.3333333333, ans=0.2 2023-09-29 12:51:02,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:51:07,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 12:51:12,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:13,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.60 vs. limit=15.0 2023-09-29 12:51:13,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:17,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:18,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 12:51:20,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 12:51:27,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:30,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.54 vs. limit=12.0 2023-09-29 12:51:30,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:51:32,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:51:33,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:51:33,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 12:51:33,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:51:33,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:35,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:51:35,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 12:51:40,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:51:40,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 12:51:40,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 12:51:42,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=370406.6666666667, ans=0.07 2023-09-29 12:51:43,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 12:51:46,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:51:46,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:51:46,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 12:51:48,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 12:51:48,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 12:51:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 12:51:49,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 12:51:51,319 INFO [train.py:1039] (3/4) Epoch 11, batch 2450, loss[loss=0.1964, simple_loss=0.277, pruned_loss=0.0579, over 24645.00 frames. ], tot_loss[loss=0.2027, simple_loss=0.2719, pruned_loss=0.06678, over 4693304.48 frames. ], batch size: 73, lr: 9.46e-03, grad_scale: 16.0 2023-09-29 12:51:51,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:51:51,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:51:54,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 12:51:54,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:51:54,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:51:59,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:51:59,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:03,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:03,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:04,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 12:52:10,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:52:10,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:14,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:52:14,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:52:14,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:52:16,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 12:52:19,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:21,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:52:22,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:52:25,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 12:52:25,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:27,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:52:30,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 12:52:30,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:52:33,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=370606.6666666667, ans=0.09899494936611666 2023-09-29 12:52:35,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=370606.6666666667, ans=0.125 2023-09-29 12:52:38,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:40,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:52:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 12:52:41,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:52:45,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:52:46,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 12:52:48,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 12:52:50,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:52:53,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:52:53,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:52:59,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 12:52:59,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 12:53:01,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:53:01,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:01,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 12:53:02,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:02,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:53:06,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=370740.0, ans=0.125 2023-09-29 12:53:07,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:53:11,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:11,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:53:12,849 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.006e+02 2.269e+02 2.537e+02 3.932e+02, threshold=4.538e+02, percent-clipped=0.0 2023-09-29 12:53:14,459 INFO [train.py:1039] (3/4) Epoch 11, batch 2500, loss[loss=0.2155, simple_loss=0.2786, pruned_loss=0.07626, over 23916.00 frames. ], tot_loss[loss=0.2026, simple_loss=0.2717, pruned_loss=0.06674, over 4698887.82 frames. ], batch size: 195, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:53:16,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 12:53:16,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 12:53:18,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=370806.6666666667, ans=0.025 2023-09-29 12:53:22,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:27,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.97 vs. limit=15.0 2023-09-29 12:53:32,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 12:53:32,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:53:35,183 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.84 vs. limit=15.0 2023-09-29 12:53:35,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:53:35,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 12:53:45,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 12:53:45,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:53:47,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 12:53:47,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 12:53:48,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 12:53:50,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:50,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:53:50,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 12:53:50,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:53:52,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 12:53:52,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:53:55,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=370940.0, ans=0.1 2023-09-29 12:53:56,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:53:56,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:53:59,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=370940.0, ans=0.1 2023-09-29 12:54:00,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 12:54:00,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 12:54:00,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:00,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=370940.0, ans=0.125 2023-09-29 12:54:04,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:07,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:15,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:20,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 12:54:23,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 12:54:23,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:54:24,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.37 vs. limit=12.0 2023-09-29 12:54:25,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:54:26,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 12:54:26,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 12:54:28,436 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 12:54:28,437 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 12:54:28,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 12:54:31,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:54:33,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 12:54:33,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 12:54:35,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 12:54:36,978 INFO [train.py:1039] (3/4) Epoch 11, batch 2550, loss[loss=0.2182, simple_loss=0.2834, pruned_loss=0.07647, over 23802.00 frames. ], tot_loss[loss=0.2028, simple_loss=0.2723, pruned_loss=0.06671, over 4699472.62 frames. ], batch size: 232, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:54:37,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 12:54:40,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 12:54:42,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:45,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:54:45,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:54:46,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=371140.0, ans=0.125 2023-09-29 12:54:48,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:54:48,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 12:54:50,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:54:54,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 12:54:54,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=371206.6666666667, ans=0.0 2023-09-29 12:54:55,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:54:57,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:54:59,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=371206.6666666667, ans=0.0 2023-09-29 12:55:00,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:55:00,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 12:55:00,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:00,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:02,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:05,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:55:05,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 12:55:05,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 12:55:06,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:06,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 12:55:19,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 12:55:24,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:24,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:24,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:55:26,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 12:55:26,920 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.08 vs. limit=15.0 2023-09-29 12:55:34,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:55:37,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 12:55:37,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:55:37,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 12:55:37,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 12:55:37,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 12:55:38,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-09-29 12:55:40,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:55:40,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:46,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=371406.6666666667, ans=0.125 2023-09-29 12:55:48,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:55:48,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 12:55:48,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:55:49,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:55:49,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 12:55:51,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 12:55:51,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:55:57,878 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 1.868e+02 2.141e+02 2.517e+02 4.100e+02, threshold=4.283e+02, percent-clipped=0.0 2023-09-29 12:55:59,467 INFO [train.py:1039] (3/4) Epoch 11, batch 2600, loss[loss=0.2086, simple_loss=0.2798, pruned_loss=0.06865, over 23323.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2719, pruned_loss=0.06599, over 4703495.07 frames. ], batch size: 93, lr: 9.45e-03, grad_scale: 8.0 2023-09-29 12:55:59,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:04,885 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 12:56:06,524 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 12:56:06,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 12:56:08,030 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 12:56:08,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 12:56:08,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 12:56:11,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:56:11,270 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 12:56:12,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 12:56:16,358 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 12:56:18,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 12:56:20,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 12:56:21,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 12:56:23,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 12:56:23,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 12:56:25,994 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 12:56:26,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 12:56:31,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=371606.6666666667, ans=0.2 2023-09-29 12:56:34,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:34,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:34,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:34,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 12:56:36,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 12:56:42,654 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 12:56:48,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:56:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:56:50,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 12:56:52,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:56:52,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:56:52,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 12:56:55,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:56:57,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:56:58,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:03,913 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 12:57:03,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:04,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 12:57:07,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.99 vs. limit=12.0 2023-09-29 12:57:08,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:57:10,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 12:57:10,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 12:57:11,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:57:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:13,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:19,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 12:57:20,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:21,935 INFO [train.py:1039] (3/4) Epoch 11, batch 2650, loss[loss=0.1985, simple_loss=0.2803, pruned_loss=0.05834, over 24384.00 frames. ], tot_loss[loss=0.2016, simple_loss=0.2719, pruned_loss=0.06568, over 4719021.96 frames. ], batch size: 77, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:57:23,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 12:57:28,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 12:57:28,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:30,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 12:57:30,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 12:57:30,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:57:32,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:57:34,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 12:57:35,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:57:37,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=371873.3333333333, ans=0.0 2023-09-29 12:57:38,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:57:38,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 12:57:40,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 12:57:40,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:57:43,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 12:57:44,768 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 12:57:45,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=371873.3333333333, ans=0.125 2023-09-29 12:57:48,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:57:49,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 12:57:49,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:57:49,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 12:57:55,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:55,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 12:57:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:57:56,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:00,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 12:58:00,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 12:58:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 12:58:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:08,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:10,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:10,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:10,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:12,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=372006.6666666667, ans=0.125 2023-09-29 12:58:13,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:58:13,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:15,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 12:58:15,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 12:58:16,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 12:58:19,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=372006.6666666667, ans=0.125 2023-09-29 12:58:20,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:20,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 12:58:20,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=372006.6666666667, ans=0.0 2023-09-29 12:58:21,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:23,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:58:23,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 12:58:25,763 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.73 vs. limit=15.0 2023-09-29 12:58:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:27,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 12:58:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:29,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 12:58:36,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:58:38,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:38,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:58:39,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:39,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 12:58:41,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:43,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.976e+02 2.292e+02 2.609e+02 3.713e+02, threshold=4.584e+02, percent-clipped=0.0 2023-09-29 12:58:44,824 INFO [train.py:1039] (3/4) Epoch 11, batch 2700, loss[loss=0.2168, simple_loss=0.2783, pruned_loss=0.07766, over 23818.00 frames. ], tot_loss[loss=0.2035, simple_loss=0.2739, pruned_loss=0.06659, over 4709652.70 frames. ], batch size: 195, lr: 9.44e-03, grad_scale: 8.0 2023-09-29 12:58:44,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:58:44,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 12:58:46,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:58:48,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 12:58:48,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=372140.0, ans=0.125 2023-09-29 12:58:49,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 12:58:49,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:49,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:58:51,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 12:58:51,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:58:51,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 12:58:51,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 12:58:53,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 12:58:54,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 12:58:56,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 12:58:57,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 12:58:59,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:02,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 12:59:04,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 12:59:04,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:08,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 12:59:08,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:14,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 12:59:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 12:59:14,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 12:59:14,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 12:59:19,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:21,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:21,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 12:59:21,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 12:59:21,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=372273.3333333333, ans=0.0 2023-09-29 12:59:26,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:27,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 12:59:37,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 12:59:38,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 12:59:41,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 12:59:41,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 12:59:45,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:45,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 12:59:45,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=372340.0, ans=0.2 2023-09-29 12:59:46,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=372340.0, ans=0.1 2023-09-29 12:59:47,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 12:59:50,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 12:59:52,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 12:59:52,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 12:59:55,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 12:59:55,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:56,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 12:59:59,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 13:00:01,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:03,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:00:03,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 13:00:03,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=372406.6666666667, ans=0.0 2023-09-29 13:00:05,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 13:00:06,682 INFO [train.py:1039] (3/4) Epoch 11, batch 2750, loss[loss=0.1936, simple_loss=0.2714, pruned_loss=0.05795, over 24575.00 frames. ], tot_loss[loss=0.2029, simple_loss=0.2734, pruned_loss=0.0662, over 4709862.57 frames. ], batch size: 71, lr: 9.43e-03, grad_scale: 8.0 2023-09-29 13:00:06,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:10,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:10,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:13,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:13,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:00:13,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:17,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:00:17,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:00:17,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:17,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 13:00:18,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=372473.3333333333, ans=0.125 2023-09-29 13:00:19,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:00:19,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:00:24,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 13:00:27,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:00:27,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:27,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:00:29,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:00:29,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:00:31,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:00:32,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:32,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:36,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.95 vs. limit=15.0 2023-09-29 13:00:37,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:00:37,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:00:37,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:00:39,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:41,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:00:44,435 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:00:47,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:00:49,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:00:50,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:00:54,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=372673.3333333333, ans=0.0 2023-09-29 13:00:56,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=372673.3333333333, ans=0.125 2023-09-29 13:00:57,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:00:57,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:00:57,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:00:59,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=372673.3333333333, ans=0.125 2023-09-29 13:01:02,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:01:03,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:01:03,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 13:01:08,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:09,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 13:01:15,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:01:15,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=372740.0, ans=0.125 2023-09-29 13:01:18,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:01:19,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 13:01:20,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:01:23,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:01:23,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 13:01:24,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:01:25,972 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.991e+02 2.210e+02 2.554e+02 4.000e+02, threshold=4.420e+02, percent-clipped=0.0 2023-09-29 13:01:27,642 INFO [train.py:1039] (3/4) Epoch 11, batch 2800, loss[loss=0.2124, simple_loss=0.2866, pruned_loss=0.06905, over 24644.00 frames. ], tot_loss[loss=0.2023, simple_loss=0.2728, pruned_loss=0.06593, over 4707925.58 frames. ], batch size: 65, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:01:27,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:01:27,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:29,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:01:29,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 13:01:29,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:29,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:32,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:01:33,592 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 13:01:33,593 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 13:01:36,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:01:38,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:01:39,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:01:42,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:01:44,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 13:01:47,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:01:49,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 13:01:51,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:52,249 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.16 vs. limit=15.0 2023-09-29 13:01:52,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:01:52,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:01:56,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:01:56,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:01:56,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:01:57,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:01:59,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=372940.0, ans=0.0 2023-09-29 13:02:04,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:02:07,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:10,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:12,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:02:13,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:14,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=372940.0, ans=0.0 2023-09-29 13:02:17,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=373006.6666666667, ans=0.0 2023-09-29 13:02:18,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 13:02:20,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:20,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:20,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:02:25,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:02:25,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:30,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:02:32,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:02:33,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:02:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:02:33,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:02:34,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:02:35,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:02:35,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 13:02:36,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:02:38,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:02:38,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 13:02:40,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:02:40,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:02:40,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:02:41,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 13:02:49,515 INFO [train.py:1039] (3/4) Epoch 11, batch 2850, loss[loss=0.2044, simple_loss=0.2687, pruned_loss=0.07004, over 23592.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2714, pruned_loss=0.06577, over 4692786.97 frames. ], batch size: 256, lr: 9.43e-03, grad_scale: 16.0 2023-09-29 13:02:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:02:49,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:02:51,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:02:53,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:02:58,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:02:58,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:02:58,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:03:01,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:01,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:03:03,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:03:03,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 13:03:10,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 13:03:10,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:10,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=373206.6666666667, ans=0.125 2023-09-29 13:03:11,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 13:03:13,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:14,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 13:03:14,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 13:03:15,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=373206.6666666667, ans=0.0 2023-09-29 13:03:16,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:20,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=373206.6666666667, ans=0.0 2023-09-29 13:03:30,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:30,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.48 vs. limit=15.0 2023-09-29 13:03:32,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:32,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:03:32,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=373273.3333333333, ans=0.0 2023-09-29 13:03:33,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:03:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:03:35,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:03:35,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=373273.3333333333, ans=0.1 2023-09-29 13:03:36,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:03:41,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 13:03:43,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:03:44,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:03:44,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:03:46,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:48,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:48,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:03:50,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:53,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:03:55,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:03:55,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:03:56,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:03:58,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:04:04,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:04:06,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 13:04:06,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 13:04:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:04:10,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:10,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 13:04:10,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:04:11,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:12,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=373406.6666666667, ans=0.2 2023-09-29 13:04:13,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:13,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:04:13,392 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 13:04:13,456 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 13:04:14,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.974e+02 2.166e+02 2.699e+02 4.540e+02, threshold=4.331e+02, percent-clipped=1.0 2023-09-29 13:04:14,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:14,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:15,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=373473.3333333333, ans=0.125 2023-09-29 13:04:16,419 INFO [train.py:1039] (3/4) Epoch 11, batch 2900, loss[loss=0.1973, simple_loss=0.2787, pruned_loss=0.05794, over 23954.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2714, pruned_loss=0.06555, over 4701184.91 frames. ], batch size: 80, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:04:18,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:18,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:04:18,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:04:19,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 13:04:20,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.46 vs. limit=15.0 2023-09-29 13:04:25,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:26,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 13:04:26,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 13:04:28,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:04:28,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:04:30,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:30,916 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=15.0 2023-09-29 13:04:31,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:04:35,468 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.96 vs. limit=15.0 2023-09-29 13:04:36,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:04:36,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:04:40,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:04:40,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 13:04:41,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:04:43,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:45,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 13:04:45,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 13:04:48,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:04:48,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 13:04:48,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:04:49,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:04:50,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 13:04:52,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:04:53,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:04:58,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:05:01,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:04,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 13:05:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 13:05:04,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:05:07,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:05:09,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=373673.3333333333, ans=0.1 2023-09-29 13:05:10,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.21 vs. limit=15.0 2023-09-29 13:05:10,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 13:05:11,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=373673.3333333333, ans=0.0 2023-09-29 13:05:13,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:05:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:05:27,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:05:28,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:05:30,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 13:05:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:34,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 13:05:34,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:35,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:05:36,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.66 vs. limit=22.5 2023-09-29 13:05:37,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=373806.6666666667, ans=0.1 2023-09-29 13:05:39,100 INFO [train.py:1039] (3/4) Epoch 11, batch 2950, loss[loss=0.194, simple_loss=0.2727, pruned_loss=0.0576, over 24323.00 frames. ], tot_loss[loss=0.202, simple_loss=0.2724, pruned_loss=0.06575, over 4696091.32 frames. ], batch size: 61, lr: 9.42e-03, grad_scale: 16.0 2023-09-29 13:05:43,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:05:43,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=373806.6666666667, ans=0.1 2023-09-29 13:05:45,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 13:05:45,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:05:45,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:05:46,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:05:49,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:05:49,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 13:05:50,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 13:05:52,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:05:52,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:06:00,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:01,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:04,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:06:04,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:08,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:08,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:06:11,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:06:12,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:06:14,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 13:06:15,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.18 vs. limit=6.0 2023-09-29 13:06:20,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 13:06:20,917 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 13:06:21,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=373940.0, ans=0.2 2023-09-29 13:06:22,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:06:24,017 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 13:06:24,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 13:06:25,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:06:27,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:06:27,069 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 13:06:27,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:06:29,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 13:06:31,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:06:32,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:06:34,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:34,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:06:36,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:36,094 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 13:06:36,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:06:37,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 13:06:38,449 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.13 vs. limit=15.0 2023-09-29 13:06:45,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:45,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:06:45,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=374073.3333333333, ans=0.07 2023-09-29 13:06:47,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 13:06:47,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:06:48,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 13:06:50,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:06:52,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:06:53,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:06:55,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:06:55,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:06:57,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=374073.3333333333, ans=0.0 2023-09-29 13:06:58,224 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 2.184e+02 2.499e+02 3.162e+02 5.312e+02, threshold=4.998e+02, percent-clipped=4.0 2023-09-29 13:06:58,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:06:59,739 INFO [train.py:1039] (3/4) Epoch 11, batch 3000, loss[loss=0.1899, simple_loss=0.2716, pruned_loss=0.05407, over 24469.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2727, pruned_loss=0.06557, over 4709076.87 frames. ], batch size: 66, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:06:59,740 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 13:07:13,674 INFO [train.py:1071] (3/4) Epoch 11, validation: loss=0.3146, simple_loss=0.2865, pruned_loss=0.1713, over 1125622.00 frames. 2023-09-29 13:07:13,675 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 13:07:13,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:13,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:07:13,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:07:15,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:07:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:07:18,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:18,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 13:07:20,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:07:23,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:07:23,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:07:28,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 13:07:28,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 13:07:30,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:07:30,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=374206.6666666667, ans=0.2 2023-09-29 13:07:31,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:07:31,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 13:07:31,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:36,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:07:46,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:07:52,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 13:07:52,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:07:54,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:07:55,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:07:55,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:07:57,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:07:57,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 13:08:00,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 13:08:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:08:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:08:04,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:08:04,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:07,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:07,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:08:08,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=374340.0, ans=0.0 2023-09-29 13:08:10,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:08:10,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:08:10,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:08:13,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:08:17,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 13:08:18,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:08:18,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:18,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:08:23,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:23,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:23,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=374406.6666666667, ans=0.125 2023-09-29 13:08:23,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374406.6666666667, ans=0.1 2023-09-29 13:08:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 13:08:26,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 13:08:28,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:08:28,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 13:08:28,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:08:30,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 13:08:34,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:08:35,955 INFO [train.py:1039] (3/4) Epoch 11, batch 3050, loss[loss=0.2013, simple_loss=0.2665, pruned_loss=0.0681, over 23844.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2736, pruned_loss=0.06545, over 4721102.43 frames. ], batch size: 195, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:08:36,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:08:36,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 13:08:37,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 13:08:37,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:08:39,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:08:39,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:08:39,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:08:39,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:39,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:08:43,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=374473.3333333333, ans=0.0 2023-09-29 13:08:44,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 13:08:45,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:08:47,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:08:49,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:08:49,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=374473.3333333333, ans=0.0 2023-09-29 13:08:52,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:08:56,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 13:09:02,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 13:09:02,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 13:09:02,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:06,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:09:06,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374540.0, ans=0.1 2023-09-29 13:09:09,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:09,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:10,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:12,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:12,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:09:13,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:13,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:09:13,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:18,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:21,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=374606.6666666667, ans=0.125 2023-09-29 13:09:22,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:22,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 13:09:24,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:09:24,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:09:24,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=374673.3333333333, ans=0.125 2023-09-29 13:09:24,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=374673.3333333333, ans=0.07 2023-09-29 13:09:27,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:09:29,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:09:29,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:09:29,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:34,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:09:34,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:35,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=374673.3333333333, ans=10.0 2023-09-29 13:09:38,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374673.3333333333, ans=0.1 2023-09-29 13:09:42,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:42,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:09:42,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:09:45,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:45,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:09:47,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:09:47,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 13:09:49,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:09:49,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:09:50,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 13:09:52,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:09:54,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=374740.0, ans=0.0 2023-09-29 13:09:57,393 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 2.035e+02 2.257e+02 2.557e+02 3.814e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-29 13:09:58,900 INFO [train.py:1039] (3/4) Epoch 11, batch 3100, loss[loss=0.2023, simple_loss=0.2864, pruned_loss=0.05913, over 24583.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2728, pruned_loss=0.06524, over 4723622.58 frames. ], batch size: 71, lr: 9.41e-03, grad_scale: 16.0 2023-09-29 13:09:59,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:00,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:10:04,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:10:04,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 13:10:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 13:10:07,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 13:10:08,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=374806.6666666667, ans=0.125 2023-09-29 13:10:10,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:10:13,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:10:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:15,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:10:16,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=374873.3333333333, ans=0.1 2023-09-29 13:10:20,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:26,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 13:10:29,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=374873.3333333333, ans=0.125 2023-09-29 13:10:30,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:10:31,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:31,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:10:31,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:10:33,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:10:36,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:10:37,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 13:10:37,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:10:38,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:41,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 13:10:41,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=374940.0, ans=0.0 2023-09-29 13:10:41,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=374940.0, ans=0.125 2023-09-29 13:10:43,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:10:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:10:48,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 13:10:48,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 13:10:49,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:51,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:10:54,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:10:54,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:54,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:10:55,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:10:55,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:10:57,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:10:57,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:10:57,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:10:57,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:11:01,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:11:03,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 13:11:06,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:11:06,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 13:11:07,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:07,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:07,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 13:11:08,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=375073.3333333333, ans=0.0 2023-09-29 13:11:19,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 13:11:21,190 INFO [train.py:1039] (3/4) Epoch 11, batch 3150, loss[loss=0.1612, simple_loss=0.2346, pruned_loss=0.04393, over 24305.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2722, pruned_loss=0.06515, over 4720242.86 frames. ], batch size: 56, lr: 9.40e-03, grad_scale: 16.0 2023-09-29 13:11:22,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:22,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:11:25,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:11:25,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:11:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 13:11:27,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=375140.0, ans=0.0 2023-09-29 13:11:28,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:28,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:11:30,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 13:11:31,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:33,358 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 13:11:36,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=15.0 2023-09-29 13:11:37,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 13:11:37,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:11:40,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 13:11:40,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 13:11:41,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 13:11:43,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 13:11:43,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 13:11:43,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:43,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:11:44,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:11:47,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 13:11:50,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:50,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:11:52,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:11:53,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:11:56,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 13:11:56,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:11:58,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:11:58,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=375273.3333333333, ans=0.125 2023-09-29 13:12:00,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:12:00,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 13:12:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 13:12:03,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:12:03,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:12:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:12:03,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=375273.3333333333, ans=0.1 2023-09-29 13:12:04,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:04,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:12:07,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:12:07,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=375273.3333333333, ans=0.125 2023-09-29 13:12:08,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:12:08,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 13:12:10,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:12:10,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:13,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:12:13,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:12:13,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 13:12:15,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:16,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 13:12:16,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:18,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 13:12:19,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 13:12:21,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:12:21,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:23,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 13:12:24,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 13:12:26,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:12:28,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=375406.6666666667, ans=0.125 2023-09-29 13:12:28,855 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-09-29 13:12:29,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:12:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:31,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:12:37,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:12:37,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:39,897 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.896e+02 2.240e+02 2.702e+02 3.896e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-29 13:12:40,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 13:12:41,487 INFO [train.py:1039] (3/4) Epoch 11, batch 3200, loss[loss=0.2062, simple_loss=0.2677, pruned_loss=0.07238, over 23818.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.2709, pruned_loss=0.06484, over 4724801.52 frames. ], batch size: 179, lr: 9.40e-03, grad_scale: 32.0 2023-09-29 13:12:45,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:12:45,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 13:12:49,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=375473.3333333333, ans=0.0 2023-09-29 13:12:50,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:12:50,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:12:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 13:12:53,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:12:59,309 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.73 vs. limit=15.0 2023-09-29 13:13:00,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:13:03,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:13:04,178 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:13:07,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=375540.0, ans=0.125 2023-09-29 13:13:11,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:13:21,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 13:13:23,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:13:25,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 13:13:25,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=375606.6666666667, ans=0.125 2023-09-29 13:13:26,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:13:29,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:13:30,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:13:30,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:13:34,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 13:13:35,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 13:13:38,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 13:13:41,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 13:13:43,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:13:49,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:49,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:13:49,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=375740.0, ans=0.09899494936611666 2023-09-29 13:13:50,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:13:50,489 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 13:13:50,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:13:56,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-09-29 13:13:57,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:13:57,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 13:13:58,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 13:14:00,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 13:14:01,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 13:14:04,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:14:04,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.29 vs. limit=10.0 2023-09-29 13:14:06,029 INFO [train.py:1039] (3/4) Epoch 11, batch 3250, loss[loss=0.2043, simple_loss=0.2654, pruned_loss=0.07158, over 23588.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2708, pruned_loss=0.06486, over 4722956.69 frames. ], batch size: 285, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:14:06,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:14:07,594 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 13:14:07,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:07,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:10,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 13:14:15,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:14:18,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:18,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=375806.6666666667, ans=0.0 2023-09-29 13:14:20,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=375873.3333333333, ans=0.1 2023-09-29 13:14:26,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:14:26,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 13:14:26,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:26,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:14:26,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:28,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:28,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:14:29,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-09-29 13:14:31,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:31,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:14:32,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:33,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:33,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:14:37,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:14:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:14:39,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:40,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:14:42,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:14:42,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:14:42,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:14:47,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 13:14:48,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:14:48,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:14:50,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:14:50,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:14:57,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:14:59,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=376006.6666666667, ans=0.0 2023-09-29 13:15:05,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:06,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:06,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 13:15:06,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:15:06,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:15:06,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:10,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 13:15:10,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 13:15:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:15:12,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:13,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:13,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 13:15:15,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:15:15,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=376073.3333333333, ans=0.125 2023-09-29 13:15:15,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=376073.3333333333, ans=0.0 2023-09-29 13:15:18,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:18,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:21,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 13:15:21,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:24,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:15:24,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 13:15:26,170 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.864e+02 2.159e+02 2.577e+02 4.318e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-29 13:15:26,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-09-29 13:15:27,703 INFO [train.py:1039] (3/4) Epoch 11, batch 3300, loss[loss=0.2067, simple_loss=0.2702, pruned_loss=0.07165, over 23463.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2713, pruned_loss=0.06498, over 4731774.30 frames. ], batch size: 149, lr: 9.39e-03, grad_scale: 32.0 2023-09-29 13:15:27,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:15:27,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 13:15:29,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 13:15:29,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 13:15:29,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:34,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:15:36,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:15:36,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:38,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:15:38,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:15:42,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:43,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:15:45,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=376206.6666666667, ans=0.5 2023-09-29 13:15:48,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 13:15:48,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:15:48,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:15:50,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:15:51,692 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 13:15:53,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:15:53,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:15:55,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:15:55,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:15:55,382 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 13:15:58,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:15:58,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:16:01,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:01,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 13:16:01,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=376273.3333333333, ans=0.1 2023-09-29 13:16:01,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=376273.3333333333, ans=0.0 2023-09-29 13:16:03,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 13:16:03,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:05,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:16:08,216 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 13:16:09,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 13:16:09,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:16:12,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 13:16:15,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:16,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=376340.0, ans=0.0 2023-09-29 13:16:16,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=376340.0, ans=0.05 2023-09-29 13:16:19,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:16:19,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:22,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:22,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:22,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:16:22,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:16:24,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:16:24,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:25,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:16:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 13:16:29,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 13:16:30,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:16:30,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:16:30,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:33,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:16:33,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:36,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:16:36,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:36,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:16:38,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:16:40,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:16:43,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 13:16:43,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:46,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:16:46,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:16:48,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:16:50,012 INFO [train.py:1039] (3/4) Epoch 11, batch 3350, loss[loss=0.1978, simple_loss=0.2792, pruned_loss=0.0582, over 23992.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2727, pruned_loss=0.06536, over 4730935.80 frames. ], batch size: 80, lr: 9.38e-03, grad_scale: 32.0 2023-09-29 13:16:50,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:16:52,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:16:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:54,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:16:55,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:16:58,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:17:02,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:05,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:17:05,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:06,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:17:08,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 13:17:08,294 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 13:17:08,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:17:11,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 13:17:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 13:17:14,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:17:14,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:17:16,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:16,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 13:17:16,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:17,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:17:19,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:22,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:22,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:24,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:17:28,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:31,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:32,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:33,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=376606.6666666667, ans=0.125 2023-09-29 13:17:33,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=376606.6666666667, ans=0.2 2023-09-29 13:17:37,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:17:37,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:17:40,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:40,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:41,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:44,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 13:17:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:17:44,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 13:17:45,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:17:47,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 13:17:48,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:17:50,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:17:57,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:17:57,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 13:17:59,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:01,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:18:01,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:18:05,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:08,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 13:18:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:18:09,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:18:10,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:10,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 13:18:11,973 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.933e+02 2.082e+02 2.375e+02 4.063e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 13:18:12,038 INFO [train.py:1039] (3/4) Epoch 11, batch 3400, loss[loss=0.2362, simple_loss=0.2963, pruned_loss=0.08804, over 23378.00 frames. ], tot_loss[loss=0.2036, simple_loss=0.2744, pruned_loss=0.06645, over 4727704.80 frames. ], batch size: 285, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:18:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:18:12,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 13:18:12,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=376806.6666666667, ans=0.1 2023-09-29 13:18:13,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:18:15,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:18:17,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:18:17,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 13:18:17,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=376806.6666666667, ans=0.0 2023-09-29 13:18:22,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 13:18:22,667 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 13:18:22,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:18:22,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=376806.6666666667, ans=0.0 2023-09-29 13:18:27,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:18:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:18:27,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:28,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:18:35,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:18:36,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=376873.3333333333, ans=0.125 2023-09-29 13:18:37,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 13:18:40,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:18:42,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:18:42,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:18:43,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 13:18:50,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:18:50,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=376940.0, ans=0.125 2023-09-29 13:18:55,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 13:19:03,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:19:04,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 13:19:05,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:07,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:19:07,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:19:07,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=377006.6666666667, ans=0.125 2023-09-29 13:19:12,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:19:15,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:19:15,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:19:21,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:24,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 13:19:30,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:19:34,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=377140.0, ans=0.2 2023-09-29 13:19:35,256 INFO [train.py:1039] (3/4) Epoch 11, batch 3450, loss[loss=0.2262, simple_loss=0.3018, pruned_loss=0.0753, over 24365.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2738, pruned_loss=0.06617, over 4744676.66 frames. ], batch size: 77, lr: 9.38e-03, grad_scale: 16.0 2023-09-29 13:19:35,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 13:19:38,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 13:19:40,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:19:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:19:41,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 13:19:41,908 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:19:44,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:19:49,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:19:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:19:54,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:19:55,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:19:55,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:19:57,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:04,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.65 vs. limit=10.0 2023-09-29 13:20:06,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 13:20:10,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 13:20:10,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:20:12,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:20:13,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:19,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 13:20:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:20:26,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:27,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:20:29,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:20:29,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:20:31,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 13:20:31,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:32,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:20:35,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:20:37,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 13:20:38,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=377340.0, ans=0.125 2023-09-29 13:20:41,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:20:41,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=377406.6666666667, ans=0.1 2023-09-29 13:20:44,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:20:47,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:51,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:20:56,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:20:56,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:20:58,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:20:58,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:20:59,827 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.022e+02 2.344e+02 2.789e+02 4.683e+02, threshold=4.688e+02, percent-clipped=2.0 2023-09-29 13:20:59,869 INFO [train.py:1039] (3/4) Epoch 11, batch 3500, loss[loss=0.1862, simple_loss=0.2704, pruned_loss=0.05097, over 24660.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2716, pruned_loss=0.06552, over 4724089.32 frames. ], batch size: 73, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:21:00,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:04,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:21:04,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 13:21:05,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=377473.3333333333, ans=0.125 2023-09-29 13:21:08,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:21:09,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=377473.3333333333, ans=0.125 2023-09-29 13:21:11,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:21:12,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:21:12,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 13:21:20,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:21:20,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:21:22,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:21:22,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:23,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:21:23,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:24,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:24,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 13:21:25,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=377540.0, ans=0.0 2023-09-29 13:21:27,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:27,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:21:29,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:33,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:35,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 13:21:35,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:21:38,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:21:40,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:21:42,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:42,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=377606.6666666667, ans=0.0 2023-09-29 13:21:45,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:21:45,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:46,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 13:21:46,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 13:21:48,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 13:21:48,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:21:50,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:21:50,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:21:51,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:21:55,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:21:55,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:22:02,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:03,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 13:22:03,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 13:22:03,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:05,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:06,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:08,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:11,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 13:22:11,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:22:12,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=377740.0, ans=0.1 2023-09-29 13:22:14,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:22:16,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 13:22:17,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 13:22:19,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:22:19,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:22:21,010 INFO [train.py:1039] (3/4) Epoch 11, batch 3550, loss[loss=0.2049, simple_loss=0.2783, pruned_loss=0.06573, over 23360.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2713, pruned_loss=0.06523, over 4714548.05 frames. ], batch size: 93, lr: 9.37e-03, grad_scale: 16.0 2023-09-29 13:22:21,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:21,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:24,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:22:35,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:36,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 13:22:39,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:22:41,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:22:42,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:22:44,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:22:44,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:22:47,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:47,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:22:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:22:47,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:22:49,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:22:52,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=377940.0, ans=0.1 2023-09-29 13:22:55,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:22:55,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:22:58,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:22:58,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:00,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:23:00,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 13:23:00,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:01,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:03,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 13:23:04,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.88 vs. limit=10.0 2023-09-29 13:23:10,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:12,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:23:13,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:13,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 13:23:15,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:23:18,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 13:23:18,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=378006.6666666667, ans=0.2 2023-09-29 13:23:19,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:23:21,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:23:21,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:23:25,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 13:23:27,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:28,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=378073.3333333333, ans=0.1 2023-09-29 13:23:29,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=378073.3333333333, ans=0.09899494936611666 2023-09-29 13:23:31,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:23:33,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 13:23:33,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:38,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:23:39,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 13:23:43,826 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.951e+02 2.213e+02 2.629e+02 3.694e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 13:23:43,869 INFO [train.py:1039] (3/4) Epoch 11, batch 3600, loss[loss=0.1953, simple_loss=0.2671, pruned_loss=0.06181, over 23480.00 frames. ], tot_loss[loss=0.2005, simple_loss=0.271, pruned_loss=0.06502, over 4724801.88 frames. ], batch size: 106, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:23:45,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 13:23:45,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:23:45,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_abs, batch_count=378140.0, ans=0.5 2023-09-29 13:23:47,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:23:49,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:49,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:23:50,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:23:54,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:23:55,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:57,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:23:57,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:23:57,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=378140.0, ans=0.2 2023-09-29 13:23:59,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:23:59,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 13:24:03,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:24:05,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:08,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:12,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:13,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:24:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:24:13,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 13:24:15,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:24:18,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:24:21,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:24:22,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:24,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:24:25,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:24:25,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 13:24:31,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.30 vs. limit=15.0 2023-09-29 13:24:35,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:24:36,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:24:37,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 13:24:41,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:24:45,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:48,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:24:55,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:24:56,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:24:56,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 13:24:57,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 13:24:59,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 13:25:02,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:25:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:25:03,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 13:25:03,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:03,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:25:03,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:04,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 13:25:06,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 13:25:07,544 INFO [train.py:1039] (3/4) Epoch 11, batch 3650, loss[loss=0.1912, simple_loss=0.2688, pruned_loss=0.05682, over 23565.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2712, pruned_loss=0.06441, over 4725102.38 frames. ], batch size: 149, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:25:07,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:25:08,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 13:25:11,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=378473.3333333333, ans=0.09899494936611666 2023-09-29 13:25:14,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 13:25:14,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=378473.3333333333, ans=0.125 2023-09-29 13:25:16,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:25:16,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=378473.3333333333, ans=0.1 2023-09-29 13:25:20,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 13:25:23,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 13:25:27,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:25:27,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:25:27,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:25:31,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 13:25:32,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:25:32,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 13:25:33,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:25:34,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:25:34,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 13:25:36,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:25:37,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:25:37,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:37,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:25:38,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.75 vs. limit=15.0 2023-09-29 13:25:39,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 13:25:41,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 13:25:43,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:25:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 13:25:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:25:46,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:25:48,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=378606.6666666667, ans=0.125 2023-09-29 13:25:54,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:25:56,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:25:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:25:57,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:25:59,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:26:02,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:26:04,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:06,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:26:06,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:26:08,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:26:09,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:15,157 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 13:26:19,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:19,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:26:20,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:26:21,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:21,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:26:23,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:23,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=378740.0, ans=0.5 2023-09-29 13:26:23,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=378740.0, ans=0.125 2023-09-29 13:26:24,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 13:26:24,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:26,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:26:29,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:26:30,855 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 2.035e+02 2.257e+02 2.600e+02 3.794e+02, threshold=4.515e+02, percent-clipped=0.0 2023-09-29 13:26:30,918 INFO [train.py:1039] (3/4) Epoch 11, batch 3700, loss[loss=0.215, simple_loss=0.2758, pruned_loss=0.07708, over 23853.00 frames. ], tot_loss[loss=0.2014, simple_loss=0.2719, pruned_loss=0.06547, over 4724709.84 frames. ], batch size: 195, lr: 9.36e-03, grad_scale: 32.0 2023-09-29 13:26:31,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:26:31,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=378806.6666666667, ans=0.0 2023-09-29 13:26:34,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:34,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 13:26:34,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:26:36,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:26:36,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:26:39,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:26:41,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:26:42,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:43,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:26:44,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:26:44,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:26:46,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:26:48,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 13:26:57,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:26:57,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:26:57,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=378873.3333333333, ans=0.0 2023-09-29 13:26:59,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:26:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 13:26:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:04,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:04,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 13:27:08,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:10,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:27:13,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:13,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:27:14,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:27:19,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:27:19,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 13:27:19,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:27:19,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 13:27:23,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:27:25,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:27:28,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:28,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 13:27:31,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:27:31,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:27:31,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:31,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:27:35,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:27:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 13:27:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 13:27:38,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:27:38,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:27:38,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=379073.3333333333, ans=0.0 2023-09-29 13:27:40,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:27:42,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:27:47,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:27:49,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:27:51,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:27:53,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 13:27:55,184 INFO [train.py:1039] (3/4) Epoch 11, batch 3750, loss[loss=0.185, simple_loss=0.2609, pruned_loss=0.05452, over 24431.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2725, pruned_loss=0.06564, over 4722002.74 frames. ], batch size: 66, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:27:55,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:27:57,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:27:58,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 13:27:58,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:28:00,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:01,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:28:03,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:06,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:09,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:28:10,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.56 vs. limit=10.0 2023-09-29 13:28:11,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:28:11,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=379206.6666666667, ans=0.0 2023-09-29 13:28:13,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:28:16,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:18,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 13:28:20,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:20,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:21,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:28:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 13:28:28,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 13:28:30,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:28:31,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:28:33,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:28:33,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=379273.3333333333, ans=0.125 2023-09-29 13:28:39,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:42,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 13:28:44,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 13:28:48,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:28:53,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:28:53,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:28:58,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:29:02,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 13:29:02,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=379406.6666666667, ans=0.015 2023-09-29 13:29:03,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:29:05,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:29:05,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=379406.6666666667, ans=0.125 2023-09-29 13:29:07,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:29:08,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:29:09,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=379406.6666666667, ans=0.2 2023-09-29 13:29:16,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:29:18,121 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 2.004e+02 2.192e+02 2.441e+02 3.152e+02, threshold=4.385e+02, percent-clipped=0.0 2023-09-29 13:29:18,173 INFO [train.py:1039] (3/4) Epoch 11, batch 3800, loss[loss=0.2003, simple_loss=0.247, pruned_loss=0.07686, over 22669.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2718, pruned_loss=0.06583, over 4715161.73 frames. ], batch size: 322, lr: 9.35e-03, grad_scale: 32.0 2023-09-29 13:29:19,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:19,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 13:29:22,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 13:29:23,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:23,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:25,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:29:25,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=379473.3333333333, ans=0.125 2023-09-29 13:29:27,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 13:29:27,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:29,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:29:30,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:29:32,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:29:32,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:34,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 13:29:39,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 13:29:39,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:29:39,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=379540.0, ans=0.0 2023-09-29 13:29:40,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:29:40,874 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:29:41,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.51 vs. limit=15.0 2023-09-29 13:29:45,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:29:45,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:29:45,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.34 vs. limit=15.0 2023-09-29 13:29:47,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:29:47,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:49,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:29:51,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:29:56,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:29:56,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 13:29:58,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:29:59,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=379606.6666666667, ans=0.125 2023-09-29 13:30:05,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:10,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:30:12,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 13:30:12,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=379673.3333333333, ans=0.125 2023-09-29 13:30:14,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 13:30:15,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:17,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:30:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:18,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 13:30:20,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=379673.3333333333, ans=0.04949747468305833 2023-09-29 13:30:20,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=17.08 vs. limit=15.0 2023-09-29 13:30:21,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=379673.3333333333, ans=0.0 2023-09-29 13:30:23,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 13:30:23,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 13:30:24,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:26,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:30:31,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:30:33,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:30:34,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=379740.0, ans=0.1 2023-09-29 13:30:37,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=379740.0, ans=0.0 2023-09-29 13:30:39,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=25.34 vs. limit=22.5 2023-09-29 13:30:40,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:30:40,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 13:30:42,627 INFO [train.py:1039] (3/4) Epoch 11, batch 3850, loss[loss=0.2133, simple_loss=0.2931, pruned_loss=0.06671, over 24116.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2697, pruned_loss=0.066, over 4697620.10 frames. ], batch size: 80, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:30:42,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:30:42,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:30:45,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.24 vs. limit=15.0 2023-09-29 13:30:45,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.57 vs. limit=15.0 2023-09-29 13:30:48,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:30:51,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:30:54,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 13:30:56,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 13:30:58,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=379873.3333333333, ans=0.0 2023-09-29 13:31:02,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:31:06,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:06,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:31:07,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=379873.3333333333, ans=0.1 2023-09-29 13:31:08,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=379873.3333333333, ans=0.125 2023-09-29 13:31:10,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:12,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:31:12,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:12,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:31:12,950 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.54 vs. limit=10.0 2023-09-29 13:31:13,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:16,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:18,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:18,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:31:18,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 13:31:18,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 13:31:18,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=379940.0, ans=0.125 2023-09-29 13:31:19,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:31:19,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:22,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:23,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 13:31:26,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 13:31:28,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:30,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 13:31:33,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 13:31:40,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:40,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:31:45,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:31:45,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 13:31:49,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 13:31:50,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:50,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:31:55,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:31:55,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:31:55,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:31:56,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:31:56,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 13:31:58,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:32:01,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 13:32:01,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:01,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:04,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:32:05,649 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.657e+02 2.057e+02 2.295e+02 2.790e+02 3.822e+02, threshold=4.589e+02, percent-clipped=0.0 2023-09-29 13:32:05,696 INFO [train.py:1039] (3/4) Epoch 11, batch 3900, loss[loss=0.1729, simple_loss=0.2479, pruned_loss=0.04892, over 24482.00 frames. ], tot_loss[loss=0.1996, simple_loss=0.2693, pruned_loss=0.06495, over 4709123.47 frames. ], batch size: 58, lr: 9.34e-03, grad_scale: 32.0 2023-09-29 13:32:05,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:07,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:32:07,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:32:07,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:32:09,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 13:32:10,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:15,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:16,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:17,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:32:17,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:32:20,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:32:20,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:23,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:32:23,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 13:32:25,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:27,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 13:32:27,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:32:29,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 13:32:29,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 13:32:34,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=380206.6666666667, ans=0.1 2023-09-29 13:32:35,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:37,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:32:37,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:32:37,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:32:40,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:32:42,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:32:43,490 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.71 vs. limit=15.0 2023-09-29 13:32:44,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:32:44,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:32:45,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:32:49,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.64 vs. limit=22.5 2023-09-29 13:32:52,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:32:52,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:33:01,366 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.65 vs. limit=6.0 2023-09-29 13:33:02,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 13:33:04,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:33:13,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:17,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:17,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 13:33:17,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 13:33:17,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:33:20,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 13:33:21,083 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.81 vs. limit=15.0 2023-09-29 13:33:21,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:33:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 13:33:27,133 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:33:28,107 INFO [train.py:1039] (3/4) Epoch 11, batch 3950, loss[loss=0.2096, simple_loss=0.2766, pruned_loss=0.07135, over 23439.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2694, pruned_loss=0.06459, over 4710581.52 frames. ], batch size: 105, lr: 9.34e-03, grad_scale: 16.0 2023-09-29 13:33:28,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=380473.3333333333, ans=0.125 2023-09-29 13:33:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:33:31,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 13:33:31,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:33:34,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:33:36,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:33:38,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=380473.3333333333, ans=0.0 2023-09-29 13:33:45,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 13:33:45,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:45,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 13:33:47,106 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 13:33:48,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:33:51,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:51,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:33:51,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:33:54,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 13:33:57,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:33:59,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:33:59,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:34:01,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:34:01,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:34:06,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=380606.6666666667, ans=0.125 2023-09-29 13:34:12,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:34:14,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:34:19,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 13:34:23,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=380673.3333333333, ans=0.0 2023-09-29 13:34:25,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 13:34:25,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 13:34:25,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:34:27,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:34:36,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:34:36,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:34:37,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:34:37,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:34:37,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 13:34:42,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:34:43,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:34:44,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=380740.0, ans=0.2 2023-09-29 13:34:47,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 13:34:50,665 INFO [train.py:1039] (3/4) Epoch 11, batch 4000, loss[loss=0.2041, simple_loss=0.2864, pruned_loss=0.06093, over 24688.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2703, pruned_loss=0.06497, over 4715370.81 frames. ], batch size: 73, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:34:52,648 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.905e+02 2.140e+02 2.457e+02 3.925e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 13:34:58,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:02,564 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.09 vs. limit=15.0 2023-09-29 13:35:05,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:10,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:35:10,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:35:10,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 13:35:12,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:35:12,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 13:35:12,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:35:12,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 13:35:12,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=380873.3333333333, ans=0.0 2023-09-29 13:35:14,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:35:18,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:35:18,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:35:18,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:35:18,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:18,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:35:20,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:35:21,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.92 vs. limit=15.0 2023-09-29 13:35:22,433 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 13:35:22,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:35:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:26,465 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 13:35:26,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:35:26,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:29,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=380940.0, ans=0.125 2023-09-29 13:35:34,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 13:35:34,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:35:37,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:35:38,839 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 13:35:40,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:35:40,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 13:35:40,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:35:42,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:35:45,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:35:45,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:35:46,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:35:47,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 13:35:48,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:35:49,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=381006.6666666667, ans=0.2 2023-09-29 13:35:50,597 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 13:35:53,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:35:57,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 13:35:58,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=381073.3333333333, ans=0.0 2023-09-29 13:35:59,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:35:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:00,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:36:02,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:06,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:36:09,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:36:11,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 13:36:13,375 INFO [train.py:1039] (3/4) Epoch 11, batch 4050, loss[loss=0.2139, simple_loss=0.2921, pruned_loss=0.06791, over 24644.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2723, pruned_loss=0.06622, over 4716173.50 frames. ], batch size: 73, lr: 9.33e-03, grad_scale: 32.0 2023-09-29 13:36:13,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:36:13,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:14,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:36:16,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:19,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:22,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:36:25,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:36:26,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 13:36:28,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:36:28,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:36:32,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:32,712 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=381206.6666666667, ans=0.2 2023-09-29 13:36:35,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:36:38,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 13:36:40,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 13:36:40,157 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 13:36:41,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:36:47,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=381273.3333333333, ans=0.125 2023-09-29 13:36:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 13:36:52,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:36:53,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.32 vs. limit=10.0 2023-09-29 13:36:56,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:36:59,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:36:59,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:36:59,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:37:03,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:37:06,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=381340.0, ans=0.5 2023-09-29 13:37:07,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 13:37:07,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:37:09,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:12,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 13:37:14,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=381340.0, ans=0.125 2023-09-29 13:37:15,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=381340.0, ans=0.125 2023-09-29 13:37:16,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:37:26,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 13:37:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:37:26,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:37:29,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 13:37:29,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 13:37:29,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:32,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:37:34,412 INFO [train.py:1039] (3/4) Epoch 11, batch 4100, loss[loss=0.2189, simple_loss=0.2909, pruned_loss=0.07349, over 24380.00 frames. ], tot_loss[loss=0.2031, simple_loss=0.2733, pruned_loss=0.06644, over 4724190.09 frames. ], batch size: 77, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:37:34,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:34,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:37:35,983 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.675e+02 2.063e+02 2.315e+02 3.202e+02 5.550e+02, threshold=4.630e+02, percent-clipped=7.0 2023-09-29 13:37:40,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=381473.3333333333, ans=0.125 2023-09-29 13:37:41,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 13:37:43,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 13:37:44,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 13:37:46,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 13:37:46,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:48,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:37:48,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:37:49,949 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 13:37:52,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:37:54,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:37:54,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:37:55,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:37:59,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:38:01,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:38:01,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:38:01,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 13:38:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:02,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:38:02,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:02,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:38:04,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 13:38:05,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.59 vs. limit=15.0 2023-09-29 13:38:08,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:08,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 13:38:10,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:38:12,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=381606.6666666667, ans=0.125 2023-09-29 13:38:14,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:38:14,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 13:38:16,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:38:17,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:38:17,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:38:19,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 13:38:21,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:38:21,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:38:21,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=381606.6666666667, ans=0.125 2023-09-29 13:38:24,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 13:38:24,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:38:24,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:38:27,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:33,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.03 vs. limit=15.0 2023-09-29 13:38:34,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:38:37,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:39,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:38:47,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:38:47,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:38:47,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=381740.0, ans=0.125 2023-09-29 13:38:50,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:38:53,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:38:57,282 INFO [train.py:1039] (3/4) Epoch 11, batch 4150, loss[loss=0.1999, simple_loss=0.2807, pruned_loss=0.05952, over 24286.00 frames. ], tot_loss[loss=0.2034, simple_loss=0.2735, pruned_loss=0.06666, over 4726842.95 frames. ], batch size: 74, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:38:58,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:39:00,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:39:02,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:39:02,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:05,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 13:39:06,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:06,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 13:39:07,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 13:39:07,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 13:39:08,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:39:13,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=381873.3333333333, ans=0.0 2023-09-29 13:39:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:39:14,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:18,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:19,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:39:21,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:39:23,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:39:23,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:39:25,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 13:39:30,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:39:34,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:34,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 13:39:38,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 13:39:38,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:39:39,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 13:39:39,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:39:39,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:39:44,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:39:44,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:39:48,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 13:39:51,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:39:51,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:39:51,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=382006.6666666667, ans=0.125 2023-09-29 13:39:52,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 13:39:54,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:39:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 13:39:59,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:40:00,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:40:02,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:03,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 13:40:03,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:03,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 13:40:04,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 13:40:04,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=382073.3333333333, ans=0.07 2023-09-29 13:40:07,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 13:40:07,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:07,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:40:08,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:40:08,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 13:40:08,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:40:08,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 13:40:10,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:40:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:40:13,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 13:40:14,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 13:40:18,773 INFO [train.py:1039] (3/4) Epoch 11, batch 4200, loss[loss=0.2211, simple_loss=0.2804, pruned_loss=0.08085, over 23778.00 frames. ], tot_loss[loss=0.203, simple_loss=0.2726, pruned_loss=0.06669, over 4715387.79 frames. ], batch size: 164, lr: 9.32e-03, grad_scale: 32.0 2023-09-29 13:40:20,254 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.930e+02 2.193e+02 2.587e+02 4.330e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-29 13:40:20,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:40:21,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 13:40:24,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:40:26,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:26,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:40:28,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:28,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:40:28,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=382140.0, ans=0.125 2023-09-29 13:40:30,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 13:40:33,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 13:40:35,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:36,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:38,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:40:41,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 13:40:44,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:40:44,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:45,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 13:40:45,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:40:45,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=382206.6666666667, ans=0.0 2023-09-29 13:40:46,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:47,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:40:48,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:40:48,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:40:48,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=382206.6666666667, ans=0.1 2023-09-29 13:40:51,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 13:40:51,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:40:55,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 13:40:56,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:41:00,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:41:00,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:02,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=382273.3333333333, ans=0.0 2023-09-29 13:41:05,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:41:05,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 13:41:05,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:05,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:41:11,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 13:41:14,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:22,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:41:24,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 13:41:27,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:41:28,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=382406.6666666667, ans=0.2 2023-09-29 13:41:30,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:41:33,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:34,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 13:41:38,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 13:41:41,371 INFO [train.py:1039] (3/4) Epoch 11, batch 4250, loss[loss=0.2036, simple_loss=0.2805, pruned_loss=0.06333, over 24049.00 frames. ], tot_loss[loss=0.2022, simple_loss=0.2722, pruned_loss=0.06613, over 4710543.25 frames. ], batch size: 86, lr: 9.31e-03, grad_scale: 32.0 2023-09-29 13:41:44,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 13:41:44,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 13:41:48,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:41:52,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=382473.3333333333, ans=0.1 2023-09-29 13:41:54,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:41:54,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 13:41:55,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:41:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:01,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:06,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:07,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:07,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:42:07,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:08,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=15.52 vs. limit=15.0 2023-09-29 13:42:10,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:10,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:10,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:14,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:42:14,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:16,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 13:42:16,771 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.22 vs. limit=12.0 2023-09-29 13:42:20,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 13:42:20,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:22,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:42:22,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:42:23,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:42:25,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:25,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:42:29,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 13:42:31,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:42:36,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:42:38,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:38,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 13:42:39,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:42:41,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 13:42:42,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:42:44,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:42:44,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:42:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 13:42:49,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:42:51,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:42:54,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:42:56,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:42:56,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=382740.0, ans=0.1 2023-09-29 13:42:57,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:42:59,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:00,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:02,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:43:03,986 INFO [train.py:1039] (3/4) Epoch 11, batch 4300, loss[loss=0.213, simple_loss=0.2716, pruned_loss=0.07722, over 22714.00 frames. ], tot_loss[loss=0.2008, simple_loss=0.2709, pruned_loss=0.0653, over 4716022.30 frames. ], batch size: 322, lr: 9.31e-03, grad_scale: 16.0 2023-09-29 13:43:04,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:04,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 13:43:05,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:07,086 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.989e+02 2.378e+02 2.757e+02 5.301e+02, threshold=4.756e+02, percent-clipped=4.0 2023-09-29 13:43:12,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:43:12,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:17,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:43:23,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:43:23,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 13:43:25,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:43:27,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:43:27,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:43:28,827 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 13:43:32,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=382873.3333333333, ans=0.1 2023-09-29 13:43:33,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:43:33,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:43:35,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 13:43:37,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:43:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 13:43:40,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:43:42,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:43:44,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:43:44,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:43:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:43:47,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:47,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:43:47,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 13:43:48,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=382940.0, ans=0.2 2023-09-29 13:43:49,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 13:43:51,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=382940.0, ans=0.0 2023-09-29 13:43:52,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:43:53,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.75 vs. limit=22.5 2023-09-29 13:43:54,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:54,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:43:54,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:43:54,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=383006.6666666667, ans=0.05 2023-09-29 13:43:56,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:43:56,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 13:43:56,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 13:43:56,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 13:43:57,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:43:57,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 13:43:59,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 13:44:02,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:05,300 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 13:44:06,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:44:07,125 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:44:08,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:09,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:44:13,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 13:44:13,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 13:44:13,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:14,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:14,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:14,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:44:18,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:44:18,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=383073.3333333333, ans=0.125 2023-09-29 13:44:21,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:23,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:44:23,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:44:26,388 INFO [train.py:1039] (3/4) Epoch 11, batch 4350, loss[loss=0.2072, simple_loss=0.2736, pruned_loss=0.07039, over 23321.00 frames. ], tot_loss[loss=0.2019, simple_loss=0.2721, pruned_loss=0.06579, over 4718805.70 frames. ], batch size: 105, lr: 9.30e-03, grad_scale: 16.0 2023-09-29 13:44:30,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 13:44:31,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 13:44:34,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:44:37,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:40,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:44:40,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:44:45,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:44:49,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:44:51,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=383206.6666666667, ans=0.125 2023-09-29 13:44:52,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:44:52,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:44:54,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:44:57,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:44:59,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:45:00,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-09-29 13:45:07,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 13:45:07,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=383273.3333333333, ans=0.0 2023-09-29 13:45:08,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:08,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:13,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:16,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 13:45:19,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:21,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:45:25,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=383340.0, ans=0.2 2023-09-29 13:45:26,591 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 13:45:28,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:45:29,785 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 13:45:29,892 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 13:45:29,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:29,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:45:31,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:45:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:45:33,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:45:34,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:45:37,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 13:45:37,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:37,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:37,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 13:45:39,328 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 13:45:39,336 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 13:45:40,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 13:45:43,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:45:43,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:45:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:45:45,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:45:45,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 13:45:47,234 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 13:45:47,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:47,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=383473.3333333333, ans=0.125 2023-09-29 13:45:48,655 INFO [train.py:1039] (3/4) Epoch 11, batch 4400, loss[loss=0.2565, simple_loss=0.304, pruned_loss=0.1045, over 22680.00 frames. ], tot_loss[loss=0.2017, simple_loss=0.2722, pruned_loss=0.06556, over 4722368.92 frames. ], batch size: 322, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:45:50,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:45:50,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:45:51,787 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.907e+02 2.226e+02 2.866e+02 4.775e+02, threshold=4.452e+02, percent-clipped=1.0 2023-09-29 13:45:52,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:45:55,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 13:45:55,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 13:45:55,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 13:45:57,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 13:45:57,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 13:45:57,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:46:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 13:46:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:03,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.36 vs. limit=15.0 2023-09-29 13:46:04,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 13:46:08,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:08,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 13:46:09,955 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 13:46:14,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 13:46:14,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 13:46:15,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 13:46:16,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:17,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:19,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:46:20,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:20,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 13:46:20,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 13:46:22,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:23,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 13:46:24,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:46:25,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:27,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:46:27,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 13:46:27,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=383606.6666666667, ans=0.0 2023-09-29 13:46:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 13:46:31,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=383606.6666666667, ans=0.125 2023-09-29 13:46:33,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:46:40,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:46:41,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 13:46:46,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:46:48,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:46:52,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:46:52,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 13:46:54,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:46:54,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:46:54,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:46:54,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:46:59,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 13:47:00,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=383740.0, ans=0.2 2023-09-29 13:47:02,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 13:47:03,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 13:47:03,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:03,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 13:47:03,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:47:07,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:47:09,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 13:47:11,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=383806.6666666667, ans=10.0 2023-09-29 13:47:12,114 INFO [train.py:1039] (3/4) Epoch 11, batch 4450, loss[loss=0.2014, simple_loss=0.2807, pruned_loss=0.0611, over 24000.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.273, pruned_loss=0.06594, over 4723455.68 frames. ], batch size: 80, lr: 9.30e-03, grad_scale: 32.0 2023-09-29 13:47:12,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:47:16,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:17,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 13:47:18,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=383806.6666666667, ans=0.1 2023-09-29 13:47:23,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:47:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:47:26,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:26,603 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:47:29,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:47:31,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:47:31,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:32,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 13:47:32,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:34,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:47:34,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:47:35,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 13:47:38,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 13:47:39,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=383873.3333333333, ans=0.125 2023-09-29 13:47:45,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:46,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:47:48,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:47:49,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:47:49,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:47:57,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 13:47:57,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 13:47:58,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 13:47:58,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:48:02,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:02,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 13:48:06,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.25 vs. limit=22.5 2023-09-29 13:48:06,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:48:08,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=384006.6666666667, ans=0.125 2023-09-29 13:48:09,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:11,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 13:48:11,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:11,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:11,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:48:12,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.39 vs. limit=22.5 2023-09-29 13:48:12,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:48:13,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:48:16,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 13:48:16,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 13:48:19,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 13:48:21,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:48:23,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:48:23,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:25,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:48:27,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:48:30,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 13:48:32,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:48:36,510 INFO [train.py:1039] (3/4) Epoch 11, batch 4500, loss[loss=0.1983, simple_loss=0.2582, pruned_loss=0.06919, over 23680.00 frames. ], tot_loss[loss=0.2024, simple_loss=0.2727, pruned_loss=0.06609, over 4716442.31 frames. ], batch size: 232, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:48:38,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:39,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 13:48:39,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 13:48:41,277 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 2.027e+02 2.276e+02 2.770e+02 4.229e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 13:48:41,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:48:47,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:48:48,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:48:49,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 13:48:50,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:48:50,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:50,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:48:53,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=384206.6666666667, ans=0.125 2023-09-29 13:49:01,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=384206.6666666667, ans=0.2 2023-09-29 13:49:03,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:49:05,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:49:07,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.88 vs. limit=15.0 2023-09-29 13:49:08,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:09,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 13:49:09,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:49:13,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.86 vs. limit=15.0 2023-09-29 13:49:15,098 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=10.59 vs. limit=15.0 2023-09-29 13:49:17,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:49:20,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:49:25,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:49:27,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=384340.0, ans=0.125 2023-09-29 13:49:28,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:49:29,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 13:49:30,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:30,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:49:34,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:49:36,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:49:37,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 13:49:37,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 13:49:37,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:42,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:49:42,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 13:49:44,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:49:47,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 13:49:47,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:49:50,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 13:49:51,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 13:49:51,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 13:49:56,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 13:49:58,055 INFO [train.py:1039] (3/4) Epoch 11, batch 4550, loss[loss=0.196, simple_loss=0.2357, pruned_loss=0.0782, over 19701.00 frames. ], tot_loss[loss=0.2013, simple_loss=0.2718, pruned_loss=0.06545, over 4716935.04 frames. ], batch size: 388, lr: 9.29e-03, grad_scale: 16.0 2023-09-29 13:49:58,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 13:49:59,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:03,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:04,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:50:07,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:10,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:50:14,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:50:17,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:17,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:50:17,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:20,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:50:21,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:50:24,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:50:27,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 13:50:27,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 13:50:29,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 13:50:32,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 13:50:34,498 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.43 vs. limit=15.0 2023-09-29 13:50:35,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 13:50:35,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:40,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 13:50:42,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:50:45,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:45,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 13:50:49,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 13:50:51,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:50:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:50:52,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:50:53,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=6.43 vs. limit=10.0 2023-09-29 13:50:55,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:50:57,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 13:50:57,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 13:50:58,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:50:58,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 13:51:01,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 13:51:01,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:51:03,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:03,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:04,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:04,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:51:05,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 13:51:06,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 13:51:08,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:51:08,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:51:08,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 13:51:08,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:51:08,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 13:51:11,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=384740.0, ans=0.125 2023-09-29 13:51:11,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.62 vs. limit=15.0 2023-09-29 13:51:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:51:12,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:51:14,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:51:14,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:51:14,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 13:51:17,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-29 13:51:18,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:51:18,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 13:51:20,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.08 vs. limit=12.0 2023-09-29 13:51:21,626 INFO [train.py:1039] (3/4) Epoch 11, batch 4600, loss[loss=0.1906, simple_loss=0.2768, pruned_loss=0.05219, over 24561.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2704, pruned_loss=0.06503, over 4717896.39 frames. ], batch size: 71, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:51:21,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:23,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:51:25,952 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.102e+02 2.367e+02 2.907e+02 4.657e+02, threshold=4.735e+02, percent-clipped=1.0 2023-09-29 13:51:26,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:51:26,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:51:27,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:29,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 13:51:30,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:51:35,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:51:36,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:40,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:47,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 13:51:48,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:51,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:51:55,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:51:55,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:51:57,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.15 vs. limit=15.0 2023-09-29 13:52:01,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 13:52:01,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:52:02,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:07,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:07,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:52:08,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 13:52:12,922 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.93 vs. limit=22.5 2023-09-29 13:52:13,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 13:52:13,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 13:52:20,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:20,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:52:22,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:22,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 13:52:22,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:23,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 13:52:23,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:25,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:27,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:52:27,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:52:29,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:30,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 13:52:30,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 13:52:32,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 13:52:32,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:33,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:35,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:35,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:52:42,923 INFO [train.py:1039] (3/4) Epoch 11, batch 4650, loss[loss=0.197, simple_loss=0.2794, pruned_loss=0.05729, over 24653.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2706, pruned_loss=0.0644, over 4734325.25 frames. ], batch size: 68, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:52:46,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:52:49,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:52:50,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:50,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:52:52,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:52:52,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:52:52,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:52:58,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 13:53:01,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:53:03,859 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.73 vs. limit=15.0 2023-09-29 13:53:05,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 13:53:05,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:53:05,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 13:53:05,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:53:06,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 13:53:06,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 13:53:06,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:06,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 13:53:09,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 13:53:11,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:11,546 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 13:53:14,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:16,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 13:53:19,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:19,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:53:20,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 13:53:22,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:53:25,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:53:29,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:34,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:37,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:53:40,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:53:40,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 13:53:41,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 13:53:41,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 13:53:43,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 13:53:43,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 13:53:45,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:53:50,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.59 vs. limit=15.0 2023-09-29 13:53:51,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:53:51,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:53:51,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 13:53:51,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:53:52,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:52,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:53:54,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 13:53:57,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 13:53:57,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:53:59,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:54:02,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:02,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:54:04,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 13:54:04,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 13:54:06,321 INFO [train.py:1039] (3/4) Epoch 11, batch 4700, loss[loss=0.2162, simple_loss=0.2765, pruned_loss=0.0779, over 23627.00 frames. ], tot_loss[loss=0.1999, simple_loss=0.2712, pruned_loss=0.06426, over 4739218.50 frames. ], batch size: 256, lr: 9.28e-03, grad_scale: 16.0 2023-09-29 13:54:06,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 13:54:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 13:54:08,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=385473.3333333333, ans=0.125 2023-09-29 13:54:11,830 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.872e+02 2.042e+02 2.233e+02 3.363e+02, threshold=4.084e+02, percent-clipped=0.0 2023-09-29 13:54:13,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:15,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:54:16,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:54:18,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:20,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 13:54:25,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 13:54:27,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 13:54:30,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:30,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:54:31,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:54:37,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:54:37,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=385606.6666666667, ans=0.0 2023-09-29 13:54:44,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:54:45,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 13:54:47,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:54:49,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=385606.6666666667, ans=0.125 2023-09-29 13:54:55,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 13:54:55,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 13:54:58,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:00,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 13:55:03,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:07,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:55:07,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 13:55:08,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:08,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:11,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=385740.0, ans=0.2 2023-09-29 13:55:12,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:55:12,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:55:12,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 13:55:12,922 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 13:55:14,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:14,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:14,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 13:55:16,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:55:22,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 13:55:26,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:55:27,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:29,199 INFO [train.py:1039] (3/4) Epoch 11, batch 4750, loss[loss=0.1912, simple_loss=0.2712, pruned_loss=0.05563, over 24558.00 frames. ], tot_loss[loss=0.1995, simple_loss=0.2715, pruned_loss=0.06373, over 4747463.06 frames. ], batch size: 71, lr: 9.27e-03, grad_scale: 16.0 2023-09-29 13:55:31,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=385806.6666666667, ans=0.0 2023-09-29 13:55:32,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:33,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:55:35,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 13:55:35,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:55:35,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=385806.6666666667, ans=0.125 2023-09-29 13:55:39,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 13:55:40,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 13:55:40,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:55:43,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:43,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=385806.6666666667, ans=0.125 2023-09-29 13:55:47,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 13:55:51,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 13:55:53,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 13:55:54,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:55:55,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=385873.3333333333, ans=0.0 2023-09-29 13:55:56,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:55:56,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:55:57,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 13:55:57,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 13:56:04,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 13:56:09,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:12,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:14,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:56:14,486 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 13:56:14,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:18,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 13:56:21,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 13:56:21,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 13:56:23,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 13:56:23,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:56:23,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:56:24,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:56:24,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 13:56:26,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 13:56:28,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 13:56:31,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:33,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:56:33,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 13:56:35,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:56:36,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:56:38,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 13:56:39,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:39,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 13:56:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:44,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 13:56:44,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 13:56:46,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 13:56:48,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 13:56:49,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:56:50,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 13:56:52,872 INFO [train.py:1039] (3/4) Epoch 11, batch 4800, loss[loss=0.2276, simple_loss=0.284, pruned_loss=0.08561, over 23640.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2724, pruned_loss=0.06468, over 4741948.65 frames. ], batch size: 232, lr: 9.27e-03, grad_scale: 32.0 2023-09-29 13:56:54,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:56:54,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:56:55,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=386140.0, ans=0.1 2023-09-29 13:56:57,598 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.978e+02 2.285e+02 2.567e+02 3.711e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 13:57:00,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 13:57:01,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:01,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:02,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 13:57:03,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:57:03,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.24 vs. limit=12.0 2023-09-29 13:57:04,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 13:57:05,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 13:57:09,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:12,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:12,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 13:57:14,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:14,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 13:57:15,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:17,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:18,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:57:22,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 13:57:25,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:57:26,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 13:57:28,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:30,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 13:57:30,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 13:57:31,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:31,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:57:31,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=386273.3333333333, ans=0.125 2023-09-29 13:57:33,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:57:33,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:33,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 13:57:36,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 13:57:36,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 13:57:41,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:41,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:43,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:57:48,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 13:57:48,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:57:49,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:49,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:57:49,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:57:54,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:57:56,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 13:57:56,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:57:56,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 13:57:57,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 13:57:58,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 13:57:59,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=386406.6666666667, ans=0.125 2023-09-29 13:58:02,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:02,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:02,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:58:03,122 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.69 vs. limit=22.5 2023-09-29 13:58:04,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 13:58:06,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=386406.6666666667, ans=10.0 2023-09-29 13:58:07,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 13:58:07,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:07,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:58:08,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=12.0 2023-09-29 13:58:08,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:08,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:12,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:58:14,046 INFO [train.py:1039] (3/4) Epoch 11, batch 4850, loss[loss=0.1911, simple_loss=0.2524, pruned_loss=0.06492, over 23563.00 frames. ], tot_loss[loss=0.2011, simple_loss=0.2725, pruned_loss=0.06485, over 4734843.13 frames. ], batch size: 256, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:58:22,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 13:58:24,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:27,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:27,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 13:58:27,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:58:33,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 13:58:33,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 13:58:34,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 13:58:34,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 13:58:37,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=386540.0, ans=0.1 2023-09-29 13:58:40,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:58:43,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 13:58:43,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 13:58:45,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 13:58:45,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 13:58:47,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 13:58:47,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:58:52,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 13:58:52,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 13:58:53,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 13:58:59,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=386606.6666666667, ans=0.125 2023-09-29 13:59:00,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 13:59:00,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 13:59:01,113 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 13:59:01,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.69 vs. limit=22.5 2023-09-29 13:59:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 13:59:02,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 13:59:06,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 13:59:08,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 13:59:08,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:09,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 13:59:09,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:11,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:12,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 13:59:22,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 13:59:27,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.32 vs. limit=6.0 2023-09-29 13:59:30,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 13:59:30,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:36,685 INFO [train.py:1039] (3/4) Epoch 11, batch 4900, loss[loss=0.2007, simple_loss=0.2431, pruned_loss=0.07916, over 19017.00 frames. ], tot_loss[loss=0.2, simple_loss=0.2706, pruned_loss=0.06464, over 4729799.92 frames. ], batch size: 390, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 13:59:36,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 13:59:36,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 13:59:37,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.26 vs. limit=15.0 2023-09-29 13:59:42,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 13:59:42,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.93 vs. limit=15.0 2023-09-29 13:59:43,964 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.984e+02 2.247e+02 2.564e+02 4.606e+02, threshold=4.494e+02, percent-clipped=1.0 2023-09-29 13:59:44,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:44,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 13:59:44,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=386806.6666666667, ans=0.0 2023-09-29 13:59:46,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=386806.6666666667, ans=0.0 2023-09-29 13:59:47,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 13:59:51,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 13:59:56,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 13:59:57,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 13:59:58,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 13:59:58,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 13:59:58,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 13:59:58,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 13:59:58,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 13:59:58,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=386873.3333333333, ans=0.1 2023-09-29 14:00:00,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 14:00:03,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 14:00:04,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:00:06,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:00:08,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:00:09,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:00:09,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:11,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:11,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 14:00:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:00:14,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:00:14,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=386940.0, ans=0.0 2023-09-29 14:00:16,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 14:00:16,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 14:00:19,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 14:00:21,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:00:21,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:00:21,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:00:21,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=386940.0, ans=0.125 2023-09-29 14:00:21,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=386940.0, ans=0.125 2023-09-29 14:00:23,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:00:23,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:00:23,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:00:24,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 14:00:27,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:29,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:00:31,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:00:34,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 14:00:34,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:00:36,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:00:37,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 14:00:45,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:47,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:00:49,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 14:00:49,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:00:49,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:00:54,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:00:57,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:00:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:00:59,087 INFO [train.py:1039] (3/4) Epoch 11, batch 4950, loss[loss=0.1998, simple_loss=0.28, pruned_loss=0.05974, over 24478.00 frames. ], tot_loss[loss=0.199, simple_loss=0.2698, pruned_loss=0.06407, over 4745056.65 frames. ], batch size: 69, lr: 9.26e-03, grad_scale: 16.0 2023-09-29 14:00:59,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:00:59,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 14:00:59,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=387140.0, ans=0.125 2023-09-29 14:01:00,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:01:03,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:01:07,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=387140.0, ans=0.0 2023-09-29 14:01:08,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 14:01:08,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 14:01:10,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:01:10,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 14:01:10,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:01:11,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:01:11,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:14,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:01:16,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:01:17,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:01:20,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:20,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:01:23,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=387206.6666666667, ans=0.0 2023-09-29 14:01:24,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:01:29,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:32,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:01:33,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:33,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:36,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:01:37,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 14:01:38,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 14:01:42,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:43,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:01:43,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:01:43,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:01:45,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:01:45,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:01:48,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:01:49,403 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=387340.0, ans=15.0 2023-09-29 14:01:50,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:01:51,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:01:53,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:01:53,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:01:55,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 14:01:55,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:01:56,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:02:01,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:03,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:02:03,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:02:03,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:05,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:02:05,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:02:08,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:02:09,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:02:09,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:02:10,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 14:02:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:19,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=387473.3333333333, ans=0.125 2023-09-29 14:02:21,060 INFO [train.py:1039] (3/4) Epoch 11, batch 5000, loss[loss=0.2286, simple_loss=0.3013, pruned_loss=0.07798, over 23970.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2697, pruned_loss=0.06396, over 4745262.55 frames. ], batch size: 86, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:02:21,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 14:02:21,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:02:27,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:02:27,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:29,463 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.930e+02 2.192e+02 2.539e+02 4.135e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 14:02:29,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 14:02:29,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 14:02:31,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:02:35,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 14:02:37,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:02:37,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:02:37,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 14:02:37,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:37,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=387540.0, ans=0.0 2023-09-29 14:02:38,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:02:38,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 14:02:38,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:40,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:02:40,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 14:02:41,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 14:02:41,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:02:42,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=387540.0, ans=0.0 2023-09-29 14:02:42,552 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=6.0 2023-09-29 14:02:43,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 14:02:43,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:02:43,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:43,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:02:43,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 14:02:43,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 14:02:45,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=387540.0, ans=0.125 2023-09-29 14:02:47,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 14:02:47,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:02:47,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:50,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 14:02:50,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:02:50,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:02:51,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:02:53,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:02:55,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 14:02:56,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:02:57,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.86 vs. limit=15.0 2023-09-29 14:02:57,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:03:01,834 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 14:03:05,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:03:06,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:03:06,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:07,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.73 vs. limit=22.5 2023-09-29 14:03:09,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 14:03:09,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=387606.6666666667, ans=0.125 2023-09-29 14:03:10,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:03:10,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:10,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:13,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 14:03:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:03:19,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:21,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.41 vs. limit=15.0 2023-09-29 14:03:25,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 14:03:28,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:39,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:03:40,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:40,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:03:40,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:40,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:03:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:03:41,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=387740.0, ans=0.07 2023-09-29 14:03:42,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:45,667 INFO [train.py:1039] (3/4) Epoch 11, batch 5050, loss[loss=0.2087, simple_loss=0.2702, pruned_loss=0.07361, over 23836.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.2712, pruned_loss=0.06531, over 4728009.76 frames. ], batch size: 195, lr: 9.25e-03, grad_scale: 8.0 2023-09-29 14:03:47,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:03:47,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 14:03:48,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:03:51,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:03:51,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:03:52,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 14:03:54,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:03:54,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:03:57,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:03:58,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:03:59,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:04:08,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 14:04:08,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:04:09,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:11,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 14:04:11,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:14,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:14,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:04:16,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:04:16,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 14:04:16,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 14:04:17,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:19,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:22,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:04:24,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 14:04:26,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:29,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 14:04:30,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:04:32,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:04:32,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:33,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:04:35,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:04:37,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:04:38,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:38,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:04:38,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:04:39,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 14:04:40,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:04:41,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:04:46,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:04:46,683 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 14:04:46,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:04:48,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:04:50,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:50,483 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 14:04:54,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:04:54,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 14:04:54,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:04:58,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:04:58,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 14:04:59,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.68 vs. limit=15.0 2023-09-29 14:05:02,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 14:05:05,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:05,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:06,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:05:08,071 INFO [train.py:1039] (3/4) Epoch 11, batch 5100, loss[loss=0.2009, simple_loss=0.2859, pruned_loss=0.05796, over 23994.00 frames. ], tot_loss[loss=0.2003, simple_loss=0.271, pruned_loss=0.0648, over 4732691.32 frames. ], batch size: 80, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:05:08,246 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 14:05:11,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:05:14,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 14:05:14,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 14:05:15,752 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.127e+02 2.436e+02 3.285e+02, threshold=4.254e+02, percent-clipped=0.0 2023-09-29 14:05:15,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:17,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:05:20,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:05:20,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=388140.0, ans=0.125 2023-09-29 14:05:22,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 14:05:22,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 14:05:27,058 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.65 vs. limit=15.0 2023-09-29 14:05:29,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:05:30,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:05:33,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:05:36,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 14:05:36,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:38,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:05:38,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 14:05:41,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:41,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 14:05:44,721 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 14:05:46,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:05:46,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 14:05:46,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 14:05:49,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:05:59,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:02,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 14:06:02,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 14:06:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 14:06:05,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 14:06:05,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:06:08,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 14:06:08,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=388340.0, ans=0.2 2023-09-29 14:06:12,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 14:06:16,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 14:06:17,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:06:19,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 14:06:20,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.70 vs. limit=10.0 2023-09-29 14:06:21,839 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.30 vs. limit=22.5 2023-09-29 14:06:22,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:06:23,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 14:06:24,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=388406.6666666667, ans=0.2 2023-09-29 14:06:26,780 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.53 vs. limit=15.0 2023-09-29 14:06:27,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:06:28,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:06:28,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:06:30,038 INFO [train.py:1039] (3/4) Epoch 11, batch 5150, loss[loss=0.1776, simple_loss=0.2637, pruned_loss=0.0457, over 24456.00 frames. ], tot_loss[loss=0.201, simple_loss=0.272, pruned_loss=0.06506, over 4730406.11 frames. ], batch size: 63, lr: 9.24e-03, grad_scale: 8.0 2023-09-29 14:06:30,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:06:30,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:06:30,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:06:30,960 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.45 vs. limit=12.0 2023-09-29 14:06:32,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 14:06:32,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 14:06:32,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 14:06:34,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:06:34,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 14:06:35,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:37,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:06:38,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:40,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:06:45,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:06:45,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 14:06:47,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:06:47,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:06:49,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:06:49,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:06:49,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:06:50,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:06:50,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:06:52,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 14:06:53,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.57 vs. limit=15.0 2023-09-29 14:06:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:06:54,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:06:55,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:06:58,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 14:06:58,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:07:05,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:07:05,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 14:07:12,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:07:17,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:19,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:23,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:23,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:23,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=388673.3333333333, ans=0.125 2023-09-29 14:07:26,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 14:07:29,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:07:30,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:07:30,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:07:33,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:07:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:07:35,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 14:07:40,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.05 vs. limit=15.0 2023-09-29 14:07:40,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:07:42,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:07:46,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:07:47,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:07:49,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:07:49,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:07:49,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:07:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:07:49,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=388740.0, ans=0.125 2023-09-29 14:07:53,755 INFO [train.py:1039] (3/4) Epoch 11, batch 5200, loss[loss=0.1935, simple_loss=0.2576, pruned_loss=0.06467, over 23637.00 frames. ], tot_loss[loss=0.2015, simple_loss=0.2724, pruned_loss=0.0653, over 4736458.26 frames. ], batch size: 149, lr: 9.24e-03, grad_scale: 16.0 2023-09-29 14:07:53,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:07:55,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:07:57,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:02,216 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.626e+02 2.017e+02 2.564e+02 3.234e+02 5.917e+02, threshold=5.129e+02, percent-clipped=10.0 2023-09-29 14:08:02,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 14:08:02,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:08:03,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:08,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:08,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:08:08,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:10,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=388873.3333333333, ans=0.125 2023-09-29 14:08:11,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 14:08:13,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:08:15,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:16,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 14:08:19,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:08:20,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:08:22,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 14:08:22,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 14:08:25,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 14:08:25,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:08:25,178 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 14:08:26,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:08:28,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:28,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:08:28,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 14:08:29,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:08:32,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:35,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 14:08:36,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 14:08:36,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 14:08:41,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 14:08:41,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:08:44,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=389006.6666666667, ans=0.125 2023-09-29 14:08:48,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:08:49,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:08:51,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 14:08:51,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:08:52,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:08:52,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:08:52,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:08:57,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:08:57,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:09:01,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:09:02,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:02,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:09,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:10,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 14:09:12,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:09:12,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:09:14,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:14,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:09:15,393 INFO [train.py:1039] (3/4) Epoch 11, batch 5250, loss[loss=0.184, simple_loss=0.2663, pruned_loss=0.05082, over 24469.00 frames. ], tot_loss[loss=0.2012, simple_loss=0.2718, pruned_loss=0.06526, over 4724930.91 frames. ], batch size: 63, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:09:17,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:09:17,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:09:20,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:20,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:09:22,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:09:27,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:09:29,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:09:31,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=389206.6666666667, ans=0.125 2023-09-29 14:09:32,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:09:35,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:09:37,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 14:09:37,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:09:39,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:09:43,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.57 vs. limit=15.0 2023-09-29 14:09:43,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=389206.6666666667, ans=0.0 2023-09-29 14:09:46,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=389273.3333333333, ans=0.07 2023-09-29 14:09:53,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.40 vs. limit=22.5 2023-09-29 14:10:29,779 INFO [train.py:1039] (3/4) Epoch 11, batch 5300, loss[loss=0.1781, simple_loss=0.2202, pruned_loss=0.06797, over 18840.00 frames. ], tot_loss[loss=0.2001, simple_loss=0.2704, pruned_loss=0.06492, over 4719786.26 frames. ], batch size: 388, lr: 9.23e-03, grad_scale: 16.0 2023-09-29 14:10:30,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=389473.3333333333, ans=0.125 2023-09-29 14:10:31,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=389473.3333333333, ans=0.125 2023-09-29 14:10:34,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=389473.3333333333, ans=0.2 2023-09-29 14:10:36,667 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 2.030e+02 2.219e+02 2.602e+02 3.750e+02, threshold=4.437e+02, percent-clipped=0.0 2023-09-29 14:10:44,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:10:45,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 14:10:45,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 14:10:45,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:45,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:45,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:45,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:10:45,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:45,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:10:46,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:10:46,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 14:10:46,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 14:10:46,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 14:10:46,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:10:46,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 14:10:47,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 14:10:47,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:47,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:48,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:48,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:48,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:10:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:49,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:10:49,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:49,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:10:49,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:10:49,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:10:49,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:49,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:10:50,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 14:10:50,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:10:50,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:10:50,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 14:10:50,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 14:10:51,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:10:51,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:10:51,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 14:10:51,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 14:10:51,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:10:52,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:10:52,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:10:53,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 14:10:53,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 14:10:53,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:10:53,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:10:53,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 14:10:53,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 14:10:53,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 14:10:54,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:11:03,755 INFO [train.py:1039] (3/4) Epoch 12, batch 0, loss[loss=0.1774, simple_loss=0.2518, pruned_loss=0.05149, over 24445.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2518, pruned_loss=0.05149, over 24445.00 frames. ], batch size: 58, lr: 8.84e-03, grad_scale: 32.0 2023-09-29 14:11:03,756 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 14:11:19,091 INFO [train.py:1071] (3/4) Epoch 12, validation: loss=0.305, simple_loss=0.2807, pruned_loss=0.1647, over 1125622.00 frames. 2023-09-29 14:11:19,092 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 14:11:23,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 14:11:25,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:11:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:11:30,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:30,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:11:30,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:31,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 14:11:33,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 14:11:34,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:36,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:38,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=389620.0, ans=0.2 2023-09-29 14:11:40,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:11:40,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:40,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:11:40,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:40,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=389620.0, ans=0.2 2023-09-29 14:11:40,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.53 vs. limit=15.0 2023-09-29 14:11:41,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 14:11:43,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:11:52,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:11:52,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:11:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 14:11:57,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:11:57,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:12:00,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:05,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:12:08,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:09,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.74 vs. limit=22.5 2023-09-29 14:12:15,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 14:12:18,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 14:12:18,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:18,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:19,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:12:19,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:12:22,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 14:12:23,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=389820.0, ans=0.125 2023-09-29 14:12:24,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:26,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:12:31,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:12:33,290 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 14:12:36,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:12:39,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:40,564 INFO [train.py:1039] (3/4) Epoch 12, batch 50, loss[loss=0.1695, simple_loss=0.245, pruned_loss=0.047, over 24566.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2682, pruned_loss=0.0609, over 1073033.00 frames. ], batch size: 60, lr: 8.84e-03, grad_scale: 16.0 2023-09-29 14:12:42,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:12:42,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 14:12:42,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:12:42,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:12:44,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:46,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:12:47,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:12:52,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 14:12:52,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:12:57,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:12:58,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 14:13:01,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 14:13:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:13:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:04,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:05,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:07,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:13:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:13:07,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:13:13,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:15,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:16,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:13:17,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 14:13:20,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:13:20,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:13:20,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 14:13:22,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:23,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 14:13:33,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:13:33,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:13:34,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.37 vs. limit=10.0 2023-09-29 14:13:35,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:35,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=390086.6666666667, ans=0.0 2023-09-29 14:13:36,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:13:36,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:40,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 14:13:40,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 14:13:42,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:13:42,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:13:43,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:13:43,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:13:45,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 14:13:45,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 14:13:48,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 14:13:49,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:49,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:13:49,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 14:13:49,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 14:13:51,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:13:53,162 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 2.076e+02 2.460e+02 3.514e+02 7.647e+02, threshold=4.919e+02, percent-clipped=15.0 2023-09-29 14:13:53,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:13:54,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:13:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:13:58,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:13:58,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=390153.3333333333, ans=0.09899494936611666 2023-09-29 14:14:01,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:14:02,865 INFO [train.py:1039] (3/4) Epoch 12, batch 100, loss[loss=0.1851, simple_loss=0.2711, pruned_loss=0.04953, over 24657.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2718, pruned_loss=0.06316, over 1891345.98 frames. ], batch size: 68, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:14:04,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:06,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 14:14:06,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:14:10,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:14:11,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:14:11,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:14:11,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:14:12,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=390220.0, ans=0.2 2023-09-29 14:14:15,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 14:14:16,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:14:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:18,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:18,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:14:20,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=390286.6666666667, ans=0.2 2023-09-29 14:14:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 14:14:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:26,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:14:26,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:14:28,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:14:30,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=390286.6666666667, ans=0.125 2023-09-29 14:14:30,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.03 vs. limit=22.5 2023-09-29 14:14:31,505 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 14:14:31,531 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 14:14:34,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:14:34,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:14:36,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:14:39,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:14:41,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:49,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:14:51,027 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 14:14:53,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:14:54,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer_ff3.min_abs, batch_count=390420.0, ans=0.2 2023-09-29 14:14:57,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:14:57,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:02,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:07,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:15:10,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:12,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:13,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:13,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:15:13,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:13,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 14:15:13,769 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 14:15:13,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:14,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=390486.6666666667, ans=0.125 2023-09-29 14:15:15,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:15:15,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:15,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:15,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=390486.6666666667, ans=0.125 2023-09-29 14:15:17,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 14:15:17,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:15:17,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:15:17,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:19,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:20,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:22,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:15:22,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:15:25,207 INFO [train.py:1039] (3/4) Epoch 12, batch 150, loss[loss=0.1847, simple_loss=0.2703, pruned_loss=0.04955, over 24481.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2727, pruned_loss=0.06429, over 2516961.68 frames. ], batch size: 66, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:15:25,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:15:30,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:15:30,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:15:30,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:34,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:15:34,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:36,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:15:38,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:43,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 14:15:43,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 14:15:43,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 14:15:46,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:15:46,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:15:48,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:15:49,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:15:49,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:15:49,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:49,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:15:51,207 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 14:15:54,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:00,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:05,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:16:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 14:16:09,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:16:09,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:16:09,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:12,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:16:13,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.92 vs. limit=15.0 2023-09-29 14:16:14,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:16:15,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:16:18,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:19,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 14:16:19,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=390753.3333333333, ans=0.0 2023-09-29 14:16:21,434 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:16:24,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:25,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:25,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:16:25,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:16:29,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:16:30,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 14:16:34,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:16:35,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:16:36,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:37,591 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.933e+02 2.162e+02 2.482e+02 3.211e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-29 14:16:39,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:16:41,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 14:16:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:16:41,299 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 14:16:44,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:16:47,360 INFO [train.py:1039] (3/4) Epoch 12, batch 200, loss[loss=0.2, simple_loss=0.2699, pruned_loss=0.06502, over 24676.00 frames. ], tot_loss[loss=0.2007, simple_loss=0.2729, pruned_loss=0.06428, over 3013970.40 frames. ], batch size: 65, lr: 8.83e-03, grad_scale: 16.0 2023-09-29 14:16:50,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:16:50,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:16:52,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 14:16:53,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:16:53,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:16:57,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 14:16:57,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:16:58,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:00,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:04,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:17:04,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:17:04,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:07,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=390953.3333333333, ans=0.0 2023-09-29 14:17:25,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:17:26,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:17:26,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:17:28,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:17:28,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:17:30,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:17:32,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:17:33,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:17:34,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:17:35,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:17:37,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 14:17:39,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 14:17:39,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:17:44,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:17:50,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:18:00,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:00,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:18:07,858 INFO [train.py:1039] (3/4) Epoch 12, batch 250, loss[loss=0.2003, simple_loss=0.2832, pruned_loss=0.05872, over 24081.00 frames. ], tot_loss[loss=0.2006, simple_loss=0.2722, pruned_loss=0.06445, over 3385056.79 frames. ], batch size: 80, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:18:07,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:10,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 14:18:11,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:18:11,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:11,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:18:13,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 14:18:13,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:18:15,599 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 14:18:15,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:17,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:18:17,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=391220.0, ans=0.0 2023-09-29 14:18:18,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:20,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:18:20,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=391220.0, ans=0.125 2023-09-29 14:18:21,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:18:21,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:18:22,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=391220.0, ans=0.1 2023-09-29 14:18:23,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:18:30,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:18:40,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:18:42,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:18:43,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:18:49,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:18:50,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:18:50,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:18:52,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:52,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:18:52,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:18:54,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:18:57,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:19:00,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 14:19:01,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:19:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:19:04,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:19:04,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:19:05,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:05,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:19:05,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:19:09,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:09,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:19:09,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=391420.0, ans=0.025 2023-09-29 14:19:10,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:14,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:19:17,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:19,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:19:21,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=391486.6666666667, ans=0.0 2023-09-29 14:19:22,366 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.950e+02 2.182e+02 2.665e+02 5.527e+02, threshold=4.363e+02, percent-clipped=2.0 2023-09-29 14:19:25,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:26,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:19:29,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 14:19:31,301 INFO [train.py:1039] (3/4) Epoch 12, batch 300, loss[loss=0.1931, simple_loss=0.2755, pruned_loss=0.0554, over 24635.00 frames. ], tot_loss[loss=0.1991, simple_loss=0.2707, pruned_loss=0.06379, over 3675358.29 frames. ], batch size: 68, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:19:32,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:19:33,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:19:34,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 14:19:35,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:19:36,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:19:36,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 14:19:40,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=391553.3333333333, ans=0.125 2023-09-29 14:19:41,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:19:43,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:19:46,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:19:46,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=391620.0, ans=0.1 2023-09-29 14:19:48,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 14:19:49,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:19:51,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:19:51,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 14:19:51,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:19:56,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:19:59,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:20:00,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=391620.0, ans=0.125 2023-09-29 14:20:01,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 14:20:02,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 14:20:04,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:04,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:09,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:09,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 14:20:09,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:20:12,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:20:14,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:20:14,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:19,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:20:19,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 14:20:21,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:20:24,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:26,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 14:20:27,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:32,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:20:34,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:20:34,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 14:20:38,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:38,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:20:40,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:20:43,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 14:20:44,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:20:44,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:20:45,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 14:20:47,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:20:47,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:20:50,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:20:50,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:20:54,292 INFO [train.py:1039] (3/4) Epoch 12, batch 350, loss[loss=0.1904, simple_loss=0.2628, pruned_loss=0.059, over 23478.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2685, pruned_loss=0.06281, over 3909344.56 frames. ], batch size: 106, lr: 8.82e-03, grad_scale: 16.0 2023-09-29 14:20:55,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:20:55,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 14:20:59,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:04,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=391886.6666666667, ans=0.2 2023-09-29 14:21:07,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:21:10,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:10,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:12,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 14:21:13,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:15,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 14:21:17,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:17,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 14:21:18,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:22,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 14:21:23,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:21:27,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:21:29,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:21:29,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:29,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:21:30,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:21:30,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:30,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:21:32,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:21:32,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:40,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:21:40,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:21:40,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:21:42,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:46,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 14:21:46,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:21:52,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:21:52,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:21:53,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:21:55,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 14:21:57,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:21:57,182 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 14:22:00,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 14:22:00,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:05,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:22:05,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 14:22:07,471 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.628e+02 1.933e+02 2.139e+02 2.473e+02 3.749e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-29 14:22:07,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:09,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:22:10,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:10,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:10,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:13,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:22:16,742 INFO [train.py:1039] (3/4) Epoch 12, batch 400, loss[loss=0.2116, simple_loss=0.2521, pruned_loss=0.08554, over 19117.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2687, pruned_loss=0.06278, over 4096415.01 frames. ], batch size: 388, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:22:17,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:22:18,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:22:20,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 14:22:20,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:20,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:23,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:22:23,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:26,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:29,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:31,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 14:22:31,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=392286.6666666667, ans=0.1 2023-09-29 14:22:31,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=392286.6666666667, ans=0.125 2023-09-29 14:22:32,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 14:22:32,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:37,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 14:22:37,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:40,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:22:40,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:40,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 14:22:41,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:22:41,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:22:41,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:22:43,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:22:45,008 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 14:22:46,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 14:22:51,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:22:52,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:22:52,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=392353.3333333333, ans=0.125 2023-09-29 14:22:54,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 14:22:55,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 14:22:57,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:23:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:08,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 14:23:12,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:23:14,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 14:23:15,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:23:18,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:23:18,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 14:23:21,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:23:24,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:23:26,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:23:27,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:23:27,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 14:23:29,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:23:31,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 14:23:34,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:23:34,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:23:36,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 14:23:37,564 INFO [train.py:1039] (3/4) Epoch 12, batch 450, loss[loss=0.1864, simple_loss=0.2677, pruned_loss=0.05256, over 24480.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2696, pruned_loss=0.06322, over 4223053.55 frames. ], batch size: 66, lr: 8.81e-03, grad_scale: 32.0 2023-09-29 14:23:39,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:23:39,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:23:39,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:23:42,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 14:23:42,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:23:44,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:23:46,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:23:46,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 14:23:47,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:23:48,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:23:51,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:23:56,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=392620.0, ans=0.125 2023-09-29 14:24:00,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:00,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:02,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 14:24:03,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 14:24:08,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:24:11,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:13,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:16,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:16,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:24:20,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 14:24:22,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 14:24:23,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 14:24:23,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:24:25,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:24:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:24:27,093 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 14:24:27,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 14:24:28,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:24:30,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:24:31,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:24:32,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.14 vs. limit=12.0 2023-09-29 14:24:33,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=392753.3333333333, ans=0.125 2023-09-29 14:24:34,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:24:34,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:24:36,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:24:36,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 14:24:39,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:40,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:24:40,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:24:43,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 14:24:48,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:24:48,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 14:24:50,224 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.154e+02 2.453e+02 3.354e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 14:24:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 14:24:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:24:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:24:59,748 INFO [train.py:1039] (3/4) Epoch 12, batch 500, loss[loss=0.1691, simple_loss=0.2369, pruned_loss=0.05071, over 15118.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2699, pruned_loss=0.06343, over 4323225.49 frames. ], batch size: 31, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:24:59,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:01,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:25:02,785 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 14:25:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:07,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:25:08,273 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-09-29 14:25:08,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:08,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 14:25:10,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 14:25:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:25:13,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:25:18,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 14:25:20,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:25:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:25:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:25:21,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:28,514 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:25:33,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:33,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:25:34,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:25:34,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:34,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 14:25:34,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:25:38,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:25:38,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:25:38,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=393020.0, ans=0.0 2023-09-29 14:25:39,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:25:39,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:25:39,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 14:25:43,907 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 14:25:45,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:25:47,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:47,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:25:49,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:25:49,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=393086.6666666667, ans=0.0 2023-09-29 14:25:50,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 14:25:55,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:25:56,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:01,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:04,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:26:09,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:12,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 14:26:12,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:12,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:26:15,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 14:26:16,051 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:26:17,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:26:18,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:20,306 INFO [train.py:1039] (3/4) Epoch 12, batch 550, loss[loss=0.1792, simple_loss=0.2518, pruned_loss=0.05334, over 22379.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2714, pruned_loss=0.06454, over 4408683.58 frames. ], batch size: 49, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:26:22,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 14:26:25,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 14:26:25,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:25,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 14:26:27,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:26:27,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:26:28,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:28,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:26:28,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:26:32,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:26:32,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 14:26:34,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:26:39,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:26:39,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:39,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=393286.6666666667, ans=0.125 2023-09-29 14:26:42,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:26:44,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:26:49,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 14:26:50,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 14:26:52,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:26:52,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=393353.3333333333, ans=0.0 2023-09-29 14:26:56,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:26:56,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:26:59,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:27:03,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:03,526 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 14:27:03,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:27:05,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:27:06,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:27:08,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:27:08,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:27:10,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:11,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 14:27:12,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 14:27:12,985 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.32 vs. limit=15.0 2023-09-29 14:27:13,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:13,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:27:13,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:27:13,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:27:15,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.36 vs. limit=12.0 2023-09-29 14:27:17,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:27:19,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:27:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:27:22,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:22,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 14:27:24,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:27:25,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:26,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=393486.6666666667, ans=0.1 2023-09-29 14:27:27,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:27:29,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:27:29,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:27:31,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 14:27:34,358 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 2.077e+02 2.338e+02 2.752e+02 4.124e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 14:27:37,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 14:27:41,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 14:27:43,114 INFO [train.py:1039] (3/4) Epoch 12, batch 600, loss[loss=0.1987, simple_loss=0.2799, pruned_loss=0.05875, over 24638.00 frames. ], tot_loss[loss=0.2009, simple_loss=0.272, pruned_loss=0.06487, over 4461954.37 frames. ], batch size: 68, lr: 8.80e-03, grad_scale: 16.0 2023-09-29 14:27:43,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:27:44,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:27:44,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:27:45,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=393553.3333333333, ans=0.1 2023-09-29 14:27:51,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:27:54,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:27:55,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 14:27:58,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:27:59,671 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.11 vs. limit=15.0 2023-09-29 14:28:00,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:01,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:05,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.68 vs. limit=5.0 2023-09-29 14:28:05,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 14:28:05,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:28:10,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 14:28:13,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:28:13,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:13,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:28:20,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:28:20,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:28:22,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:22,996 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.50 vs. limit=10.0 2023-09-29 14:28:28,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:28:32,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:28:32,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:28:32,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:28:38,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=393753.3333333333, ans=0.125 2023-09-29 14:28:41,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 14:28:46,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:28:46,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:28:52,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 14:28:53,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:28:56,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 14:28:56,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:28:58,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:29:01,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=393820.0, ans=0.125 2023-09-29 14:29:03,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 14:29:04,488 INFO [train.py:1039] (3/4) Epoch 12, batch 650, loss[loss=0.1971, simple_loss=0.2749, pruned_loss=0.05964, over 23721.00 frames. ], tot_loss[loss=0.2002, simple_loss=0.2709, pruned_loss=0.06479, over 4511847.36 frames. ], batch size: 85, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:29:04,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:29:07,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:10,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:29:12,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:15,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 14:29:16,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:29:22,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:29:22,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:23,786 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.66 vs. limit=10.0 2023-09-29 14:29:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:29,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 14:29:31,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:29:31,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.73 vs. limit=15.0 2023-09-29 14:29:32,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:29:35,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:29:35,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:29:36,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=394020.0, ans=0.125 2023-09-29 14:29:39,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:39,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:40,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:29:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:42,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:29:44,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:29:45,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 14:29:45,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:29:45,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:29:49,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:29:49,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:49,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:29:51,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:29:51,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 14:29:53,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:29:53,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:29:55,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:29:55,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=394086.6666666667, ans=0.125 2023-09-29 14:29:57,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:29:58,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:30:00,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 14:30:02,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 14:30:02,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:30:03,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:30:03,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:30:05,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:30:06,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_ff2.min_abs, batch_count=394086.6666666667, ans=0.1 2023-09-29 14:30:11,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=394153.3333333333, ans=0.1 2023-09-29 14:30:12,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:12,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:14,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:30:16,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=394153.3333333333, ans=6.0 2023-09-29 14:30:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:30:17,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:30:20,387 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.997e+02 2.199e+02 2.485e+02 3.515e+02, threshold=4.397e+02, percent-clipped=0.0 2023-09-29 14:30:26,974 INFO [train.py:1039] (3/4) Epoch 12, batch 700, loss[loss=0.211, simple_loss=0.2831, pruned_loss=0.06944, over 24684.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2699, pruned_loss=0.06381, over 4572671.30 frames. ], batch size: 65, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:30:27,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:30:27,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:27,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:27,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:30:32,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 14:30:34,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 14:30:36,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 14:30:37,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:39,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:30:42,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 14:30:46,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:30:49,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:30:51,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:30:52,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=394286.6666666667, ans=0.0 2023-09-29 14:30:53,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:30:53,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:30:57,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:31:00,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:31:00,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:31:03,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 14:31:05,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 14:31:08,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:31:10,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:31:12,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:31:16,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:31:17,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=394420.0, ans=0.2 2023-09-29 14:31:18,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 14:31:21,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:21,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:31:21,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 14:31:23,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=394420.0, ans=0.125 2023-09-29 14:31:26,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:31:26,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:31:29,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:31:36,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:31:36,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 14:31:40,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 14:31:41,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 14:31:44,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:46,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:31:46,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:31:48,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:48,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 14:31:48,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=394553.3333333333, ans=0.125 2023-09-29 14:31:50,002 INFO [train.py:1039] (3/4) Epoch 12, batch 750, loss[loss=0.2131, simple_loss=0.2886, pruned_loss=0.06884, over 24438.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.269, pruned_loss=0.0634, over 4596496.47 frames. ], batch size: 69, lr: 8.79e-03, grad_scale: 8.0 2023-09-29 14:31:50,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=394553.3333333333, ans=0.125 2023-09-29 14:31:53,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 14:31:53,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 14:31:53,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 14:31:54,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 14:31:56,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 14:31:56,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:31:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 14:31:57,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:31:58,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=394553.3333333333, ans=0.125 2023-09-29 14:31:59,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:00,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:02,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:03,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:32:05,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:08,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:32:10,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:32:11,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:32:12,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=394620.0, ans=0.0 2023-09-29 14:32:13,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:13,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:32:13,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=394620.0, ans=0.0 2023-09-29 14:32:15,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 14:32:16,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:32:18,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:20,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:32:23,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:32:25,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 14:32:25,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:32:25,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 14:32:25,659 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 14:32:27,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 14:32:27,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:32:27,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 14:32:30,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:32:37,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:32:37,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:37,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:32:40,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:32:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:32:42,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 14:32:42,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=394753.3333333333, ans=0.125 2023-09-29 14:32:42,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=394753.3333333333, ans=0.0 2023-09-29 14:32:43,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:32:45,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 14:32:47,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:32:47,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=394753.3333333333, ans=0.125 2023-09-29 14:32:49,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:32:49,365 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:32:50,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 14:32:50,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:32:55,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:32:57,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:32:57,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:00,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:33:02,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 14:33:02,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:03,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:05,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:05,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=394820.0, ans=0.125 2023-09-29 14:33:06,841 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.038e+02 2.318e+02 2.858e+02 4.234e+02, threshold=4.635e+02, percent-clipped=0.0 2023-09-29 14:33:09,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:33:12,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=394886.6666666667, ans=0.2 2023-09-29 14:33:13,521 INFO [train.py:1039] (3/4) Epoch 12, batch 800, loss[loss=0.1749, simple_loss=0.2462, pruned_loss=0.05179, over 24385.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.27, pruned_loss=0.06393, over 4627725.37 frames. ], batch size: 58, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:33:21,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.04 vs. limit=15.0 2023-09-29 14:33:22,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:33:22,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:23,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:33:23,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:25,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:26,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:32,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:32,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:33:34,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 14:33:34,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=394953.3333333333, ans=0.0 2023-09-29 14:33:35,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:36,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:33:37,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:33:37,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:38,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 14:33:38,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:38,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 14:33:43,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:33:43,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=394953.3333333333, ans=0.04949747468305833 2023-09-29 14:33:46,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:33:46,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:33:48,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:33:52,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:52,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:33:57,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:33:57,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:33:58,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 14:34:00,391 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 14:34:00,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 14:34:00,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:34:00,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:03,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:03,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:07,270 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 14:34:08,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 14:34:08,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:34:11,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:34:14,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.49 vs. limit=15.0 2023-09-29 14:34:15,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:34:18,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:34:21,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 14:34:22,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:34:25,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 14:34:31,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=395153.3333333333, ans=10.0 2023-09-29 14:34:33,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:34,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=395153.3333333333, ans=0.1 2023-09-29 14:34:36,933 INFO [train.py:1039] (3/4) Epoch 12, batch 850, loss[loss=0.1595, simple_loss=0.2348, pruned_loss=0.04207, over 24338.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.27, pruned_loss=0.06353, over 4657594.99 frames. ], batch size: 56, lr: 8.78e-03, grad_scale: 16.0 2023-09-29 14:34:37,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:34:37,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 14:34:37,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:34:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:40,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 14:34:40,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:34:40,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:34:42,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:45,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:34:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:34:48,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 14:34:48,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 14:34:48,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 14:34:50,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:34:50,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:34:53,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:34:53,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:34:53,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:34:56,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=395286.6666666667, ans=0.09899494936611666 2023-09-29 14:35:00,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:00,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:00,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 14:35:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 14:35:09,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:35:10,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 14:35:15,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 14:35:16,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 14:35:18,910 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 14:35:18,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:18,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:35:20,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 14:35:21,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:23,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:24,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 14:35:26,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:35:26,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:28,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:35:28,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:35:30,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:35:32,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 14:35:32,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 14:35:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:35:37,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:38,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:35:38,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:40,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:35:41,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=395486.6666666667, ans=0.125 2023-09-29 14:35:43,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:35:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:35:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:35:48,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:35:48,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:35:53,426 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.869e+02 2.098e+02 2.387e+02 5.753e+02, threshold=4.196e+02, percent-clipped=1.0 2023-09-29 14:35:53,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:35:56,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:35:56,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 14:35:56,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=395486.6666666667, ans=0.125 2023-09-29 14:35:57,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:35:57,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:35:59,374 INFO [train.py:1039] (3/4) Epoch 12, batch 900, loss[loss=0.1702, simple_loss=0.2473, pruned_loss=0.04658, over 24475.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2705, pruned_loss=0.06398, over 4665103.04 frames. ], batch size: 63, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:36:00,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 14:36:07,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:36:10,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:10,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 14:36:13,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:36:14,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 14:36:16,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 14:36:16,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:36:16,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:16,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 14:36:17,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:36:27,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:36:27,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:36:27,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:36:28,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=395620.0, ans=0.1 2023-09-29 14:36:30,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.77 vs. limit=22.5 2023-09-29 14:36:31,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:36:31,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=395686.6666666667, ans=0.0 2023-09-29 14:36:36,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 14:36:40,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:36:44,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:36:46,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:36:46,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=395686.6666666667, ans=0.0 2023-09-29 14:36:47,933 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 14:36:49,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 14:36:51,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.99 vs. limit=15.0 2023-09-29 14:36:53,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.08 vs. limit=15.0 2023-09-29 14:36:54,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:36:55,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:36:55,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:37:00,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:00,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:02,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=395753.3333333333, ans=0.0 2023-09-29 14:37:04,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 14:37:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:37:07,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 14:37:10,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:37:10,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:10,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:37:11,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:12,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.36 vs. limit=15.0 2023-09-29 14:37:15,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 14:37:15,785 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 14:37:17,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 14:37:19,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 14:37:21,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:22,861 INFO [train.py:1039] (3/4) Epoch 12, batch 950, loss[loss=0.2125, simple_loss=0.2731, pruned_loss=0.07598, over 23711.00 frames. ], tot_loss[loss=0.1996, simple_loss=0.2708, pruned_loss=0.06425, over 4675270.83 frames. ], batch size: 232, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:37:24,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 14:37:26,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=395886.6666666667, ans=0.125 2023-09-29 14:37:29,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:32,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:32,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:37:35,463 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 14:37:37,472 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:37:39,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:37:39,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:39,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:37:40,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:37:40,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 14:37:42,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:37:46,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:48,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 14:37:49,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:52,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:37:53,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:37:53,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:37:55,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 14:37:56,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 14:37:58,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:37:59,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:38:01,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=396020.0, ans=0.0 2023-09-29 14:38:04,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:04,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:38:09,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 14:38:12,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:38:12,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:38:12,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:13,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:13,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:38:19,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 14:38:19,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:38:22,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:38:22,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:24,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 14:38:24,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:24,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:38:25,354 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.25 vs. limit=6.0 2023-09-29 14:38:26,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 14:38:32,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:38:34,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:38:38,826 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.985e+02 2.175e+02 2.387e+02 3.582e+02, threshold=4.351e+02, percent-clipped=0.0 2023-09-29 14:38:39,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:38:39,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=396153.3333333333, ans=0.0 2023-09-29 14:38:42,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 14:38:42,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 14:38:44,930 INFO [train.py:1039] (3/4) Epoch 12, batch 1000, loss[loss=0.1808, simple_loss=0.2504, pruned_loss=0.05565, over 24312.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2693, pruned_loss=0.06414, over 4683680.33 frames. ], batch size: 56, lr: 8.77e-03, grad_scale: 16.0 2023-09-29 14:38:46,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:38:48,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 14:38:49,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:38:53,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=396220.0, ans=0.125 2023-09-29 14:38:55,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:38:55,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=396220.0, ans=0.125 2023-09-29 14:38:57,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 14:38:57,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 14:39:04,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:04,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:39:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:09,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 14:39:10,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 14:39:13,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 14:39:15,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:15,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 14:39:17,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 14:39:17,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 14:39:17,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:18,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:28,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:30,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:39:31,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:32,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:39:32,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 14:39:32,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:39:34,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:39:34,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:39:35,646 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 14:39:38,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.44 vs. limit=15.0 2023-09-29 14:39:40,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 14:39:40,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 14:39:42,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 14:39:44,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:39:52,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:39:52,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:39:52,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:39:55,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 14:39:56,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:39:56,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 14:39:58,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 14:40:00,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:00,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:40:02,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:40:04,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:40:07,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:08,965 INFO [train.py:1039] (3/4) Epoch 12, batch 1050, loss[loss=0.1959, simple_loss=0.2819, pruned_loss=0.05491, over 24475.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2681, pruned_loss=0.06327, over 4693387.13 frames. ], batch size: 69, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:40:10,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:40:12,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:40:13,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:40:15,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:18,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:21,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 14:40:23,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 14:40:23,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=396620.0, ans=0.125 2023-09-29 14:40:25,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:40:25,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=396620.0, ans=0.125 2023-09-29 14:40:26,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:40:26,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:40:28,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:40:29,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 14:40:29,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:30,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 14:40:34,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:40:34,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 14:40:34,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:40:43,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:40:44,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:40:44,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:40:44,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=396686.6666666667, ans=0.125 2023-09-29 14:40:46,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 14:40:47,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 14:40:47,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:40:51,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 14:40:55,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 14:40:55,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:40:58,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 14:41:00,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:41:00,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:00,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:41:01,366 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.48 vs. limit=15.0 2023-09-29 14:41:05,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:41:08,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 14:41:09,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 14:41:10,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 14:41:10,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:41:14,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 14:41:17,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:41:20,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:41:20,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:41:22,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:22,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:24,160 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.921e+02 2.123e+02 2.375e+02 5.047e+02, threshold=4.247e+02, percent-clipped=1.0 2023-09-29 14:41:26,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:41:26,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 14:41:28,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 14:41:29,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 14:41:29,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 14:41:29,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:41:30,463 INFO [train.py:1039] (3/4) Epoch 12, batch 1100, loss[loss=0.2094, simple_loss=0.2893, pruned_loss=0.06471, over 24144.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2679, pruned_loss=0.06298, over 4693624.72 frames. ], batch size: 86, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:41:33,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:41:38,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:41:43,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:41:45,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:41:46,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:46,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 14:41:48,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:41:48,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=396953.3333333333, ans=0.125 2023-09-29 14:41:50,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 14:41:54,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:41:57,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:41:57,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 14:41:59,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 14:41:59,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:41:59,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:42:00,099 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.86 vs. limit=12.0 2023-09-29 14:42:02,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:42:04,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 14:42:08,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:42:09,119 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:42:11,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 14:42:11,975 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 14:42:13,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:16,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:17,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:42:17,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:42:18,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=397086.6666666667, ans=0.07 2023-09-29 14:42:19,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 14:42:21,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:42:21,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:42:21,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:42:23,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:23,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 14:42:30,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:42:31,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 14:42:33,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:42:37,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:42:41,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 14:42:41,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 14:42:41,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:42:44,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:42:44,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:44,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 14:42:44,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:42:44,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:42:46,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 14:42:46,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:42:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 14:42:48,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:42:48,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:42:49,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:42:52,855 INFO [train.py:1039] (3/4) Epoch 12, batch 1150, loss[loss=0.1958, simple_loss=0.2747, pruned_loss=0.05848, over 24342.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2685, pruned_loss=0.06379, over 4682403.97 frames. ], batch size: 77, lr: 8.76e-03, grad_scale: 16.0 2023-09-29 14:42:53,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=397220.0, ans=0.025 2023-09-29 14:42:55,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=397220.0, ans=0.1 2023-09-29 14:42:57,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:00,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=397220.0, ans=0.0 2023-09-29 14:43:01,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:43:03,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:03,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:43:03,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 14:43:03,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:06,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 14:43:08,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:08,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:43:14,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 14:43:15,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:16,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=397286.6666666667, ans=0.0 2023-09-29 14:43:20,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:43:21,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:21,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 14:43:21,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:43:21,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:43:25,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 14:43:27,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:43:29,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:43:39,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:44,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=397420.0, ans=0.125 2023-09-29 14:43:45,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:43:45,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 14:43:47,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:47,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 14:43:56,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:43:56,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=397486.6666666667, ans=0.125 2023-09-29 14:44:03,167 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 14:44:03,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=397486.6666666667, ans=0.125 2023-09-29 14:44:03,747 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.87 vs. limit=15.0 2023-09-29 14:44:08,541 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.949e+02 2.168e+02 2.522e+02 3.297e+02, threshold=4.336e+02, percent-clipped=0.0 2023-09-29 14:44:09,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:09,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:44:09,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:44:10,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:44:12,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=397486.6666666667, ans=0.1 2023-09-29 14:44:14,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:15,312 INFO [train.py:1039] (3/4) Epoch 12, batch 1200, loss[loss=0.193, simple_loss=0.2729, pruned_loss=0.0565, over 24003.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2692, pruned_loss=0.0639, over 4689661.37 frames. ], batch size: 80, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:44:19,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:44:20,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:44:21,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:21,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:23,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:44:24,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=11.71 vs. limit=15.0 2023-09-29 14:44:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:44:24,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=397553.3333333333, ans=0.2 2023-09-29 14:44:26,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:44:29,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:44:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:44:32,223 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 14:44:35,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 14:44:37,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=397620.0, ans=0.2 2023-09-29 14:44:39,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:44:42,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:44:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:44:46,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:44:46,528 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 14:44:48,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:44:54,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 14:44:54,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:44:55,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 14:44:55,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:44:56,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=397686.6666666667, ans=0.1 2023-09-29 14:45:00,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 14:45:02,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=397753.3333333333, ans=0.0 2023-09-29 14:45:03,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 14:45:05,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:45:05,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:45:05,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=397753.3333333333, ans=0.025 2023-09-29 14:45:06,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:06,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:45:09,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:45:09,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:45:10,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:45:11,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 14:45:11,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:45:11,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 14:45:14,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:14,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:45:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:45:22,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:45:25,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 14:45:30,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 14:45:30,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:45:33,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:45:35,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:45:36,959 INFO [train.py:1039] (3/4) Epoch 12, batch 1250, loss[loss=0.2088, simple_loss=0.2884, pruned_loss=0.06466, over 23729.00 frames. ], tot_loss[loss=0.1997, simple_loss=0.2702, pruned_loss=0.06457, over 4698213.22 frames. ], batch size: 85, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:45:37,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:45:40,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 14:45:43,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=397886.6666666667, ans=0.2 2023-09-29 14:45:46,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:45:48,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:48,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 14:45:51,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:45:52,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:45:57,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 14:45:58,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:45:58,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:45:58,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:01,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 14:46:05,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 14:46:05,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:05,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:05,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=397953.3333333333, ans=0.0 2023-09-29 14:46:07,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:46:07,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:11,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:12,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:46:14,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.42 vs. limit=22.5 2023-09-29 14:46:17,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 14:46:19,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:46:22,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:22,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 14:46:22,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:46:22,592 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 14:46:23,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:23,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:29,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:46:32,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:46:34,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 14:46:34,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 14:46:34,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 14:46:39,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:46:39,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 14:46:39,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:46:42,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:46:42,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:46:43,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 14:46:44,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=398153.3333333333, ans=0.125 2023-09-29 14:46:45,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 14:46:45,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:46:46,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 14:46:46,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:46:48,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 14:46:50,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:46:52,056 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 2.018e+02 2.304e+02 2.594e+02 4.435e+02, threshold=4.607e+02, percent-clipped=1.0 2023-09-29 14:46:52,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:46:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:46:57,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 14:46:58,618 INFO [train.py:1039] (3/4) Epoch 12, batch 1300, loss[loss=0.2066, simple_loss=0.2865, pruned_loss=0.06335, over 24362.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2696, pruned_loss=0.06401, over 4711484.58 frames. ], batch size: 77, lr: 8.75e-03, grad_scale: 32.0 2023-09-29 14:47:02,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:47:02,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 14:47:05,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:07,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 14:47:08,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:10,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:47:11,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 14:47:13,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 14:47:19,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:47:20,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:47:22,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 14:47:27,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:47:29,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=398353.3333333333, ans=0.0 2023-09-29 14:47:30,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:30,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:33,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:47:36,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:47:38,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:47:38,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 14:47:38,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 14:47:43,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:47:43,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 14:47:44,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 14:47:46,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 14:47:47,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:47:50,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:47:50,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 14:47:52,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:52,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 14:47:53,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:47:59,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:47:59,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:48:02,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 14:48:03,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 14:48:05,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 14:48:12,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:48:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 14:48:15,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:20,336 INFO [train.py:1039] (3/4) Epoch 12, batch 1350, loss[loss=0.1708, simple_loss=0.2475, pruned_loss=0.04707, over 24665.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2685, pruned_loss=0.06262, over 4720946.12 frames. ], batch size: 65, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:48:20,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 14:48:23,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:25,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:30,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:48:30,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:48:32,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:48:33,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:36,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 14:48:38,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 14:48:39,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:48:42,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:48:44,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.88 vs. limit=15.0 2023-09-29 14:48:45,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 14:48:45,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:48:47,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:48:47,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 14:48:48,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 14:48:49,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 14:48:52,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:48:52,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 14:49:05,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:49:16,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:16,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 14:49:21,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:22,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 14:49:22,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 14:49:23,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:49:25,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:49:27,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 14:49:28,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:49:33,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 14:49:35,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 14:49:36,895 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.617e+02 1.916e+02 2.100e+02 2.366e+02 3.252e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-29 14:49:42,870 INFO [train.py:1039] (3/4) Epoch 12, batch 1400, loss[loss=0.1773, simple_loss=0.2571, pruned_loss=0.04872, over 24339.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2679, pruned_loss=0.0626, over 4718881.50 frames. ], batch size: 61, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:49:43,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 14:49:44,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:49:48,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:49:49,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:49:55,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 14:49:57,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 14:50:03,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=398953.3333333333, ans=0.125 2023-09-29 14:50:07,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:50:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:11,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:50:11,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 14:50:16,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:50:18,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 14:50:27,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:29,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:34,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 14:50:34,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:50:34,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:50:36,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:50:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:50:39,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 14:50:39,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:50:39,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:50:41,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 14:50:41,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:50:47,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:50:50,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:50:51,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=399153.3333333333, ans=0.0 2023-09-29 14:50:57,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 14:50:58,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 14:51:00,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:51:02,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 14:51:02,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:05,126 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.08 vs. limit=15.0 2023-09-29 14:51:05,598 INFO [train.py:1039] (3/4) Epoch 12, batch 1450, loss[loss=0.1926, simple_loss=0.2769, pruned_loss=0.0541, over 24307.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2676, pruned_loss=0.06229, over 4710706.60 frames. ], batch size: 74, lr: 8.74e-03, grad_scale: 32.0 2023-09-29 14:51:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:51:09,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:51:09,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=399220.0, ans=0.125 2023-09-29 14:51:12,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:51:12,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:12,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 14:51:17,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:18,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 14:51:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:51:20,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 14:51:22,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 14:51:22,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 14:51:23,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:24,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:24,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 14:51:27,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:29,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 14:51:29,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 14:51:29,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:29,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:51:30,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:31,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=399286.6666666667, ans=0.125 2023-09-29 14:51:33,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:39,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:51:39,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:51:42,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:51:42,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=399353.3333333333, ans=0.125 2023-09-29 14:51:43,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:44,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:51:45,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:51:45,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:51:46,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:51:49,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 14:51:53,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:51:56,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 14:51:57,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:51:59,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 14:52:01,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:01,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=399420.0, ans=0.1 2023-09-29 14:52:02,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 14:52:07,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:09,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 14:52:10,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 14:52:12,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:14,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:14,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:52:17,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 14:52:19,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 14:52:19,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 14:52:21,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:52:22,743 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 2.049e+02 2.244e+02 2.638e+02 4.746e+02, threshold=4.488e+02, percent-clipped=1.0 2023-09-29 14:52:24,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 14:52:26,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=399486.6666666667, ans=0.125 2023-09-29 14:52:29,527 INFO [train.py:1039] (3/4) Epoch 12, batch 1500, loss[loss=0.2053, simple_loss=0.2838, pruned_loss=0.06345, over 23961.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2685, pruned_loss=0.06258, over 4714879.47 frames. ], batch size: 80, lr: 8.73e-03, grad_scale: 32.0 2023-09-29 14:52:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 14:52:37,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 14:52:37,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:52:39,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:52:40,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:40,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 14:52:42,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 14:52:42,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 14:52:43,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:52:43,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:52:43,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:52:44,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=399620.0, ans=0.07 2023-09-29 14:52:45,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:52:47,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:52:53,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 14:52:54,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:52:56,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:52:57,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:02,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 14:53:05,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 14:53:07,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:53:07,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 14:53:10,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 14:53:13,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:13,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:53:13,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:53:15,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 14:53:16,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:53:16,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:16,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 14:53:18,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:53:23,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 14:53:23,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 14:53:28,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 14:53:30,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 14:53:35,648 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 14:53:35,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:37,057 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 14:53:38,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:53:38,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:53:40,175 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 14:53:41,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 14:53:42,416 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.00 vs. limit=6.0 2023-09-29 14:53:44,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 14:53:45,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:50,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:50,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:53:51,395 INFO [train.py:1039] (3/4) Epoch 12, batch 1550, loss[loss=0.2639, simple_loss=0.3089, pruned_loss=0.1094, over 19660.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2695, pruned_loss=0.06365, over 4707498.97 frames. ], batch size: 388, lr: 8.73e-03, grad_scale: 16.0 2023-09-29 14:53:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:53:51,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 14:53:53,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 14:53:55,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 14:53:55,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:53:56,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 14:53:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 14:54:00,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:00,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:00,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=399886.6666666667, ans=0.125 2023-09-29 14:54:01,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:01,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:54:03,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:03,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:54:07,948 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 14:54:07,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:08,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 14:54:08,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=399953.3333333333, ans=0.0 2023-09-29 14:54:10,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 14:54:10,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=399953.3333333333, ans=0.07 2023-09-29 14:54:12,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 14:54:12,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 14:54:12,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=399953.3333333333, ans=0.05 2023-09-29 14:54:15,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:54:15,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 14:54:16,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 14:54:16,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 14:54:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:16,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:18,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=399953.3333333333, ans=0.2 2023-09-29 14:54:26,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:54:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 14:54:28,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 14:54:37,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:54:42,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 14:54:42,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:54:42,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 14:54:48,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.80 vs. limit=15.0 2023-09-29 14:54:49,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 14:54:50,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:53,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:54:55,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:54:55,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:54:55,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 14:54:55,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=400086.6666666667, ans=0.2 2023-09-29 14:54:57,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:54:59,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:54:59,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:54:59,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 14:54:59,272 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 14:55:01,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=400153.3333333333, ans=0.0 2023-09-29 14:55:02,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:03,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=400153.3333333333, ans=0.0 2023-09-29 14:55:09,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 14:55:11,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.65 vs. limit=12.0 2023-09-29 14:55:11,982 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.422e+02 2.047e+02 2.446e+02 2.941e+02 5.003e+02, threshold=4.892e+02, percent-clipped=3.0 2023-09-29 14:55:12,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:13,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:55:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 14:55:16,842 INFO [train.py:1039] (3/4) Epoch 12, batch 1600, loss[loss=0.2189, simple_loss=0.2808, pruned_loss=0.07855, over 23534.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.27, pruned_loss=0.06383, over 4707927.42 frames. ], batch size: 285, lr: 8.72e-03, grad_scale: 32.0 2023-09-29 14:55:16,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:55:18,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:55:18,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:55:18,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:55:19,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 14:55:23,074 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.10 vs. limit=15.0 2023-09-29 14:55:24,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:24,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 14:55:24,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=400220.0, ans=0.0 2023-09-29 14:55:26,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 14:55:29,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 14:55:29,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=400220.0, ans=0.125 2023-09-29 14:55:31,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:55:34,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 14:55:34,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:55:35,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:55:41,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:55:44,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 14:55:48,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:55:49,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 14:55:50,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:55:50,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 14:55:57,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 14:55:58,959 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:56:04,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:04,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 14:56:06,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:56:06,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:06,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 14:56:09,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 14:56:12,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.09 vs. limit=15.0 2023-09-29 14:56:13,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 14:56:13,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:56:13,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:14,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:16,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 14:56:17,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 14:56:18,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 14:56:19,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 14:56:27,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:27,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:56:30,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 14:56:30,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:56:31,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 14:56:36,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:39,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:56:39,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 14:56:41,174 INFO [train.py:1039] (3/4) Epoch 12, batch 1650, loss[loss=0.2145, simple_loss=0.2847, pruned_loss=0.07211, over 24326.00 frames. ], tot_loss[loss=0.1998, simple_loss=0.2706, pruned_loss=0.06444, over 4705539.99 frames. ], batch size: 77, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:56:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 14:56:41,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 14:56:41,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 14:56:41,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 14:56:45,393 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:56:45,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=400553.3333333333, ans=0.1 2023-09-29 14:56:46,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:56:48,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:48,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:56:48,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 14:56:51,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:56:55,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 14:56:57,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:56:57,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:56:57,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:56:57,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 14:56:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 14:56:58,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 14:57:06,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 14:57:06,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 14:57:08,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=400620.0, ans=0.125 2023-09-29 14:57:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 14:57:19,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:22,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 14:57:25,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:28,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:57:28,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=400753.3333333333, ans=0.2 2023-09-29 14:57:29,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:57:29,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:30,560 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.24 vs. limit=15.0 2023-09-29 14:57:31,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 14:57:31,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:34,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:57:35,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.90 vs. limit=10.0 2023-09-29 14:57:36,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:36,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:36,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:37,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:38,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.34 vs. limit=22.5 2023-09-29 14:57:39,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 14:57:41,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=400753.3333333333, ans=0.125 2023-09-29 14:57:42,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 14:57:44,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 14:57:45,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:57:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 14:57:47,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 14:57:47,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 14:57:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:57:49,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 14:57:50,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:51,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:57:51,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 14:57:56,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=12.0 2023-09-29 14:57:56,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:57:58,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:57:58,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:57:59,739 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.879e+02 2.089e+02 2.451e+02 3.406e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 14:58:00,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 14:58:03,005 INFO [train.py:1039] (3/4) Epoch 12, batch 1700, loss[loss=0.2152, simple_loss=0.2939, pruned_loss=0.06823, over 24049.00 frames. ], tot_loss[loss=0.1985, simple_loss=0.2702, pruned_loss=0.06339, over 4716372.68 frames. ], batch size: 86, lr: 8.72e-03, grad_scale: 16.0 2023-09-29 14:58:03,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=400886.6666666667, ans=0.0 2023-09-29 14:58:04,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:58:04,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 14:58:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 14:58:06,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:06,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:58:06,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:09,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 14:58:10,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 14:58:10,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 14:58:13,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 14:58:13,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.01 vs. limit=15.0 2023-09-29 14:58:18,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:58:20,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=400953.3333333333, ans=0.0 2023-09-29 14:58:21,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 14:58:22,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=400953.3333333333, ans=0.0 2023-09-29 14:58:29,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 14:58:29,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:58:29,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 14:58:29,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:58:30,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=400953.3333333333, ans=0.125 2023-09-29 14:58:33,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 14:58:35,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 14:58:35,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:37,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 14:58:38,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 14:58:41,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 14:58:41,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 14:58:43,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:58:46,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 14:58:46,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 14:58:55,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:58:57,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:58:58,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 14:59:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 14:59:00,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 14:59:01,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 14:59:01,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=401086.6666666667, ans=0.125 2023-09-29 14:59:02,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 14:59:02,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:02,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:02,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:07,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:07,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 14:59:08,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:08,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 14:59:08,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:12,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:13,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 14:59:17,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 14:59:17,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=401153.3333333333, ans=0.125 2023-09-29 14:59:17,875 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.25 vs. limit=15.0 2023-09-29 14:59:18,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 14:59:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 14:59:27,050 INFO [train.py:1039] (3/4) Epoch 12, batch 1750, loss[loss=0.1925, simple_loss=0.2639, pruned_loss=0.0605, over 24509.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2683, pruned_loss=0.06277, over 4714647.07 frames. ], batch size: 66, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 14:59:27,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.04 vs. limit=12.0 2023-09-29 14:59:28,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:31,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:31,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 14:59:33,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 14:59:33,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 14:59:33,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=401220.0, ans=0.04949747468305833 2023-09-29 14:59:36,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 14:59:36,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 14:59:41,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 14:59:43,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 14:59:44,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=401286.6666666667, ans=0.1 2023-09-29 14:59:44,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=401286.6666666667, ans=0.1 2023-09-29 14:59:46,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 14:59:46,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 14:59:46,542 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 14:59:47,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 14:59:51,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 14:59:53,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 14:59:54,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 14:59:56,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 14:59:58,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=401353.3333333333, ans=0.125 2023-09-29 15:00:04,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:00:06,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:06,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:13,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:13,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:00:15,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=401420.0, ans=0.125 2023-09-29 15:00:16,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:00:16,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=401420.0, ans=0.125 2023-09-29 15:00:17,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:19,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:20,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:00:21,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 15:00:24,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:00:26,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 15:00:28,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:29,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:00:33,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:00:34,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 15:00:34,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:37,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:00:42,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:00:42,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=401486.6666666667, ans=0.0 2023-09-29 15:00:44,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:00:45,916 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.394e+02 1.915e+02 2.238e+02 2.656e+02 3.754e+02, threshold=4.475e+02, percent-clipped=0.0 2023-09-29 15:00:46,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:00:46,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 15:00:46,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:00:49,074 INFO [train.py:1039] (3/4) Epoch 12, batch 1800, loss[loss=0.202, simple_loss=0.2698, pruned_loss=0.06711, over 23522.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2673, pruned_loss=0.0621, over 4716831.92 frames. ], batch size: 106, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:00:49,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:00:49,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:00:49,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:00:49,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:00:50,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:00:54,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:00:55,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:00:58,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:00:59,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:02,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:01:04,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:01:07,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:11,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:11,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:12,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:01:14,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:01:14,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 15:01:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:19,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:19,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=401620.0, ans=0.125 2023-09-29 15:01:24,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 15:01:25,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 15:01:25,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 15:01:27,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:29,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:01:29,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:01:29,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:01:39,687 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 15:01:39,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:01:41,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:01:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 15:01:43,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 15:01:44,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:01:46,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:01:46,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:01:46,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=401753.3333333333, ans=0.125 2023-09-29 15:01:50,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 15:01:59,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:01:59,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 15:01:59,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:01:59,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:01:59,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:02:01,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 15:02:04,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:02:04,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:08,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 15:02:08,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:11,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:11,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:02:11,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:11,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=401886.6666666667, ans=0.1 2023-09-29 15:02:13,212 INFO [train.py:1039] (3/4) Epoch 12, batch 1850, loss[loss=0.1958, simple_loss=0.2715, pruned_loss=0.06007, over 24446.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2683, pruned_loss=0.06243, over 4720574.07 frames. ], batch size: 69, lr: 8.71e-03, grad_scale: 16.0 2023-09-29 15:02:13,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:02:14,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:02:15,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=401886.6666666667, ans=0.125 2023-09-29 15:02:15,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.95 vs. limit=15.0 2023-09-29 15:02:17,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:02:17,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:02:19,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:02:20,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:02:27,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:02:27,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 15:02:32,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 15:02:35,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 15:02:40,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:02:40,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 15:02:40,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:02:51,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:02:51,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 15:02:53,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=402020.0, ans=0.2 2023-09-29 15:02:54,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:02:54,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:02:57,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 15:02:57,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:02:57,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:02:59,482 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:03:00,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:03:03,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:03:07,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:11,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:03:11,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:11,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:03:11,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:11,708 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:03:15,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:16,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:03:20,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=402153.3333333333, ans=0.125 2023-09-29 15:03:21,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 15:03:23,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:03:25,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:03:26,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:03:26,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 15:03:26,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 15:03:28,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 15:03:28,272 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 15:03:29,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:03:29,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:03:29,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:30,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:31,435 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 15:03:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:03:31,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:32,701 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.242e+02 2.724e+02 5.145e+02, threshold=4.485e+02, percent-clipped=1.0 2023-09-29 15:03:32,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:03:34,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:03:35,676 INFO [train.py:1039] (3/4) Epoch 12, batch 1900, loss[loss=0.1956, simple_loss=0.2714, pruned_loss=0.05987, over 24033.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2691, pruned_loss=0.06275, over 4717977.90 frames. ], batch size: 80, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:03:37,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:03:37,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 15:03:41,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:03:41,091 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 15:03:42,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:03:43,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:48,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:03:49,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=402220.0, ans=0.125 2023-09-29 15:03:50,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:03:51,025 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 15:03:53,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 15:03:54,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:03:54,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:03:54,833 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 15:03:56,699 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 15:03:58,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 15:04:00,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.32 vs. limit=15.0 2023-09-29 15:04:01,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:04:03,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 15:04:03,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=402286.6666666667, ans=0.125 2023-09-29 15:04:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 15:04:15,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 15:04:17,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 15:04:17,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:04:19,115 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 15:04:19,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 15:04:19,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 15:04:19,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 15:04:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:04:24,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 15:04:29,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:04:33,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:33,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 15:04:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:04:40,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 15:04:40,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:46,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:04:46,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:04:46,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:04:48,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:04:50,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:04:50,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:04:51,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:04:54,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:04:54,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:04:56,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:04:56,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:04:56,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:04:58,333 INFO [train.py:1039] (3/4) Epoch 12, batch 1950, loss[loss=0.2002, simple_loss=0.2659, pruned_loss=0.06724, over 23494.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2694, pruned_loss=0.0627, over 4717183.32 frames. ], batch size: 106, lr: 8.70e-03, grad_scale: 16.0 2023-09-29 15:04:58,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:05:04,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:05,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:05:05,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:05,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:05:06,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=402553.3333333333, ans=0.07 2023-09-29 15:05:10,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 15:05:10,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:05:10,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:13,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:16,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:05:16,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:17,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:19,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:21,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:05:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:05:22,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:05:22,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:28,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:31,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:05:31,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:05:31,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:05:31,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 15:05:33,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:05:33,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:05:33,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:05:37,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.50 vs. limit=10.0 2023-09-29 15:05:38,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:05:41,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:05:45,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:05:48,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:05:48,705 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.28 vs. limit=15.0 2023-09-29 15:05:49,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:05:49,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 15:05:49,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:05:54,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:05:54,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:05:55,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:02,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:02,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:06,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:08,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:06:12,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:06:13,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 15:06:13,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:06:14,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:06:16,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 15:06:17,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 2.002e+02 2.294e+02 2.547e+02 3.463e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 15:06:19,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:20,835 INFO [train.py:1039] (3/4) Epoch 12, batch 2000, loss[loss=0.1746, simple_loss=0.2462, pruned_loss=0.05149, over 24465.00 frames. ], tot_loss[loss=0.1986, simple_loss=0.27, pruned_loss=0.06366, over 4723529.53 frames. ], batch size: 58, lr: 8.70e-03, grad_scale: 32.0 2023-09-29 15:06:23,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:06:24,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:06:25,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:06:25,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:06:27,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:06:31,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 15:06:32,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:06:34,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:06:35,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 15:06:38,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:06:38,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:06:38,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=402953.3333333333, ans=0.07 2023-09-29 15:06:41,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:06:42,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 15:06:44,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:46,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:48,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 15:06:49,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:06:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 15:06:51,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:06:54,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:06:54,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=403020.0, ans=0.0 2023-09-29 15:06:55,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:06:55,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:06:57,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:00,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:01,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 15:07:03,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 15:07:03,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:07:03,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:06,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:10,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:07:10,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:11,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:07:13,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:13,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:13,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:07:13,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:07:15,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:19,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:07:19,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 15:07:24,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:07:24,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:27,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:07:32,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:33,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:33,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:07:33,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:07:36,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:07:38,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:42,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:07:44,189 INFO [train.py:1039] (3/4) Epoch 12, batch 2050, loss[loss=0.1749, simple_loss=0.2578, pruned_loss=0.046, over 24619.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2691, pruned_loss=0.06367, over 4717477.22 frames. ], batch size: 65, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:07:44,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=403220.0, ans=0.2 2023-09-29 15:07:44,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=403220.0, ans=0.0 2023-09-29 15:07:45,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:47,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=403220.0, ans=0.0 2023-09-29 15:07:52,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:07:55,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:07:56,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:07:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:00,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 15:08:00,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:08:02,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:02,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:08:10,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:10,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:13,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 15:08:14,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:08:18,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 15:08:18,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:08:20,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:21,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.55 vs. limit=22.5 2023-09-29 15:08:23,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:23,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:08:25,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:08:27,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:08:29,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:08:29,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:08:33,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:08:35,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:08:35,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=403420.0, ans=0.125 2023-09-29 15:08:36,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:08:39,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:08:43,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:08:47,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:08:49,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 15:08:55,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:08:56,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:08:58,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=403486.6666666667, ans=0.125 2023-09-29 15:08:59,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:09:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 15:09:01,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=403486.6666666667, ans=0.2 2023-09-29 15:09:04,762 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.918e+02 2.083e+02 2.421e+02 3.715e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-29 15:09:06,253 INFO [train.py:1039] (3/4) Epoch 12, batch 2100, loss[loss=0.1733, simple_loss=0.2519, pruned_loss=0.04731, over 24473.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2672, pruned_loss=0.06276, over 4722047.27 frames. ], batch size: 63, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:09:06,481 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 15:09:06,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:07,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:07,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:09,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:09:09,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 15:09:09,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 15:09:09,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=403553.3333333333, ans=0.1 2023-09-29 15:09:12,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:09:15,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:09:15,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:09:18,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:19,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.83 vs. limit=8.0 2023-09-29 15:09:20,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:09:20,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 15:09:20,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:09:21,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 15:09:21,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 15:09:23,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:23,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:09:23,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 15:09:23,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:09:29,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 15:09:29,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:09:34,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:09:36,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:09:39,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:09:39,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 15:09:39,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:39,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 15:09:41,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 15:09:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:43,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 15:09:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 15:09:43,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 15:09:46,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:09:47,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:09:50,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:52,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:09:54,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:56,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:56,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 15:09:56,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:09:56,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:09:57,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:09:59,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 15:10:00,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 15:10:02,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 15:10:06,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:10:09,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:10:09,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 15:10:11,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=403820.0, ans=0.125 2023-09-29 15:10:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:19,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:10:19,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:10:19,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:10:19,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 15:10:19,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:10:21,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:10:21,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:10:22,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:10:22,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:25,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 15:10:27,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 15:10:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:28,543 INFO [train.py:1039] (3/4) Epoch 12, batch 2150, loss[loss=0.196, simple_loss=0.2596, pruned_loss=0.0662, over 23692.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2664, pruned_loss=0.06297, over 4710644.08 frames. ], batch size: 232, lr: 8.69e-03, grad_scale: 16.0 2023-09-29 15:10:28,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:10:28,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:10:28,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=403886.6666666667, ans=0.125 2023-09-29 15:10:30,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:10:30,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:10:35,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=403886.6666666667, ans=0.0 2023-09-29 15:10:37,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 15:10:37,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:10:39,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:40,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:10:40,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:40,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:10:45,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:10:47,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:10:47,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:10:51,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:51,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 15:10:55,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:10:57,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:10:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:10:59,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:00,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:00,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:11:01,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:01,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:11:02,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:11:03,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 15:11:05,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:11:07,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:07,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:09,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:11:11,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=404020.0, ans=0.125 2023-09-29 15:11:12,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:11:14,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:11:14,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:11:15,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:11:15,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 15:11:15,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:11:18,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:18,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:20,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:11:21,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=404086.6666666667, ans=0.1 2023-09-29 15:11:22,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:11:22,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:22,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=404086.6666666667, ans=0.0 2023-09-29 15:11:24,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:24,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 15:11:26,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 15:11:26,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=404086.6666666667, ans=0.0 2023-09-29 15:11:26,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=404086.6666666667, ans=0.1 2023-09-29 15:11:27,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:11:27,812 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 15:11:27,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:29,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:11:29,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 15:11:29,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:11:29,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 15:11:30,759 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 15:11:30,759 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 15:11:30,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 15:11:33,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:33,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:11:33,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:11:33,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:35,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:11:36,217 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.05 vs. limit=12.0 2023-09-29 15:11:38,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:11:38,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:39,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=404153.3333333333, ans=0.0 2023-09-29 15:11:48,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:11:49,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 15:11:50,278 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.936e+02 2.140e+02 2.613e+02 4.157e+02, threshold=4.280e+02, percent-clipped=0.0 2023-09-29 15:11:51,883 INFO [train.py:1039] (3/4) Epoch 12, batch 2200, loss[loss=0.196, simple_loss=0.2857, pruned_loss=0.05315, over 24446.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2672, pruned_loss=0.06268, over 4727752.33 frames. ], batch size: 69, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:11:52,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:11:59,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:11:59,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:12:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:02,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:12:02,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=404220.0, ans=0.125 2023-09-29 15:12:04,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:12:04,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:12:04,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 15:12:04,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=404220.0, ans=0.0 2023-09-29 15:12:06,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.12 vs. limit=15.0 2023-09-29 15:12:09,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 15:12:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:12:18,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 15:12:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:24,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:24,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:12:24,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=404353.3333333333, ans=0.2 2023-09-29 15:12:27,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:12:28,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 15:12:32,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:12:33,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:12:34,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:12:37,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:12:37,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:42,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:12:43,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:45,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 15:12:46,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:47,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 15:12:50,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:50,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:12:52,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:12:54,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:12:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:12:55,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:55,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:12:57,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:12:57,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:13:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:13:03,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 15:13:04,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:07,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:13:08,844 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 15:13:10,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=404486.6666666667, ans=0.07 2023-09-29 15:13:12,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:13:12,385 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 15:13:12,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=404486.6666666667, ans=0.025 2023-09-29 15:13:13,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:13:14,004 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 15:13:14,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:15,595 INFO [train.py:1039] (3/4) Epoch 12, batch 2250, loss[loss=0.1992, simple_loss=0.2796, pruned_loss=0.05942, over 24401.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2674, pruned_loss=0.06244, over 4735605.29 frames. ], batch size: 69, lr: 8.68e-03, grad_scale: 16.0 2023-09-29 15:13:15,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:13:17,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:13:18,931 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 15:13:20,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:13:23,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:28,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.80 vs. limit=10.0 2023-09-29 15:13:29,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:13:29,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:13:32,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:34,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:35,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:13:38,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 15:13:38,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:13:38,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:13:40,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 15:13:40,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:13:42,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:43,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:13:47,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=404686.6666666667, ans=0.125 2023-09-29 15:13:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:13:52,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:13:52,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:13:54,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.88 vs. limit=12.0 2023-09-29 15:13:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 15:13:56,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:13:56,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:14:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:05,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:14:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:06,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:14:09,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:14:11,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:14:14,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=404753.3333333333, ans=0.0 2023-09-29 15:14:16,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:14:18,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:14:22,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.75 vs. limit=12.0 2023-09-29 15:14:23,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:14:23,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:14:23,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:14:29,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:14:30,406 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.81 vs. limit=10.0 2023-09-29 15:14:31,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:14:31,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 15:14:31,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:32,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:14:34,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 15:14:36,009 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.633e+02 2.000e+02 2.393e+02 2.961e+02 4.518e+02, threshold=4.786e+02, percent-clipped=2.0 2023-09-29 15:14:38,348 INFO [train.py:1039] (3/4) Epoch 12, batch 2300, loss[loss=0.1769, simple_loss=0.2594, pruned_loss=0.04727, over 24654.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2679, pruned_loss=0.06266, over 4729335.75 frames. ], batch size: 68, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:14:38,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:14:38,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:38,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=404886.6666666667, ans=0.0 2023-09-29 15:14:45,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:14:46,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:14:49,617 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 15:14:51,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:14:54,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=404953.3333333333, ans=0.0 2023-09-29 15:14:58,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=404953.3333333333, ans=0.125 2023-09-29 15:15:00,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:15:00,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:15:00,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:02,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:02,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 15:15:02,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:15:04,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=404953.3333333333, ans=0.125 2023-09-29 15:15:05,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:05,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:15:08,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:15:13,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:15:15,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:19,033 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:15:20,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:15:20,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:15:23,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:15:25,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=405086.6666666667, ans=0.0 2023-09-29 15:15:26,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:15:30,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:15:32,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:15:32,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:15:32,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 15:15:36,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:15:36,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:36,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:15:36,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:15:38,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:38,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:15:38,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:15:39,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 15:15:39,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:15:39,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:15:41,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 15:15:50,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:15:50,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=405153.3333333333, ans=0.0 2023-09-29 15:15:51,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:15:57,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:15:57,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=405153.3333333333, ans=10.0 2023-09-29 15:15:58,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:15:58,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:15:58,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:16:00,271 INFO [train.py:1039] (3/4) Epoch 12, batch 2350, loss[loss=0.2, simple_loss=0.2822, pruned_loss=0.05894, over 24631.00 frames. ], tot_loss[loss=0.1976, simple_loss=0.2692, pruned_loss=0.06303, over 4726033.75 frames. ], batch size: 68, lr: 8.67e-03, grad_scale: 8.0 2023-09-29 15:16:00,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:00,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:16:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 15:16:07,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:07,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 15:16:13,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 15:16:17,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:16:21,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:21,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:23,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 15:16:28,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:16:35,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 15:16:35,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:16:38,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:16:38,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:16:38,615 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:16:39,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:16:42,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 15:16:42,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:16:44,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:16:44,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:16:44,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:16:45,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=405353.3333333333, ans=0.125 2023-09-29 15:16:46,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:16:49,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 15:16:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:16:51,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=405420.0, ans=0.05 2023-09-29 15:16:55,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:16:55,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:16:55,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 15:16:56,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:16:59,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 15:17:00,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:17:06,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 15:17:11,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 15:17:12,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:17:12,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:17:14,148 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 15:17:14,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 15:17:14,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=405486.6666666667, ans=0.125 2023-09-29 15:17:17,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 15:17:19,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=405486.6666666667, ans=0.025 2023-09-29 15:17:20,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:17:22,272 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.128e+02 2.368e+02 3.787e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 15:17:22,315 INFO [train.py:1039] (3/4) Epoch 12, batch 2400, loss[loss=0.2042, simple_loss=0.2851, pruned_loss=0.06165, over 24669.00 frames. ], tot_loss[loss=0.1975, simple_loss=0.2688, pruned_loss=0.06306, over 4731424.34 frames. ], batch size: 68, lr: 8.67e-03, grad_scale: 16.0 2023-09-29 15:17:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:17:28,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:17:30,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.37 vs. limit=15.0 2023-09-29 15:17:30,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:17:30,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 15:17:31,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 15:17:40,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:17:40,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:17:40,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 15:17:42,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:17:43,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:43,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 15:17:48,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:17:50,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 15:17:56,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:18:02,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 15:18:05,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:07,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:12,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:12,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 15:18:12,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:18:15,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=405753.3333333333, ans=0.0 2023-09-29 15:18:20,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:23,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:18:26,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:26,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:18:26,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:18:28,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:18:28,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:28,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:28,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:18:33,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:18:35,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:18:35,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 15:18:37,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 15:18:39,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:18:39,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:18:39,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 15:18:41,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 15:18:41,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 15:18:41,438 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 15:18:43,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 15:18:44,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=15.0 2023-09-29 15:18:44,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:18:46,475 INFO [train.py:1039] (3/4) Epoch 12, batch 2450, loss[loss=0.181, simple_loss=0.2414, pruned_loss=0.0603, over 23355.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2677, pruned_loss=0.06325, over 4721286.16 frames. ], batch size: 285, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:18:46,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:46,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:46,714 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 15:18:48,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:18:49,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:18:52,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=12.0 2023-09-29 15:18:54,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:18:54,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:18:57,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:18:57,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:18:59,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 15:19:05,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:05,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:19:08,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:19:08,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:19:08,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 15:19:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:17,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:19:19,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:19:21,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:19:21,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:24,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:19:26,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 15:19:27,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:19:32,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:34,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:19:35,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:35,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:19:35,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:19:37,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:19:38,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 15:19:41,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:19:41,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:19:41,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=406086.6666666667, ans=0.125 2023-09-29 15:19:45,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:19:45,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:19:51,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:19:51,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 15:19:53,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:19:53,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:19:54,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 15:19:54,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:19:56,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:19:56,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=406153.3333333333, ans=0.1 2023-09-29 15:20:00,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:20:03,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:03,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:20:07,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 15:20:08,459 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.958e+02 2.224e+02 2.647e+02 3.379e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-29 15:20:08,503 INFO [train.py:1039] (3/4) Epoch 12, batch 2500, loss[loss=0.2108, simple_loss=0.2867, pruned_loss=0.06743, over 24570.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2669, pruned_loss=0.06297, over 4714071.01 frames. ], batch size: 71, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:20:08,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:20:15,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:15,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=406220.0, ans=0.1 2023-09-29 15:20:25,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:20:25,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:20:26,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:20:26,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 15:20:33,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:20:34,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:20:36,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:20:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:20:36,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 15:20:39,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:39,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:39,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 15:20:39,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:40,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 15:20:41,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:20:47,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:20:47,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:20:51,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:20:51,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 15:20:53,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:20:54,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:20:58,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:03,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:04,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=406420.0, ans=6.0 2023-09-29 15:21:07,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:07,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=406420.0, ans=0.2 2023-09-29 15:21:12,678 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-09-29 15:21:13,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:21:13,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=406486.6666666667, ans=0.5 2023-09-29 15:21:14,584 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.83 vs. limit=8.0 2023-09-29 15:21:15,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 15:21:16,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:16,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:18,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=406486.6666666667, ans=0.1 2023-09-29 15:21:19,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:21:19,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:21:21,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 15:21:21,187 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 15:21:21,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 15:21:24,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:26,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 15:21:26,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 15:21:29,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:21:29,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 15:21:31,834 INFO [train.py:1039] (3/4) Epoch 12, batch 2550, loss[loss=0.189, simple_loss=0.2695, pruned_loss=0.05426, over 24282.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2677, pruned_loss=0.063, over 4709201.35 frames. ], batch size: 61, lr: 8.66e-03, grad_scale: 16.0 2023-09-29 15:21:32,835 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.94 vs. limit=12.0 2023-09-29 15:21:33,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 15:21:37,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:38,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:21:38,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:21:41,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:21:43,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 15:21:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:21:46,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 15:21:48,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:21:48,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:21:51,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:52,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:21:52,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 15:21:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:21:52,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:21:53,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=406620.0, ans=0.125 2023-09-29 15:21:54,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:21:57,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:21:57,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 15:21:58,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:21:58,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:21:58,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 15:22:07,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.79 vs. limit=15.0 2023-09-29 15:22:14,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:22:16,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=406686.6666666667, ans=0.125 2023-09-29 15:22:16,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.17 vs. limit=15.0 2023-09-29 15:22:18,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:18,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:18,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:22:19,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:22:24,294 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.48 vs. limit=22.5 2023-09-29 15:22:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:22:29,082 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.60 vs. limit=12.0 2023-09-29 15:22:30,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:22:30,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:22:31,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:22:31,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:22:31,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:22:31,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.08 vs. limit=15.0 2023-09-29 15:22:37,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:22:37,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:41,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:22:41,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 15:22:41,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:22:43,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:22:44,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:22:46,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:22:46,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:52,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:22:54,381 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.975e+02 2.307e+02 2.705e+02 3.697e+02, threshold=4.615e+02, percent-clipped=0.0 2023-09-29 15:22:54,424 INFO [train.py:1039] (3/4) Epoch 12, batch 2600, loss[loss=0.1972, simple_loss=0.27, pruned_loss=0.06224, over 23396.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2681, pruned_loss=0.06293, over 4708569.94 frames. ], batch size: 93, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:22:54,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:22:57,878 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 15:23:02,220 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 15:23:02,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:23:02,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 15:23:03,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 15:23:03,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 15:23:06,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:23:06,876 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 15:23:09,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 15:23:11,300 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 15:23:12,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:23:14,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 15:23:17,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 15:23:19,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:23:19,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 15:23:22,966 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 15:23:22,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 15:23:29,879 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.58 vs. limit=12.0 2023-09-29 15:23:30,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:30,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:30,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:30,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 15:23:33,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:23:39,949 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 15:23:41,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=407086.6666666667, ans=0.125 2023-09-29 15:23:45,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:23:45,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:23:47,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 15:23:49,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:23:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:23:50,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 15:23:53,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:23:53,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:23:58,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 15:24:02,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:02,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:24:07,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:24:08,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:24:08,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 15:24:10,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:24:11,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:12,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:16,572 INFO [train.py:1039] (3/4) Epoch 12, batch 2650, loss[loss=0.19, simple_loss=0.2689, pruned_loss=0.05554, over 24496.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2684, pruned_loss=0.06293, over 4708777.44 frames. ], batch size: 63, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:24:16,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 15:24:18,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:18,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=407220.0, ans=0.0 2023-09-29 15:24:21,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:24:27,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 15:24:27,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:28,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:24:28,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 15:24:28,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:24:30,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:24:32,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:24:34,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:24:37,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:24:37,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 15:24:38,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:24:38,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:24:40,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=407286.6666666667, ans=0.0 2023-09-29 15:24:41,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 15:24:43,144 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 15:24:44,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:24:47,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 15:24:47,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:24:49,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 15:24:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:56,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:24:56,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:24:57,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:01,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 15:25:01,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 15:25:06,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:08,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 15:25:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:10,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:11,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:11,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:11,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:14,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:25:16,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:16,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:25:16,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:25:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:25:19,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:20,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:25:21,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:23,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:25:24,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:25:25,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:27,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:25:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:27,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 15:25:32,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:25:34,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:34,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:35,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=407486.6666666667, ans=0.2 2023-09-29 15:25:37,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:38,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:25:39,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.894e+02 2.126e+02 2.422e+02 3.822e+02, threshold=4.252e+02, percent-clipped=0.0 2023-09-29 15:25:39,953 INFO [train.py:1039] (3/4) Epoch 12, batch 2700, loss[loss=0.2667, simple_loss=0.3163, pruned_loss=0.1086, over 19724.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2694, pruned_loss=0.06339, over 4711924.81 frames. ], batch size: 388, lr: 8.65e-03, grad_scale: 16.0 2023-09-29 15:25:40,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:41,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:25:43,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 15:25:46,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:25:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:25:48,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=407553.3333333333, ans=0.0 2023-09-29 15:25:48,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=407553.3333333333, ans=0.2 2023-09-29 15:25:49,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:25:49,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:49,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:25:50,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:25:50,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:25:50,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:25:51,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 15:25:52,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 15:25:52,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:25:53,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:25:55,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:25:56,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:25:57,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=407620.0, ans=0.125 2023-09-29 15:26:00,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:26:00,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 15:26:02,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:02,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=407620.0, ans=0.125 2023-09-29 15:26:07,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:26:07,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:09,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=407620.0, ans=0.09899494936611666 2023-09-29 15:26:11,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=407686.6666666667, ans=0.0 2023-09-29 15:26:14,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:26:14,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:26:14,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:26:14,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:26:18,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:19,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:19,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:26:19,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:26:20,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=407686.6666666667, ans=0.0 2023-09-29 15:26:21,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=407686.6666666667, ans=0.05 2023-09-29 15:26:25,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:25,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:26:35,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:26:35,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:26:40,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:26:40,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:41,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=407753.3333333333, ans=0.2 2023-09-29 15:26:41,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=407753.3333333333, ans=0.2 2023-09-29 15:26:44,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:44,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:26:46,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:26:48,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:26:49,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:26:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:26:51,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=407820.0, ans=0.125 2023-09-29 15:26:52,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:26:54,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:54,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:26:56,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 15:26:57,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:26:59,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:26:59,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 15:27:00,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 15:27:00,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:02,260 INFO [train.py:1039] (3/4) Epoch 12, batch 2750, loss[loss=0.1733, simple_loss=0.2388, pruned_loss=0.05391, over 24275.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2679, pruned_loss=0.06349, over 4714098.09 frames. ], batch size: 56, lr: 8.64e-03, grad_scale: 16.0 2023-09-29 15:27:04,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:06,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:08,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:08,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:27:09,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:12,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:14,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:27:14,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:27:14,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:14,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 15:27:14,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:27:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:27:21,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 15:27:23,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:27:23,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:24,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:27:26,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:27:27,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:27:27,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:27:27,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:28,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=407953.3333333333, ans=0.1 2023-09-29 15:27:29,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:32,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:27:32,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:27:32,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:27:34,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:34,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:27:36,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=408020.0, ans=0.0 2023-09-29 15:27:43,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:27:46,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:27:46,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:27:50,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:27:50,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:27:51,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:27:59,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:27:59,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=408086.6666666667, ans=0.0 2023-09-29 15:28:00,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:28:00,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 15:28:05,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:06,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=408153.3333333333, ans=0.1 2023-09-29 15:28:08,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 15:28:11,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=408153.3333333333, ans=0.2 2023-09-29 15:28:13,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:28:14,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:28:17,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 15:28:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:28:18,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:28:20,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 15:28:20,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:28:24,687 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 1.991e+02 2.173e+02 2.887e+02 4.719e+02, threshold=4.346e+02, percent-clipped=2.0 2023-09-29 15:28:24,740 INFO [train.py:1039] (3/4) Epoch 12, batch 2800, loss[loss=0.191, simple_loss=0.2563, pruned_loss=0.06285, over 23824.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2671, pruned_loss=0.06339, over 4713735.30 frames. ], batch size: 195, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:28:24,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:28:24,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:24,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:28:26,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 15:28:26,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:26,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=408220.0, ans=0.0 2023-09-29 15:28:27,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:29,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:28:30,070 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 15:28:30,071 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 15:28:32,494 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-09-29 15:28:33,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:28:33,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=408220.0, ans=15.0 2023-09-29 15:28:36,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:28:36,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:28:38,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=408220.0, ans=0.0 2023-09-29 15:28:39,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:28:41,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 15:28:44,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:28:44,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 15:28:45,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:45,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:28:45,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:28:50,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:28:50,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:28:50,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:28:50,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=408286.6666666667, ans=0.1 2023-09-29 15:28:52,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:28:57,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=408353.3333333333, ans=0.0 2023-09-29 15:29:00,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:29:02,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:05,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:05,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:29:06,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:13,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:13,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 15:29:14,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:14,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:14,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:29:19,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:19,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:24,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:29:25,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=408420.0, ans=0.1 2023-09-29 15:29:27,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:29:27,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:27,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:29:28,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:29:28,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:29:28,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=408420.0, ans=0.05 2023-09-29 15:29:30,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:29:30,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 15:29:30,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:31,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:29:31,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:29:33,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 15:29:33,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=408486.6666666667, ans=0.1 2023-09-29 15:29:35,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:35,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:29:36,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:29:38,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 15:29:45,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:29:45,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:29:45,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:29:48,307 INFO [train.py:1039] (3/4) Epoch 12, batch 2850, loss[loss=0.177, simple_loss=0.2579, pruned_loss=0.04807, over 24473.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2666, pruned_loss=0.06299, over 4705304.62 frames. ], batch size: 66, lr: 8.64e-03, grad_scale: 32.0 2023-09-29 15:29:48,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:29:51,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:29:51,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:29:53,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:29:56,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:29:56,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:29:59,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:30:00,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 15:30:04,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=408620.0, ans=0.2 2023-09-29 15:30:06,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 15:30:06,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:08,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 15:30:10,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:12,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 15:30:14,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 15:30:15,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:17,447 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:30:29,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:31,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:31,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:30:32,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 15:30:32,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:30:33,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:30:36,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:30:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 15:30:38,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:30:38,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:30:38,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:30:38,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:41,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:41,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:30:43,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:45,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:30:48,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:30:48,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:30:50,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:30:53,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:30:55,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=408820.0, ans=0.125 2023-09-29 15:30:57,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:30:59,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 15:31:00,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 15:31:02,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:31:02,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:03,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 15:31:03,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:31:05,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:05,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:05,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:31:05,222 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 15:31:07,284 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 15:31:07,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:07,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:10,762 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.902e+02 2.094e+02 2.569e+02 4.916e+02, threshold=4.188e+02, percent-clipped=1.0 2023-09-29 15:31:10,806 INFO [train.py:1039] (3/4) Epoch 12, batch 2900, loss[loss=0.1665, simple_loss=0.236, pruned_loss=0.04851, over 19756.00 frames. ], tot_loss[loss=0.1964, simple_loss=0.2669, pruned_loss=0.06297, over 4717047.94 frames. ], batch size: 43, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:31:12,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:12,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:31:12,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:31:12,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=408886.6666666667, ans=0.2 2023-09-29 15:31:14,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 15:31:17,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:19,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 15:31:19,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=408886.6666666667, ans=0.125 2023-09-29 15:31:21,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 15:31:22,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:31:22,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:31:26,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:26,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:31:31,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:31:31,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:31:34,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:31:35,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 15:31:35,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:31:37,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:40,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 15:31:40,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 15:31:44,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:31:44,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 15:31:44,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:31:47,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:31:49,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 15:31:52,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:31:53,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:31:57,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:02,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:02,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 15:32:04,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 15:32:04,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:32:05,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.43 vs. limit=15.0 2023-09-29 15:32:08,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:32:10,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 15:32:10,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=409086.6666666667, ans=0.125 2023-09-29 15:32:11,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:32:16,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:32:24,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:32:24,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:32:25,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 15:32:27,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:27,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 15:32:28,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:28,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:32:34,419 INFO [train.py:1039] (3/4) Epoch 12, batch 2950, loss[loss=0.2058, simple_loss=0.2723, pruned_loss=0.06969, over 23363.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2681, pruned_loss=0.06299, over 4721253.10 frames. ], batch size: 285, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:32:34,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:32:37,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 15:32:37,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:37,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:32:39,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:32:40,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:32:42,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 15:32:42,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 15:32:43,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:32:43,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:32:50,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:32:52,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:32:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:32:55,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:32:58,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:32:58,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:33:02,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:02,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.26 vs. limit=15.0 2023-09-29 15:33:03,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:33:03,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:33:05,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 15:33:09,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 15:33:09,488 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 15:33:10,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:33:12,500 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 15:33:13,517 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=8.97 vs. limit=10.0 2023-09-29 15:33:15,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 15:33:15,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:33:15,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=409353.3333333333, ans=0.1 2023-09-29 15:33:16,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:33:16,834 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 15:33:16,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:33:19,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 15:33:20,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:33:20,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:33:23,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:24,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=409420.0, ans=0.0 2023-09-29 15:33:25,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:33:25,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:27,374 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 15:33:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:33:27,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 15:33:34,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:37,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:33:38,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 15:33:38,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:33:38,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 15:33:41,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:43,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:33:45,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:33:46,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:33:46,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:33:46,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:33:48,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:48,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:33:50,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:33:50,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:33:50,919 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-29 15:33:51,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:33:53,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:53,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 15:33:55,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:33:56,520 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.935e+02 2.190e+02 2.674e+02 3.950e+02, threshold=4.379e+02, percent-clipped=0.0 2023-09-29 15:33:56,567 INFO [train.py:1039] (3/4) Epoch 12, batch 3000, loss[loss=0.2104, simple_loss=0.2742, pruned_loss=0.07326, over 23668.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.2696, pruned_loss=0.06408, over 4719385.59 frames. ], batch size: 232, lr: 8.63e-03, grad_scale: 32.0 2023-09-29 15:33:56,567 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 15:34:11,476 INFO [train.py:1071] (3/4) Epoch 12, validation: loss=0.2606, simple_loss=0.2686, pruned_loss=0.1263, over 1125622.00 frames. 2023-09-29 15:34:11,476 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 15:34:14,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:34:14,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:34:19,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 15:34:19,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 15:34:20,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:34:20,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:34:22,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 15:34:22,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:29,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:34:33,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=409620.0, ans=0.125 2023-09-29 15:34:38,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=409620.0, ans=0.0 2023-09-29 15:34:39,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:34:47,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 15:34:47,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:34:50,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:34:52,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:34:52,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:34:53,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:34:53,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 15:34:56,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 15:34:58,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:34:58,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:35:02,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:35:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:04,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:04,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:35:07,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:35:07,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=409753.3333333333, ans=0.0 2023-09-29 15:35:08,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:35:08,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:35:10,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:35:13,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 15:35:15,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:35:16,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:16,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:35:20,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:22,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 15:35:23,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 15:35:23,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:35:25,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 15:35:25,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=409820.0, ans=0.125 2023-09-29 15:35:25,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.87 vs. limit=15.0 2023-09-29 15:35:26,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:35:28,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 15:35:31,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:35:32,695 INFO [train.py:1039] (3/4) Epoch 12, batch 3050, loss[loss=0.1842, simple_loss=0.2527, pruned_loss=0.05786, over 19751.00 frames. ], tot_loss[loss=0.1994, simple_loss=0.2704, pruned_loss=0.0642, over 4718849.89 frames. ], batch size: 43, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:35:32,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:35:34,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 15:35:34,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 15:35:34,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:35:35,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:35:36,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:35:36,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=409886.6666666667, ans=0.0 2023-09-29 15:35:37,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:35:37,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:37,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:35:39,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 15:35:41,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:35:43,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:35:43,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:35:48,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:35:51,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 15:35:59,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 15:35:59,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 15:35:59,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:01,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:36:02,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=409953.3333333333, ans=0.125 2023-09-29 15:36:04,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:05,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:05,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:07,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:07,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=410020.0, ans=0.125 2023-09-29 15:36:09,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:36:09,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:10,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:36:10,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:12,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:14,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:15,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:17,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 15:36:17,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:36:17,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:36:20,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:36:20,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:36:21,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=410086.6666666667, ans=0.1 2023-09-29 15:36:22,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:36:22,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:28,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:36:28,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:37,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:37,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:36:37,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:36:40,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:40,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:36:42,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:36:42,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 15:36:44,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:36:44,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:36:47,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 15:36:51,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:54,183 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.685e+02 1.862e+02 2.008e+02 2.262e+02 2.937e+02, threshold=4.017e+02, percent-clipped=0.0 2023-09-29 15:36:54,227 INFO [train.py:1039] (3/4) Epoch 12, batch 3100, loss[loss=0.1859, simple_loss=0.2529, pruned_loss=0.05946, over 18928.00 frames. ], tot_loss[loss=0.1989, simple_loss=0.2699, pruned_loss=0.06391, over 4723239.32 frames. ], batch size: 41, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:36:54,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:36:56,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:36:59,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 15:37:01,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 15:37:03,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 15:37:05,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 15:37:07,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:37:10,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:37:10,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:37:12,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=410286.6666666667, ans=0.1 2023-09-29 15:37:15,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:19,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=15.0 2023-09-29 15:37:22,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 15:37:26,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:37:27,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:27,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:27,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:37:29,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:37:30,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:37:30,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 15:37:30,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:37:32,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:32,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 15:37:33,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:37:39,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:37:40,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 15:37:41,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 15:37:42,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:42,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=410420.0, ans=0.125 2023-09-29 15:37:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:37:45,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:45,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:46,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:37:46,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:37:46,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:37:50,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:37:50,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:37:50,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:37:50,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 15:37:56,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:37:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 15:37:58,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:37:59,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 15:37:59,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:37:59,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:01,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 15:38:03,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=410486.6666666667, ans=0.1 2023-09-29 15:38:12,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 15:38:14,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:38:17,012 INFO [train.py:1039] (3/4) Epoch 12, batch 3150, loss[loss=0.1864, simple_loss=0.2247, pruned_loss=0.07409, over 19110.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2684, pruned_loss=0.06365, over 4707411.99 frames. ], batch size: 388, lr: 8.62e-03, grad_scale: 32.0 2023-09-29 15:38:17,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:38:17,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:38:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 15:38:18,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:20,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 15:38:22,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 15:38:23,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.91 vs. limit=12.0 2023-09-29 15:38:23,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:24,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.74 vs. limit=15.0 2023-09-29 15:38:27,343 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 15:38:29,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 15:38:31,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:38:32,580 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 15:38:32,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=410620.0, ans=0.0 2023-09-29 15:38:34,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 15:38:34,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 15:38:35,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 15:38:35,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 15:38:35,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:35,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:38:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:38:40,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 15:38:41,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:41,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:38:43,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:44,400 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:38:45,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 15:38:49,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 15:38:50,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:38:53,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:38:53,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:38:53,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 15:38:58,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 15:38:58,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:39:00,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:39:00,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:39:00,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:00,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:39:02,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:39:02,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 15:39:04,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 15:39:06,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:39:06,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:07,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:39:07,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:39:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 15:39:07,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:09,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 15:39:10,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:11,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 15:39:12,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 15:39:12,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:39:14,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:14,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 15:39:16,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 15:39:17,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.21 vs. limit=10.0 2023-09-29 15:39:17,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:39:19,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:39:20,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.73 vs. limit=15.0 2023-09-29 15:39:21,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:21,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:39:21,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=410753.3333333333, ans=0.1 2023-09-29 15:39:24,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=410820.0, ans=0.0 2023-09-29 15:39:27,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:39:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:28,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=410820.0, ans=0.0 2023-09-29 15:39:31,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 15:39:37,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:39:37,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 15:39:41,213 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 2.037e+02 2.402e+02 2.745e+02 4.943e+02, threshold=4.804e+02, percent-clipped=1.0 2023-09-29 15:39:41,259 INFO [train.py:1039] (3/4) Epoch 12, batch 3200, loss[loss=0.2204, simple_loss=0.2831, pruned_loss=0.07885, over 23416.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2671, pruned_loss=0.06306, over 4707473.11 frames. ], batch size: 119, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:39:42,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:43,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:39:43,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 15:39:46,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:39:49,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:39:52,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:39:54,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=410886.6666666667, ans=0.07 2023-09-29 15:40:03,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:40:14,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 15:40:14,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=411020.0, ans=0.0 2023-09-29 15:40:15,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:40:18,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 15:40:20,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:40:23,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:40:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:40:24,907 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.26 vs. limit=15.0 2023-09-29 15:40:25,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:40:27,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=411020.0, ans=0.2 2023-09-29 15:40:29,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 15:40:30,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=411086.6666666667, ans=0.125 2023-09-29 15:40:31,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 15:40:33,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 15:40:35,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 15:40:38,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:40:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 15:40:45,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:40:45,411 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 15:40:45,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:40:49,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:40:49,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=411153.3333333333, ans=0.125 2023-09-29 15:40:50,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 15:40:52,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 15:40:54,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 15:40:54,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 15:40:55,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:40:57,999 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.10 vs. limit=22.5 2023-09-29 15:41:00,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:41:00,116 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 15:41:00,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:00,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:01,686 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 15:41:03,206 INFO [train.py:1039] (3/4) Epoch 12, batch 3250, loss[loss=0.1882, simple_loss=0.2703, pruned_loss=0.0531, over 24578.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2675, pruned_loss=0.06334, over 4694931.01 frames. ], batch size: 71, lr: 8.61e-03, grad_scale: 32.0 2023-09-29 15:41:06,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:41:10,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:15,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=411220.0, ans=0.0 2023-09-29 15:41:19,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:41:19,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 15:41:20,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:20,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:41:20,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:22,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.93 vs. limit=22.5 2023-09-29 15:41:23,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:23,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:41:25,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:41:26,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:26,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:26,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:28,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:41:30,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:41:31,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.97 vs. limit=22.5 2023-09-29 15:41:32,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:41:32,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=411286.6666666667, ans=0.1 2023-09-29 15:41:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:35,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:41:35,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=411353.3333333333, ans=0.125 2023-09-29 15:41:37,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:41:37,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:41:37,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:41:44,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 15:41:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:41:45,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:41:46,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:41:47,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:41:52,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:42:00,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:00,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:00,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 15:42:00,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:42:00,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:42:01,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:01,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=411420.0, ans=0.2 2023-09-29 15:42:04,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 15:42:04,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 15:42:05,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=411420.0, ans=0.0 2023-09-29 15:42:06,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:42:07,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:07,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:09,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 15:42:09,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:42:14,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:14,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 15:42:17,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:20,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:42:20,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 15:42:25,063 INFO [train.py:1039] (3/4) Epoch 12, batch 3300, loss[loss=0.2015, simple_loss=0.2792, pruned_loss=0.06188, over 24598.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2688, pruned_loss=0.06353, over 4700750.90 frames. ], batch size: 68, lr: 8.61e-03, grad_scale: 16.0 2023-09-29 15:42:25,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:42:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 15:42:26,585 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.957e+02 2.272e+02 2.906e+02 4.656e+02, threshold=4.545e+02, percent-clipped=0.0 2023-09-29 15:42:26,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 15:42:28,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 15:42:28,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:33,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:42:34,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:42:36,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:37,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 15:42:37,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:42:40,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:42,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:42:46,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 15:42:46,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:42:46,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:42:47,333 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=411620.0, ans=0.1 2023-09-29 15:42:48,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:48,613 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 15:42:50,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:42:50,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:42:51,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 15:42:51,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:42:53,185 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 15:42:54,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=411620.0, ans=0.0 2023-09-29 15:42:55,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:42:55,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:42:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:42:58,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 15:43:01,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 15:43:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:02,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:43:04,600 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 15:43:06,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 15:43:08,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:43:10,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=411686.6666666667, ans=0.015 2023-09-29 15:43:10,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=411686.6666666667, ans=0.0 2023-09-29 15:43:11,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 15:43:13,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:15,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 15:43:16,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:20,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:20,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:20,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:43:20,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:43:23,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=411753.3333333333, ans=0.09899494936611666 2023-09-29 15:43:24,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:43:24,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:26,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:43:27,593 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 15:43:29,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 15:43:30,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:43:30,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:43:30,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:33,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.34 vs. limit=10.0 2023-09-29 15:43:34,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:43:34,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:43:36,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:36,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:43:37,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:43:39,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:43:42,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 15:43:42,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:44,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:43:47,608 INFO [train.py:1039] (3/4) Epoch 12, batch 3350, loss[loss=0.1706, simple_loss=0.2523, pruned_loss=0.04448, over 24488.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.2688, pruned_loss=0.06279, over 4721061.31 frames. ], batch size: 63, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:43:47,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 15:43:47,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:43:49,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:43:51,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:43:51,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:54,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:43:54,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=411886.6666666667, ans=0.2 2023-09-29 15:43:56,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:43:57,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:44:00,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:02,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:44:02,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=5.12 vs. limit=15.0 2023-09-29 15:44:03,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:05,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:44:05,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 15:44:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 15:44:05,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=411953.3333333333, ans=0.1 2023-09-29 15:44:05,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=411953.3333333333, ans=0.0 2023-09-29 15:44:06,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:44:09,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 15:44:09,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 15:44:12,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:44:12,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:44:12,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:12,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 15:44:13,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:13,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:44:16,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:18,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:18,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:20,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:44:23,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:28,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:28,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:32,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:44:33,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:44:36,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:36,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:36,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=412086.6666666667, ans=0.125 2023-09-29 15:44:37,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:40,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 15:44:40,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:44:42,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 15:44:42,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:44:43,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=412086.6666666667, ans=0.0 2023-09-29 15:44:43,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=412086.6666666667, ans=0.1 2023-09-29 15:44:44,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 15:44:44,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=412086.6666666667, ans=0.0 2023-09-29 15:44:45,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:44:46,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:44:48,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=412086.6666666667, ans=0.0 2023-09-29 15:44:53,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:44:55,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 15:44:55,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:44:55,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:44:56,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:45:02,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:05,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 15:45:05,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:45:05,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:45:08,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:08,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 15:45:09,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:45:09,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 15:45:11,202 INFO [train.py:1039] (3/4) Epoch 12, batch 3400, loss[loss=0.1967, simple_loss=0.2597, pruned_loss=0.06683, over 22733.00 frames. ], tot_loss[loss=0.1984, simple_loss=0.2699, pruned_loss=0.06343, over 4710543.76 frames. ], batch size: 322, lr: 8.60e-03, grad_scale: 16.0 2023-09-29 15:45:11,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:11,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:45:12,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 15:45:13,409 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.868e+02 2.094e+02 2.439e+02 4.049e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 15:45:13,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:45:15,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 15:45:18,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 15:45:18,198 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 15:45:18,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:45:21,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:45:21,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 15:45:22,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:25,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:45:30,863 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.81 vs. limit=15.0 2023-09-29 15:45:33,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:45:36,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 15:45:38,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.min_positive, batch_count=412286.6666666667, ans=0.05 2023-09-29 15:45:40,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=412286.6666666667, ans=0.125 2023-09-29 15:45:42,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:45:44,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:45:44,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:45:46,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 15:45:47,039 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.73 vs. limit=15.0 2023-09-29 15:45:50,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:45:54,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 15:45:56,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=412353.3333333333, ans=0.125 2023-09-29 15:45:58,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=412353.3333333333, ans=0.125 2023-09-29 15:46:00,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:00,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:46:03,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 15:46:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:04,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:06,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:46:06,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:46:09,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:46:14,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=412420.0, ans=0.125 2023-09-29 15:46:15,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:46:16,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:46:16,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.32 vs. limit=15.0 2023-09-29 15:46:20,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:22,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 15:46:22,838 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 15:46:29,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:46:31,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.48 vs. limit=22.5 2023-09-29 15:46:33,846 INFO [train.py:1039] (3/4) Epoch 12, batch 3450, loss[loss=0.1796, simple_loss=0.2542, pruned_loss=0.05246, over 24438.00 frames. ], tot_loss[loss=0.1977, simple_loss=0.2693, pruned_loss=0.06305, over 4717857.05 frames. ], batch size: 58, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:46:34,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 15:46:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 15:46:37,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:46:39,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:46:39,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=412553.3333333333, ans=0.05 2023-09-29 15:46:41,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 15:46:42,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:46:44,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:46:47,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=412553.3333333333, ans=0.125 2023-09-29 15:46:51,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:46:52,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:46:52,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:46:52,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:46:54,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:00,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 15:47:05,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=412620.0, ans=0.0 2023-09-29 15:47:07,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 15:47:07,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 15:47:07,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:47:10,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:13,526 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=13.52 vs. limit=22.5 2023-09-29 15:47:16,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 15:47:17,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:47:21,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:21,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:47:22,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 15:47:24,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:47:27,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 15:47:27,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:28,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:47:31,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:47:34,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 15:47:35,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=412753.3333333333, ans=0.0 2023-09-29 15:47:38,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:47:43,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:47:46,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:49,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:47:53,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:47:53,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:47:55,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:47:57,181 INFO [train.py:1039] (3/4) Epoch 12, batch 3500, loss[loss=0.1982, simple_loss=0.2746, pruned_loss=0.06091, over 24036.00 frames. ], tot_loss[loss=0.1971, simple_loss=0.2683, pruned_loss=0.06292, over 4710168.42 frames. ], batch size: 80, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:47:57,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:47:58,596 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.929e+02 2.065e+02 2.305e+02 4.202e+02, threshold=4.129e+02, percent-clipped=1.0 2023-09-29 15:47:59,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.84 vs. limit=22.5 2023-09-29 15:48:01,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:04,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:48:05,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 15:48:07,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 15:48:11,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 15:48:13,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:48:13,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 15:48:15,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=412953.3333333333, ans=0.125 2023-09-29 15:48:16,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:48:18,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:48:20,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:48:20,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:20,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:48:20,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:22,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:22,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 15:48:27,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:27,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:48:27,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:32,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:34,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 15:48:34,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:48:37,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:48:37,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:48:39,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:41,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:48:41,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:42,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 15:48:43,289 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.45 vs. limit=10.0 2023-09-29 15:48:43,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.89 vs. limit=15.0 2023-09-29 15:48:44,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 15:48:45,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 15:48:45,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:48:47,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:48:48,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:48:48,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 15:48:49,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=413086.6666666667, ans=0.1 2023-09-29 15:48:51,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:48:53,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:48:57,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:48:59,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 15:48:59,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 15:48:59,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:02,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:03,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.97 vs. limit=15.0 2023-09-29 15:49:04,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:05,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:07,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 15:49:08,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:49:10,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:49:10,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 15:49:14,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 15:49:17,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:49:18,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:49:18,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:18,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:20,017 INFO [train.py:1039] (3/4) Epoch 12, batch 3550, loss[loss=0.2158, simple_loss=0.2868, pruned_loss=0.07246, over 23967.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2672, pruned_loss=0.0619, over 4716476.06 frames. ], batch size: 80, lr: 8.59e-03, grad_scale: 16.0 2023-09-29 15:49:21,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 15:49:22,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=413220.0, ans=0.2 2023-09-29 15:49:32,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=413220.0, ans=0.125 2023-09-29 15:49:33,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:33,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 15:49:34,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.max_abs, batch_count=413220.0, ans=10.0 2023-09-29 15:49:36,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=413286.6666666667, ans=0.125 2023-09-29 15:49:39,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:49:40,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 15:49:42,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:49:43,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:49:43,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:49:46,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:46,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:49:48,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:48,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:49:50,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:49:52,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=413353.3333333333, ans=0.125 2023-09-29 15:49:55,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:49:55,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 15:49:56,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:49:56,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:49:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:49:58,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 15:49:58,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:01,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:03,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 15:50:04,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.15 vs. limit=6.0 2023-09-29 15:50:06,435 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.50 vs. limit=15.0 2023-09-29 15:50:10,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:10,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:50:11,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:13,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 15:50:14,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:50:15,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 15:50:17,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 15:50:18,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:50:18,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:50:21,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 15:50:23,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:50:28,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 15:50:30,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:35,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:50:35,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 15:50:43,737 INFO [train.py:1039] (3/4) Epoch 12, batch 3600, loss[loss=0.2007, simple_loss=0.2706, pruned_loss=0.06545, over 23215.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2672, pruned_loss=0.06189, over 4712725.11 frames. ], batch size: 105, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:50:43,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 15:50:43,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:50:43,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:50:44,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=413553.3333333333, ans=0.1 2023-09-29 15:50:45,967 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.995e+02 2.200e+02 2.637e+02 4.261e+02, threshold=4.399e+02, percent-clipped=1.0 2023-09-29 15:50:47,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:47,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:50:49,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:50:53,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:50:55,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:57,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:50:57,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:50:59,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:50:59,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 15:51:02,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:51:02,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:05,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:06,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=413620.0, ans=22.5 2023-09-29 15:51:08,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:11,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 15:51:12,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:51:12,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 15:51:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 15:51:12,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=413620.0, ans=0.0 2023-09-29 15:51:15,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:51:17,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 15:51:18,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:22,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:51:22,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:51:24,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 15:51:31,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:51:33,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:51:33,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 15:51:37,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=413753.3333333333, ans=0.125 2023-09-29 15:51:39,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:51:45,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:48,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:51:56,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 15:51:56,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:51:56,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 15:51:58,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 15:51:59,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 15:52:02,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:52:03,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:52:04,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 15:52:05,938 INFO [train.py:1039] (3/4) Epoch 12, batch 3650, loss[loss=0.1983, simple_loss=0.2766, pruned_loss=0.05996, over 24406.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2677, pruned_loss=0.06177, over 4717205.57 frames. ], batch size: 77, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:52:06,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:06,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:52:06,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:07,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 15:52:07,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 15:52:10,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:52:13,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 15:52:17,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 15:52:19,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:52:22,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 15:52:24,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 15:52:29,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:52:29,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 15:52:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:52:32,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 15:52:34,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:52:34,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 15:52:34,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 15:52:36,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:52:36,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 15:52:36,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=413953.3333333333, ans=0.1 2023-09-29 15:52:37,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:52:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:52:39,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:42,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:52:43,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 15:52:45,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 15:52:46,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:52:49,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 15:52:50,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:52:50,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:52:57,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:52:57,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=414086.6666666667, ans=0.125 2023-09-29 15:52:59,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:52:59,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 15:53:00,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 15:53:02,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:53:04,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:53:07,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:09,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:09,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:53:12,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 15:53:12,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=414153.3333333333, ans=0.0 2023-09-29 15:53:14,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:53:14,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:20,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 15:53:22,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=414153.3333333333, ans=0.125 2023-09-29 15:53:24,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:53:24,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=414153.3333333333, ans=0.2 2023-09-29 15:53:27,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 15:53:27,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:28,621 INFO [train.py:1039] (3/4) Epoch 12, batch 3700, loss[loss=0.2012, simple_loss=0.2685, pruned_loss=0.067, over 23433.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2681, pruned_loss=0.06181, over 4731867.95 frames. ], batch size: 106, lr: 8.58e-03, grad_scale: 32.0 2023-09-29 15:53:28,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 15:53:28,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:30,804 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.903e+02 2.176e+02 2.360e+02 3.995e+02, threshold=4.353e+02, percent-clipped=0.0 2023-09-29 15:53:31,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 15:53:31,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:32,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:53:36,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:53:36,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:53:39,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:39,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 15:53:39,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:53:39,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 15:53:41,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 15:53:45,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 15:53:48,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:53:49,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:49,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 15:53:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:53:51,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:53:52,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:53:54,486 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 15:53:59,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=414286.6666666667, ans=0.125 2023-09-29 15:54:04,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:54:05,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 15:54:07,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 15:54:07,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 15:54:07,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:11,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:11,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 15:54:14,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:16,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:54:17,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:18,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 15:54:20,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 15:54:24,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:54:26,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 15:54:26,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:54:27,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 15:54:28,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.45 vs. limit=10.0 2023-09-29 15:54:31,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:54:31,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:54:35,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:36,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 15:54:38,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:54:38,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 15:54:38,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=414486.6666666667, ans=0.125 2023-09-29 15:54:39,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:39,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:54:43,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:54:44,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 15:54:46,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 15:54:46,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:54:46,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:54:48,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 15:54:48,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:54:53,607 INFO [train.py:1039] (3/4) Epoch 12, batch 3750, loss[loss=0.2053, simple_loss=0.2669, pruned_loss=0.07188, over 23641.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.269, pruned_loss=0.06243, over 4736704.77 frames. ], batch size: 232, lr: 8.57e-03, grad_scale: 32.0 2023-09-29 15:54:53,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:54:55,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:54:56,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:54:58,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 15:54:58,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 15:55:01,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 15:55:03,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 15:55:03,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:55:04,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:05,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:55:05,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=414553.3333333333, ans=0.2 2023-09-29 15:55:06,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:11,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:14,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 15:55:16,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:55:21,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:55:23,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 15:55:24,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:26,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:55:28,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 15:55:31,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.72 vs. limit=10.0 2023-09-29 15:55:34,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 15:55:34,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:55:34,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:55:37,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:55:42,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:44,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 15:55:48,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=414753.3333333333, ans=0.05 2023-09-29 15:55:49,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 15:55:52,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:55:54,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=414753.3333333333, ans=0.125 2023-09-29 15:55:56,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 15:55:56,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:56:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 15:56:03,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 15:56:04,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 15:56:06,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 15:56:07,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 15:56:09,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 15:56:15,898 INFO [train.py:1039] (3/4) Epoch 12, batch 3800, loss[loss=0.2181, simple_loss=0.2748, pruned_loss=0.08065, over 23733.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.27, pruned_loss=0.06314, over 4715555.51 frames. ], batch size: 164, lr: 8.57e-03, grad_scale: 8.0 2023-09-29 15:56:19,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 15:56:21,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.598e+02 2.017e+02 2.225e+02 2.467e+02 3.965e+02, threshold=4.450e+02, percent-clipped=0.0 2023-09-29 15:56:24,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:24,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 15:56:25,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 15:56:27,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:27,916 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.21 vs. limit=15.0 2023-09-29 15:56:29,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:29,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=414886.6666666667, ans=0.0 2023-09-29 15:56:31,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:56:33,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 15:56:33,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:34,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 15:56:36,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:56:36,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 15:56:37,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:39,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 15:56:41,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=414953.3333333333, ans=0.05 2023-09-29 15:56:44,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 15:56:44,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:56:46,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:56:49,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 15:56:49,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 15:56:49,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=415020.0, ans=0.125 2023-09-29 15:56:51,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 15:56:52,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:56:54,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:56:57,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:57:02,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 15:57:02,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 15:57:03,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:03,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=415086.6666666667, ans=0.1 2023-09-29 15:57:04,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=415086.6666666667, ans=0.125 2023-09-29 15:57:11,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:15,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:57:16,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=415086.6666666667, ans=0.125 2023-09-29 15:57:19,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 15:57:21,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 15:57:21,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:24,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:57:24,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:26,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 15:57:26,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=415153.3333333333, ans=0.125 2023-09-29 15:57:31,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 15:57:31,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 15:57:31,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:32,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:57:33,401 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.12 vs. limit=15.0 2023-09-29 15:57:38,757 INFO [train.py:1039] (3/4) Epoch 12, batch 3850, loss[loss=0.1698, simple_loss=0.2459, pruned_loss=0.04687, over 24635.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2676, pruned_loss=0.06245, over 4719051.84 frames. ], batch size: 60, lr: 8.57e-03, grad_scale: 4.0 2023-09-29 15:57:38,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:57:40,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 15:57:40,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=415220.0, ans=0.125 2023-09-29 15:57:46,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 15:57:47,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 15:57:47,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 15:57:47,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:57:48,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=415220.0, ans=0.125 2023-09-29 15:57:53,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 15:57:55,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:57:57,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 15:57:57,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 15:58:05,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:07,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:58:10,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:10,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 15:58:13,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:13,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 15:58:15,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:15,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 15:58:15,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:17,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:19,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:19,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 15:58:20,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 15:58:22,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 15:58:22,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:58:22,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:25,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:27,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:27,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 15:58:30,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 15:58:32,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:34,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 15:58:36,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 15:58:37,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=415420.0, ans=0.125 2023-09-29 15:58:38,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=415420.0, ans=0.05 2023-09-29 15:58:42,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:43,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:58:46,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:58:48,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 15:58:50,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 15:58:52,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=415486.6666666667, ans=0.125 2023-09-29 15:58:53,138 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.38 vs. limit=12.0 2023-09-29 15:58:53,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:58:57,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 15:58:57,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 15:58:58,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:58:58,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=415486.6666666667, ans=0.125 2023-09-29 15:59:00,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:00,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 15:59:00,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 15:59:01,979 INFO [train.py:1039] (3/4) Epoch 12, batch 3900, loss[loss=0.1807, simple_loss=0.2553, pruned_loss=0.05303, over 23325.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2668, pruned_loss=0.06215, over 4715804.68 frames. ], batch size: 119, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 15:59:02,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 15:59:03,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 15:59:03,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:03,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:04,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.44 vs. limit=10.0 2023-09-29 15:59:05,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 15:59:05,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:06,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 15:59:07,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 15:59:07,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 15:59:07,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:07,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 15:59:07,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=415553.3333333333, ans=0.09899494936611666 2023-09-29 15:59:08,536 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.940e+02 2.154e+02 2.415e+02 3.457e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-29 15:59:08,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:12,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:15,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:15,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 15:59:16,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 15:59:18,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 15:59:18,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:21,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 15:59:21,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 15:59:21,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:25,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 15:59:25,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 15:59:27,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 15:59:30,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 15:59:33,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:35,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 15:59:35,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 15:59:35,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 15:59:42,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 15:59:44,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 15:59:45,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 15:59:45,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 15:59:46,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.48 vs. limit=6.0 2023-09-29 15:59:47,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 15:59:51,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=415753.3333333333, ans=0.1 2023-09-29 15:59:54,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 15:59:54,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 15:59:58,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-29 16:00:00,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=415753.3333333333, ans=0.1 2023-09-29 16:00:03,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:00:05,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:00:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:17,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 16:00:20,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 16:00:20,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:00:22,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 16:00:24,273 INFO [train.py:1039] (3/4) Epoch 12, batch 3950, loss[loss=0.1988, simple_loss=0.2642, pruned_loss=0.06675, over 23467.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2669, pruned_loss=0.06191, over 4731884.08 frames. ], batch size: 134, lr: 8.56e-03, grad_scale: 8.0 2023-09-29 16:00:24,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:00:24,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 16:00:32,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:00:33,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 16:00:33,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:00:36,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:00:37,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:00:44,298 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 16:00:45,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:45,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 16:00:45,852 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 16:00:46,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:00:48,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:48,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:00:48,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:00:53,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 16:00:55,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:00:55,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=415953.3333333333, ans=0.125 2023-09-29 16:00:56,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:00:56,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:00:56,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:00:58,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:01:08,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:01:08,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:01:16,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 16:01:22,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 16:01:22,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 16:01:23,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:01:23,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:01:32,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:01:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:01:33,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:01:33,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:01:33,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 16:01:35,325 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:01:40,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:01:42,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:01:46,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 16:01:47,923 INFO [train.py:1039] (3/4) Epoch 12, batch 4000, loss[loss=0.1954, simple_loss=0.2613, pruned_loss=0.06478, over 23455.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2676, pruned_loss=0.06219, over 4733541.77 frames. ], batch size: 134, lr: 8.56e-03, grad_scale: 16.0 2023-09-29 16:01:55,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 2.007e+02 2.286e+02 2.878e+02 4.961e+02, threshold=4.572e+02, percent-clipped=2.0 2023-09-29 16:01:55,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:01,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:08,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:02:08,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:02:08,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 16:02:09,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:02:10,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 16:02:10,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:02:10,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 16:02:13,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:02:13,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=416286.6666666667, ans=0.125 2023-09-29 16:02:14,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:02:14,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:02:14,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:02:15,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=416286.6666666667, ans=0.125 2023-09-29 16:02:16,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:16,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:02:18,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:02:21,471 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 16:02:21,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:02:23,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:25,357 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 16:02:27,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:02:27,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:35,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=416353.3333333333, ans=0.125 2023-09-29 16:02:37,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 16:02:37,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:02:40,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:02:41,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 16:02:43,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:02:43,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 16:02:43,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:02:45,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:02:45,804 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=13.56 vs. limit=15.0 2023-09-29 16:02:46,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:02:46,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:02:46,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:02:49,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 16:02:49,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:02:51,410 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 16:02:56,903 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:02:58,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:02:58,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=416486.6666666667, ans=0.1 2023-09-29 16:03:01,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 16:03:04,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:03:04,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:05,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:03:08,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:11,729 INFO [train.py:1039] (3/4) Epoch 12, batch 4050, loss[loss=0.1896, simple_loss=0.2503, pruned_loss=0.06443, over 23816.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2677, pruned_loss=0.06226, over 4726503.63 frames. ], batch size: 179, lr: 8.55e-03, grad_scale: 16.0 2023-09-29 16:03:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:03:14,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:03:14,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 16:03:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:03:16,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:03:19,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:19,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=416553.3333333333, ans=0.0 2023-09-29 16:03:19,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=416553.3333333333, ans=0.0 2023-09-29 16:03:21,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:21,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=416553.3333333333, ans=0.09899494936611666 2023-09-29 16:03:25,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:03:27,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:03:29,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 16:03:30,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:03:31,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:03:32,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=416620.0, ans=0.1 2023-09-29 16:03:36,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:03:38,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:03:41,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:03:42,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 16:03:43,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=416686.6666666667, ans=0.125 2023-09-29 16:03:44,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 16:03:47,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:03:50,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=416686.6666666667, ans=0.0 2023-09-29 16:03:52,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 16:03:53,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:03:56,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:03:59,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:04:01,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:01,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:04:04,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:04:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 16:04:07,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:04:09,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:11,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 16:04:15,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=416753.3333333333, ans=0.125 2023-09-29 16:04:16,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:04:23,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 16:04:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:04:24,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:04:26,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 16:04:26,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 16:04:26,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:26,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=416820.0, ans=0.125 2023-09-29 16:04:29,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:04:30,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:30,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:04:33,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=416886.6666666667, ans=0.5 2023-09-29 16:04:34,261 INFO [train.py:1039] (3/4) Epoch 12, batch 4100, loss[loss=0.2152, simple_loss=0.2877, pruned_loss=0.07134, over 24002.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2684, pruned_loss=0.06265, over 4705336.11 frames. ], batch size: 86, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:04:37,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 16:04:40,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 16:04:40,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 16:04:42,527 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 2.020e+02 2.338e+02 2.754e+02 3.996e+02, threshold=4.676e+02, percent-clipped=0.0 2023-09-29 16:04:42,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 16:04:42,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:44,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:44,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:45,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:04:47,227 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 16:04:49,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:51,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:04:51,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:04:52,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:04:54,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=416953.3333333333, ans=0.2 2023-09-29 16:04:55,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:04:57,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:04:57,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:04:57,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 16:04:58,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:04:59,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:04:59,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:04:59,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:04:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 16:05:04,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:05,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 16:05:07,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:05:08,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:05:08,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 16:05:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:05:10,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:05:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:05:13,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 16:05:15,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:05:16,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:05:18,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 16:05:20,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:05:20,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:23,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:24,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=417086.6666666667, ans=0.025 2023-09-29 16:05:29,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:32,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:35,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:05:42,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:05:42,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:05:48,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:05:50,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:05:52,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:05:53,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:05:55,456 INFO [train.py:1039] (3/4) Epoch 12, batch 4150, loss[loss=0.211, simple_loss=0.266, pruned_loss=0.07798, over 19592.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2685, pruned_loss=0.06273, over 4710975.12 frames. ], batch size: 388, lr: 8.55e-03, grad_scale: 8.0 2023-09-29 16:05:55,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:05:55,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:05:58,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 16:05:59,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:05:59,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 16:06:00,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 16:06:01,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 16:06:02,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:06:06,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:06:06,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:11,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:12,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:14,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:06:16,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:06:17,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:06:17,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:06:21,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.07 vs. limit=6.0 2023-09-29 16:06:22,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:06:27,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:29,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 16:06:30,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 16:06:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:06:31,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 16:06:31,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:06:31,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:32,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=417353.3333333333, ans=0.125 2023-09-29 16:06:34,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten.whitening_limit, batch_count=417353.3333333333, ans=15.0 2023-09-29 16:06:35,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:36,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:06:41,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 16:06:46,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:06:46,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:06:47,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 16:06:48,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:06:50,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.38 vs. limit=6.0 2023-09-29 16:06:50,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 16:06:52,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:06:52,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=417420.0, ans=0.09899494936611666 2023-09-29 16:06:55,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:06:55,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:06:57,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 16:06:57,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:06:57,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:07:00,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:07:03,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 16:07:03,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:03,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:07:05,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:07:06,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 16:07:07,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:07:07,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 16:07:08,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:07:10,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:07:10,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 16:07:10,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:07:14,436 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.57 vs. limit=15.0 2023-09-29 16:07:16,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:07:18,204 INFO [train.py:1039] (3/4) Epoch 12, batch 4200, loss[loss=0.2141, simple_loss=0.2983, pruned_loss=0.06494, over 24614.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2677, pruned_loss=0.06273, over 4707218.31 frames. ], batch size: 68, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:07:18,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 16:07:20,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:07:22,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:25,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:07:26,353 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 1.937e+02 2.271e+02 2.682e+02 4.339e+02, threshold=4.541e+02, percent-clipped=0.0 2023-09-29 16:07:26,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:26,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:07:29,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 16:07:32,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 16:07:32,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:35,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:37,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:07:39,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:07:41,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:07:42,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:42,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 16:07:42,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:07:45,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.57 vs. limit=15.0 2023-09-29 16:07:45,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:45,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:07:46,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:07:49,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:07:49,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 16:07:49,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:07:54,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:07:56,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:07:59,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:08:00,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:02,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:08:02,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 16:08:02,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:05,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:08:08,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:08:10,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:17,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:08:18,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 16:08:20,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:08:27,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:08:27,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:30,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 16:08:30,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.88 vs. limit=15.0 2023-09-29 16:08:34,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:08:39,285 INFO [train.py:1039] (3/4) Epoch 12, batch 4250, loss[loss=0.2, simple_loss=0.2764, pruned_loss=0.06179, over 24102.00 frames. ], tot_loss[loss=0.1967, simple_loss=0.2675, pruned_loss=0.06294, over 4722543.66 frames. ], batch size: 80, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:08:39,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:08:39,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:08:41,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:48,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:08:49,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 16:08:49,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:08:51,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:08:52,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=417886.6666666667, ans=0.125 2023-09-29 16:08:57,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:02,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:02,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:02,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=417953.3333333333, ans=0.125 2023-09-29 16:09:04,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:09:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:05,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:05,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:07,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:09,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:09:10,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:12,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 16:09:16,041 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.24 vs. limit=12.0 2023-09-29 16:09:17,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 16:09:18,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:19,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:09:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:09:21,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:09:21,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:21,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:09:26,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:09:26,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:09:28,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=418086.6666666667, ans=0.0 2023-09-29 16:09:28,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=418086.6666666667, ans=0.125 2023-09-29 16:09:31,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:33,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:33,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 16:09:34,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:09:34,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 16:09:36,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:09:37,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:09:40,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:40,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:09:43,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 16:09:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:09:45,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:09:50,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:09:51,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:09:53,617 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.36 vs. limit=15.0 2023-09-29 16:09:54,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:09:55,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:09:57,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:09:59,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:09:59,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:09:59,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 16:10:01,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:03,571 INFO [train.py:1039] (3/4) Epoch 12, batch 4300, loss[loss=0.196, simple_loss=0.2447, pruned_loss=0.07367, over 19388.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2676, pruned_loss=0.06281, over 4718319.38 frames. ], batch size: 388, lr: 8.54e-03, grad_scale: 8.0 2023-09-29 16:10:07,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=418220.0, ans=0.1 2023-09-29 16:10:08,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:10:08,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:11,196 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.977e+02 2.264e+02 2.605e+02 3.860e+02, threshold=4.528e+02, percent-clipped=0.0 2023-09-29 16:10:11,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:10:18,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:10:18,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 16:10:20,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:10:22,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:10:22,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:10:22,258 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 16:10:25,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:10:29,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:10:32,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 16:10:32,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:10:34,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 16:10:36,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:10:38,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:10:42,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:10:42,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:10:42,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:10:44,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:45,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:10:45,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 16:10:45,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 16:10:48,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:10:50,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:10:50,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:10:50,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:10:50,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 16:10:50,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 16:10:52,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 16:10:53,015 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.20 vs. limit=15.0 2023-09-29 16:10:53,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:10:53,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 16:10:53,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 16:10:55,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:10:57,084 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 16:10:59,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:11:02,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:02,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:11:04,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 16:11:06,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:11:06,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:06,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:06,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:08,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:11:09,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:11:13,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:13,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:11:14,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:11:20,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 16:11:22,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:11:25,246 INFO [train.py:1039] (3/4) Epoch 12, batch 4350, loss[loss=0.2168, simple_loss=0.2863, pruned_loss=0.07363, over 23206.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2681, pruned_loss=0.06292, over 4715206.13 frames. ], batch size: 105, lr: 8.53e-03, grad_scale: 8.0 2023-09-29 16:11:25,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:27,645 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.29 vs. limit=12.0 2023-09-29 16:11:28,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:30,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:11:30,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:11:34,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:11:34,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.21 vs. limit=15.0 2023-09-29 16:11:39,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:11:43,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:11:44,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:11:46,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:11:49,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:11:50,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:11:56,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 16:11:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:11:58,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:04,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:07,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 16:12:11,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:12,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:12:18,176 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 16:12:19,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:19,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:12:21,257 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 16:12:21,381 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 16:12:21,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:22,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:12:22,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:12:24,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:12:25,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:12:25,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:12:30,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 16:12:30,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:30,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:30,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 16:12:31,902 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 16:12:31,909 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 16:12:31,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 16:12:35,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:12:35,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:12:35,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:12:36,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:12:38,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 16:12:41,323 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 16:12:41,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:47,066 INFO [train.py:1039] (3/4) Epoch 12, batch 4400, loss[loss=0.1718, simple_loss=0.2534, pruned_loss=0.04515, over 24327.00 frames. ], tot_loss[loss=0.1972, simple_loss=0.269, pruned_loss=0.06269, over 4726831.75 frames. ], batch size: 61, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:12:47,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:47,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:12:50,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:12:50,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 16:12:52,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 16:12:52,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 16:12:52,361 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 16:12:54,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:12:54,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:12:55,961 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 1.963e+02 2.169e+02 2.661e+02 4.171e+02, threshold=4.339e+02, percent-clipped=0.0 2023-09-29 16:12:56,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 16:12:59,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:00,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:00,919 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 16:13:05,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:05,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 16:13:05,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 16:13:08,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=418953.3333333333, ans=0.2 2023-09-29 16:13:09,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 16:13:10,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 16:13:10,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 16:13:10,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:11,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:11,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=418953.3333333333, ans=0.1 2023-09-29 16:13:13,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:13:14,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:16,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 16:13:16,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 16:13:17,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:19,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:13:19,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:21,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:22,256 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.17 vs. limit=22.5 2023-09-29 16:13:23,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:13:23,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 16:13:24,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 16:13:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:13:35,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:13:36,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 16:13:38,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=419086.6666666667, ans=0.2 2023-09-29 16:13:41,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:13:43,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:13:46,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=419086.6666666667, ans=0.0 2023-09-29 16:13:47,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:13:47,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 16:13:47,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:13:47,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:13:47,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:13:49,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:13:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 16:13:56,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.max_abs, batch_count=419153.3333333333, ans=10.0 2023-09-29 16:13:57,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 16:13:59,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 16:13:59,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:13:59,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 16:14:01,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:14:04,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:14:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 16:14:09,901 INFO [train.py:1039] (3/4) Epoch 12, batch 4450, loss[loss=0.194, simple_loss=0.2629, pruned_loss=0.06253, over 23655.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2688, pruned_loss=0.0626, over 4729087.91 frames. ], batch size: 149, lr: 8.53e-03, grad_scale: 16.0 2023-09-29 16:14:12,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:14:14,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:14,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:14:23,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:24,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:14:25,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=419286.6666666667, ans=0.1 2023-09-29 16:14:26,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:28,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:14:32,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:14:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:35,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 16:14:35,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:37,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:14:37,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:14:37,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:14:38,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:14:39,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=419286.6666666667, ans=0.0 2023-09-29 16:14:42,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=419353.3333333333, ans=0.1 2023-09-29 16:14:45,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:45,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:14:47,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:14:47,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:14:49,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:14:53,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:14:55,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 16:14:56,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 16:14:56,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:14:58,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:14:58,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 16:15:02,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:15:06,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:08,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 16:15:08,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:10,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:15:10,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:15:10,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:15:13,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:15:15,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 16:15:17,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:15:20,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:15:22,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:15:23,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:24,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:15:25,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:15:28,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 16:15:30,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:15:31,809 INFO [train.py:1039] (3/4) Epoch 12, batch 4500, loss[loss=0.1834, simple_loss=0.2734, pruned_loss=0.04674, over 24462.00 frames. ], tot_loss[loss=0.197, simple_loss=0.2688, pruned_loss=0.06255, over 4731310.14 frames. ], batch size: 69, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:15:35,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:37,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 16:15:37,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 16:15:38,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:15:40,284 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.947e+02 2.224e+02 2.499e+02 3.956e+02, threshold=4.448e+02, percent-clipped=0.0 2023-09-29 16:15:44,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:15:44,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:15:45,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:15:46,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=419553.3333333333, ans=0.125 2023-09-29 16:15:47,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:15:47,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:47,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:15:51,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=419620.0, ans=0.125 2023-09-29 16:15:54,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=419620.0, ans=0.0 2023-09-29 16:16:00,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.43 vs. limit=15.0 2023-09-29 16:16:00,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:16:00,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:16:03,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:16:04,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=419686.6666666667, ans=0.0 2023-09-29 16:16:05,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:16:05,927 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:16:13,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:16:18,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:16:23,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:16:27,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:16:27,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 16:16:27,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:27,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:16:30,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:16:33,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:16:34,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 16:16:34,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:16:34,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:39,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:16:39,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:16:43,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:16:45,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:16:46,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:16:48,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 16:16:50,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 16:16:50,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 16:16:54,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 16:16:56,302 INFO [train.py:1039] (3/4) Epoch 12, batch 4550, loss[loss=0.2124, simple_loss=0.2796, pruned_loss=0.07261, over 23319.00 frames. ], tot_loss[loss=0.1962, simple_loss=0.2673, pruned_loss=0.06258, over 4736218.24 frames. ], batch size: 119, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:16:58,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=419886.6666666667, ans=0.125 2023-09-29 16:16:59,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 16:16:59,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:02,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:04,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:17:04,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=419886.6666666667, ans=0.125 2023-09-29 16:17:07,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:11,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:17:13,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:17:14,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.82 vs. limit=15.0 2023-09-29 16:17:15,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:15,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:17:15,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:18,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:18,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:17:22,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:17:24,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 16:17:26,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 16:17:26,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:17:28,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 16:17:29,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=420020.0, ans=0.125 2023-09-29 16:17:33,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 16:17:34,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:35,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=420020.0, ans=0.125 2023-09-29 16:17:36,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 16:17:37,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:17:42,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:44,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:17:46,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 16:17:46,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:17:49,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:17:49,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:17:52,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:53,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 16:17:55,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 16:17:55,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:17:57,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 16:17:57,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 16:17:57,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:17:59,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:17:59,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:01,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:01,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:18:01,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=420153.3333333333, ans=0.0 2023-09-29 16:18:03,456 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=15.0 2023-09-29 16:18:04,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:18:04,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 16:18:06,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:18:06,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:18:07,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 16:18:07,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:18:07,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 16:18:10,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:18:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:18:12,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=420153.3333333333, ans=0.125 2023-09-29 16:18:13,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:18:14,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:18:14,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:18:17,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:18:18,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:18:19,371 INFO [train.py:1039] (3/4) Epoch 12, batch 4600, loss[loss=0.1695, simple_loss=0.2493, pruned_loss=0.04485, over 24477.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2659, pruned_loss=0.06208, over 4732825.79 frames. ], batch size: 63, lr: 8.52e-03, grad_scale: 16.0 2023-09-29 16:18:22,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:23,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:18:25,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:18:25,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:18:26,890 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.954e+02 2.198e+02 2.471e+02 4.636e+02, threshold=4.396e+02, percent-clipped=1.0 2023-09-29 16:18:27,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:27,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 16:18:28,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:18:34,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=420286.6666666667, ans=0.125 2023-09-29 16:18:36,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:18:36,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:39,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:47,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 16:18:47,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:50,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:18:52,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:18:52,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:18:57,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 16:18:57,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:18:59,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:01,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=420353.3333333333, ans=0.0 2023-09-29 16:19:04,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:04,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:19:06,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:19:11,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 16:19:13,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:19:13,570 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:19:18,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:19,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:19:22,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:22,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 16:19:22,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:22,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=420420.0, ans=0.2 2023-09-29 16:19:24,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 16:19:24,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:24,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:19:26,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:19:27,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:29,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 16:19:29,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=420486.6666666667, ans=0.125 2023-09-29 16:19:30,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 16:19:30,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 16:19:30,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:32,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:32,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:34,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:19:43,607 INFO [train.py:1039] (3/4) Epoch 12, batch 4650, loss[loss=0.1927, simple_loss=0.2387, pruned_loss=0.07333, over 19068.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2648, pruned_loss=0.06157, over 4723025.05 frames. ], batch size: 388, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:19:45,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:19:49,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:19:51,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:51,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:19:51,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:19:52,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:19:54,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:19:57,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 16:20:01,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:20:03,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 16:20:03,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:20:03,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 16:20:05,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:20:05,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 16:20:05,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 16:20:05,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:20:10,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:20:11,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:11,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 16:20:12,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=420620.0, ans=0.0 2023-09-29 16:20:14,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=420686.6666666667, ans=0.2 2023-09-29 16:20:16,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:17,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 16:20:20,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:22,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:20:22,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 16:20:24,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:20:28,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:20:30,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:30,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=420753.3333333333, ans=0.0 2023-09-29 16:20:35,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:20:39,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:20:39,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:20:44,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 16:20:44,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 16:20:44,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 16:20:44,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 16:20:47,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:20:55,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:20:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:20:55,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 16:20:55,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:20:58,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:20:58,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:21:00,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:21:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:21:03,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:21:03,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:21:04,939 INFO [train.py:1039] (3/4) Epoch 12, batch 4700, loss[loss=0.1791, simple_loss=0.2564, pruned_loss=0.05092, over 24336.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2657, pruned_loss=0.06176, over 4720615.11 frames. ], batch size: 61, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:21:09,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:09,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:21:10,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:21:11,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 16:21:11,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:21:12,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 16:21:14,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.872e+02 2.064e+02 2.331e+02 3.087e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-29 16:21:20,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:21,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:21:22,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:21:23,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:24,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:21:28,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=420953.3333333333, ans=0.125 2023-09-29 16:21:29,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 16:21:29,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 16:21:29,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=420953.3333333333, ans=0.125 2023-09-29 16:21:33,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:33,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:21:34,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:21:37,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:21:38,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=421020.0, ans=0.0 2023-09-29 16:21:44,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=421020.0, ans=0.125 2023-09-29 16:21:45,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:21:45,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 16:21:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:21:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 16:21:57,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:22:00,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:02,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=421086.6666666667, ans=0.125 2023-09-29 16:22:03,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.80 vs. limit=15.0 2023-09-29 16:22:04,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 16:22:04,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:22:08,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=421086.6666666667, ans=0.0 2023-09-29 16:22:09,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:22:09,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 16:22:09,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=421153.3333333333, ans=0.125 2023-09-29 16:22:09,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=421153.3333333333, ans=0.125 2023-09-29 16:22:12,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:12,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:14,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:22:15,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:22:15,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 16:22:17,126 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 16:22:17,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=421153.3333333333, ans=0.125 2023-09-29 16:22:18,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:20,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:20,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 16:22:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:22:25,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 16:22:27,019 INFO [train.py:1039] (3/4) Epoch 12, batch 4750, loss[loss=0.2132, simple_loss=0.2747, pruned_loss=0.07591, over 23412.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2664, pruned_loss=0.06185, over 4713966.17 frames. ], batch size: 285, lr: 8.51e-03, grad_scale: 8.0 2023-09-29 16:22:28,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:22:30,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:22:34,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:22:36,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 16:22:37,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:22:41,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 16:22:42,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:22:44,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:22:44,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:49,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 16:22:52,836 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:22:53,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:22:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 16:22:55,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:22:59,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.83 vs. limit=15.0 2023-09-29 16:23:02,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:23:02,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:03,162 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-09-29 16:23:04,421 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 16:23:04,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 16:23:08,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.29 vs. limit=22.5 2023-09-29 16:23:11,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 16:23:14,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:14,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:16,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:23:16,653 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 16:23:16,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:19,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:23:22,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:23:24,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 16:23:25,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 16:23:26,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:23:27,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:23:27,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:23:29,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:23:29,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 16:23:33,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 16:23:35,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=421486.6666666667, ans=0.125 2023-09-29 16:23:36,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:23:40,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:23:40,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 16:23:40,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:23:41,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:23:44,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:23:45,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:23:45,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:23:47,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=421486.6666666667, ans=0.125 2023-09-29 16:23:50,785 INFO [train.py:1039] (3/4) Epoch 12, batch 4800, loss[loss=0.2689, simple_loss=0.3143, pruned_loss=0.1118, over 19218.00 frames. ], tot_loss[loss=0.1966, simple_loss=0.2679, pruned_loss=0.06263, over 4707226.45 frames. ], batch size: 388, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:23:50,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:50,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 16:23:52,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 16:23:52,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 16:23:55,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:23:55,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:23:57,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 16:23:59,992 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 2.054e+02 2.346e+02 2.832e+02 5.942e+02, threshold=4.692e+02, percent-clipped=5.0 2023-09-29 16:24:03,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:04,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:07,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:24:08,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:09,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:09,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 16:24:09,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:24:10,291 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=15.0 2023-09-29 16:24:11,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:24:11,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:24:17,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:18,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.18 vs. limit=15.0 2023-09-29 16:24:18,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:20,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:24:20,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:20,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=421620.0, ans=0.0 2023-09-29 16:24:21,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 16:24:21,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:23,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:24,577 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-09-29 16:24:25,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:24:26,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.40 vs. limit=15.0 2023-09-29 16:24:28,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:24:30,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:24:31,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:24:33,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=421686.6666666667, ans=0.0 2023-09-29 16:24:34,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 16:24:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 16:24:36,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:24:36,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:24:37,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:24:37,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:37,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:24:39,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:24:39,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:24:42,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:24:46,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:48,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:24:52,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=421753.3333333333, ans=0.125 2023-09-29 16:24:53,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 16:24:53,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:24:55,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:24:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:24:56,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:01,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:25:01,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:25:01,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:01,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:25:02,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:25:02,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:25:06,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=421820.0, ans=0.1 2023-09-29 16:25:07,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:08,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:08,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:25:10,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 16:25:11,790 INFO [train.py:1039] (3/4) Epoch 12, batch 4850, loss[loss=0.1779, simple_loss=0.2544, pruned_loss=0.05065, over 24304.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2694, pruned_loss=0.06347, over 4703208.12 frames. ], batch size: 61, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:25:12,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 16:25:12,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:25:12,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.25 vs. limit=15.0 2023-09-29 16:25:13,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:13,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:16,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:25:21,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=20.89 vs. limit=22.5 2023-09-29 16:25:24,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 16:25:27,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:33,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:33,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:25:34,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:25:37,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:25:39,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:25:40,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:25:42,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 16:25:45,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:25:47,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=422020.0, ans=0.125 2023-09-29 16:25:48,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:25:48,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:25:48,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:25:48,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 16:25:50,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:25:50,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:25:55,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 16:25:55,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 16:25:57,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:26:06,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:26:07,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 16:26:09,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:26:09,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:26:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:26:12,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 16:26:12,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:13,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.59 vs. limit=22.5 2023-09-29 16:26:13,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 16:26:13,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:16,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 16:26:24,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:26:25,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=422153.3333333333, ans=0.07 2023-09-29 16:26:30,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:26:32,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:26:34,911 INFO [train.py:1039] (3/4) Epoch 12, batch 4900, loss[loss=0.1872, simple_loss=0.2636, pruned_loss=0.05538, over 24677.00 frames. ], tot_loss[loss=0.1974, simple_loss=0.2688, pruned_loss=0.06299, over 4711260.89 frames. ], batch size: 65, lr: 8.50e-03, grad_scale: 16.0 2023-09-29 16:26:38,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 16:26:38,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:26:43,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:26:43,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:26:43,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=422220.0, ans=0.0 2023-09-29 16:26:44,649 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.724e+02 2.050e+02 2.285e+02 2.620e+02 3.714e+02, threshold=4.569e+02, percent-clipped=0.0 2023-09-29 16:26:44,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:26:48,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 16:26:52,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 16:26:57,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 16:26:58,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 16:26:58,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:26:59,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:27:00,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:00,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:00,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:27:00,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 16:27:05,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 16:27:06,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:27:08,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:27:08,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:27:11,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:27:12,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:12,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=422353.3333333333, ans=0.125 2023-09-29 16:27:13,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:13,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 16:27:15,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:27:15,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-09-29 16:27:16,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:27:18,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 16:27:18,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 16:27:21,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 16:27:22,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:27:24,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:27:24,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:27:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:27:25,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:27:25,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:27:25,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 16:27:26,218 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=422420.0, ans=0.0 2023-09-29 16:27:28,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:31,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.19 vs. limit=10.0 2023-09-29 16:27:31,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:27:33,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:27:36,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 16:27:38,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:27:39,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:27:39,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 16:27:46,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:48,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:27:48,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=422486.6666666667, ans=0.0 2023-09-29 16:27:49,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 16:27:49,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:27:49,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:27:49,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:27:53,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=422486.6666666667, ans=0.0 2023-09-29 16:27:54,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:27:54,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:27:54,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:27:54,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 16:27:57,478 INFO [train.py:1039] (3/4) Epoch 12, batch 4950, loss[loss=0.2315, simple_loss=0.3045, pruned_loss=0.07925, over 23974.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2682, pruned_loss=0.06277, over 4713610.37 frames. ], batch size: 80, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:27:57,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:27:59,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:00,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 16:28:03,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 16:28:03,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 16:28:04,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=422553.3333333333, ans=0.2 2023-09-29 16:28:05,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:28:06,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 16:28:06,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:06,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:28:06,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:28:06,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:09,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:11,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:28:13,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:28:13,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:28:15,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:15,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:28:19,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:28:19,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=422620.0, ans=0.0 2023-09-29 16:28:25,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:26,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.32 vs. limit=12.0 2023-09-29 16:28:27,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:28:30,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:30,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:32,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:28:33,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 16:28:35,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 16:28:35,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=422686.6666666667, ans=0.2 2023-09-29 16:28:38,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:41,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:28:41,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:28:42,085 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-09-29 16:28:43,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:28:43,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:28:43,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:28:45,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:28:46,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:28:50,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:28:53,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:28:53,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:28:54,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 16:28:54,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:28:56,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:28:59,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:00,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:29:00,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:29:00,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:01,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=422753.3333333333, ans=0.07 2023-09-29 16:29:02,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:29:02,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:29:05,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:29:05,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:29:06,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:29:08,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 16:29:11,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:16,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 16:29:18,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:29:20,396 INFO [train.py:1039] (3/4) Epoch 12, batch 5000, loss[loss=0.1863, simple_loss=0.2698, pruned_loss=0.05136, over 24514.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2671, pruned_loss=0.06181, over 4722291.41 frames. ], batch size: 71, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:29:26,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:29:26,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:28,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 16:29:28,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 16:29:28,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=422886.6666666667, ans=0.05 2023-09-29 16:29:31,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:29:32,531 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.648e+02 1.922e+02 2.238e+02 2.801e+02 3.922e+02, threshold=4.477e+02, percent-clipped=0.0 2023-09-29 16:29:32,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 16:29:32,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:29:32,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:29:34,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 16:29:35,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:35,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:29:37,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 16:29:37,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:37,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:29:39,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 16:29:39,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=422953.3333333333, ans=0.125 2023-09-29 16:29:40,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 16:29:40,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:29:40,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 16:29:40,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:29:42,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:42,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:29:42,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 16:29:42,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 16:29:42,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=422953.3333333333, ans=0.0 2023-09-29 16:29:43,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 16:29:45,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:29:45,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:46,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 16:29:47,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:29:49,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:29:51,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:29:54,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:29:57,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 16:29:57,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:29:59,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:30:01,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=423020.0, ans=0.125 2023-09-29 16:30:03,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 16:30:06,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:30:08,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:30:08,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:11,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 16:30:11,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:30:12,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:13,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:14,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 16:30:14,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:30:17,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:25,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 16:30:31,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:40,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:30:41,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:41,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:30:41,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:41,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:30:43,045 INFO [train.py:1039] (3/4) Epoch 12, batch 5050, loss[loss=0.2277, simple_loss=0.2852, pruned_loss=0.08511, over 22814.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2675, pruned_loss=0.06193, over 4721097.66 frames. ], batch size: 322, lr: 8.49e-03, grad_scale: 8.0 2023-09-29 16:30:43,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:30:43,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:30:47,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 16:30:48,735 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.23 vs. limit=6.0 2023-09-29 16:30:49,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:30:51,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:30:52,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:30:54,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 16:30:54,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:30:55,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:30:55,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=423220.0, ans=0.125 2023-09-29 16:30:57,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:30:58,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:30:58,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:31:09,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 16:31:09,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:31:11,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:11,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 16:31:12,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:15,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:31:17,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:31:17,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 16:31:18,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 16:31:20,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:21,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:23,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:31:24,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 16:31:26,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 16:31:32,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:31:32,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:31:34,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:34,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=423420.0, ans=0.0 2023-09-29 16:31:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:31:35,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:31:39,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:31:39,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:39,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:31:41,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:31:41,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 16:31:41,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:31:43,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:31:44,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=423420.0, ans=0.025 2023-09-29 16:31:46,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:31:46,197 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 16:31:46,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:31:47,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:31:49,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:49,311 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 16:31:52,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:31:52,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 16:31:52,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:56,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:31:56,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:31:58,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 16:31:58,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 16:32:00,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=423486.6666666667, ans=0.125 2023-09-29 16:32:01,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:01,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:01,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:32:03,314 INFO [train.py:1039] (3/4) Epoch 12, batch 5100, loss[loss=0.1832, simple_loss=0.2686, pruned_loss=0.04893, over 24639.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.268, pruned_loss=0.06254, over 4726254.98 frames. ], batch size: 73, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:32:04,954 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 16:32:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:32:10,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 16:32:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 16:32:10,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=423553.3333333333, ans=0.0 2023-09-29 16:32:12,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:12,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=9.93 vs. limit=22.5 2023-09-29 16:32:15,480 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.923e+02 2.120e+02 2.504e+02 4.528e+02, threshold=4.241e+02, percent-clipped=1.0 2023-09-29 16:32:15,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:32:16,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=423553.3333333333, ans=0.0 2023-09-29 16:32:18,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:32:20,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 16:32:20,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 16:32:25,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:32:25,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:32:25,777 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.88 vs. limit=22.5 2023-09-29 16:32:28,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:32:30,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 16:32:31,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:31,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=423620.0, ans=0.125 2023-09-29 16:32:33,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:32:33,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 16:32:36,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:36,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.52 vs. limit=15.0 2023-09-29 16:32:37,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:37,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 16:32:39,321 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 16:32:39,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=423686.6666666667, ans=0.125 2023-09-29 16:32:41,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:32:42,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 16:32:42,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 16:32:47,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:32:55,443 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:32:55,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=423753.3333333333, ans=0.125 2023-09-29 16:32:56,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:32:58,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 16:32:58,542 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 16:32:58,566 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 16:33:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 16:33:00,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:33:03,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 16:33:07,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 16:33:10,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 16:33:12,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:33:16,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 16:33:18,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:33:18,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 16:33:24,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:33:24,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:33:24,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:33:26,693 INFO [train.py:1039] (3/4) Epoch 12, batch 5150, loss[loss=0.2621, simple_loss=0.3148, pruned_loss=0.1047, over 19521.00 frames. ], tot_loss[loss=0.198, simple_loss=0.2694, pruned_loss=0.06332, over 4715264.63 frames. ], batch size: 388, lr: 8.48e-03, grad_scale: 8.0 2023-09-29 16:33:26,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:33:26,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:33:27,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=423886.6666666667, ans=0.2 2023-09-29 16:33:28,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:33:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 16:33:29,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 16:33:31,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 16:33:31,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:33:31,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 16:33:32,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:33,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:33:34,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:33:41,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:33:41,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 16:33:42,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:33:44,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:33:45,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:33:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:33:45,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:33:45,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:33:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:33:47,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 16:33:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:33:49,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:33:49,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=423953.3333333333, ans=0.125 2023-09-29 16:33:52,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:33:55,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 16:33:55,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:34:03,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:34:04,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=424020.0, ans=0.0 2023-09-29 16:34:05,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 16:34:08,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:15,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:17,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:19,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=424086.6666666667, ans=0.125 2023-09-29 16:34:22,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:22,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:23,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 16:34:28,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:34:29,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:34:30,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:34:33,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:34:35,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:34:35,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 16:34:39,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=424153.3333333333, ans=0.1 2023-09-29 16:34:43,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:34:43,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:34:45,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:34:45,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:34:46,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:34:46,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:34:46,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:34:48,056 INFO [train.py:1039] (3/4) Epoch 12, batch 5200, loss[loss=0.1893, simple_loss=0.2727, pruned_loss=0.05295, over 24665.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2696, pruned_loss=0.06327, over 4726464.78 frames. ], batch size: 73, lr: 8.48e-03, grad_scale: 16.0 2023-09-29 16:34:48,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:34:49,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=424220.0, ans=0.0 2023-09-29 16:34:51,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:34:51,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=424220.0, ans=0.1 2023-09-29 16:34:54,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:34:55,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:34:58,798 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.638e+02 1.937e+02 2.192e+02 2.501e+02 3.290e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-29 16:34:59,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 16:35:00,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:35:01,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:05,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:05,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:35:05,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:07,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 16:35:11,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:35:12,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:15,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 16:35:18,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:35:18,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:35:20,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 16:35:20,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 16:35:23,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 16:35:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 16:35:23,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:35:25,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:25,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:35:26,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 16:35:26,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:35:29,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:32,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 16:35:32,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 16:35:34,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 16:35:37,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=424420.0, ans=0.0 2023-09-29 16:35:38,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 16:35:38,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=424420.0, ans=0.125 2023-09-29 16:35:40,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:35:45,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:35:45,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:35:48,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 16:35:48,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:35:48,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 16:35:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:35:50,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:35:54,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:35:56,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:35:57,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:35:59,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:35:59,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:01,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.65 vs. limit=15.0 2023-09-29 16:36:05,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:05,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=424486.6666666667, ans=0.125 2023-09-29 16:36:06,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 16:36:07,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:36:07,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:36:08,362 INFO [train.py:1039] (3/4) Epoch 12, batch 5250, loss[loss=0.1712, simple_loss=0.2408, pruned_loss=0.05085, over 24365.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2693, pruned_loss=0.06268, over 4732691.22 frames. ], batch size: 56, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:36:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:10,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:36:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:36:15,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:36:16,783 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.99 vs. limit=22.5 2023-09-29 16:36:17,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:17,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:36:19,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:36:20,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=424553.3333333333, ans=0.2 2023-09-29 16:36:22,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=424553.3333333333, ans=0.0 2023-09-29 16:36:25,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:36:26,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:36:26,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:36:28,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:36:31,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 16:36:31,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:36:32,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:36:46,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=5.31 vs. limit=12.0 2023-09-29 16:36:52,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=424686.6666666667, ans=0.125 2023-09-29 16:37:10,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=424820.0, ans=0.125 2023-09-29 16:37:15,954 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=15.0 2023-09-29 16:37:23,494 INFO [train.py:1039] (3/4) Epoch 12, batch 5300, loss[loss=0.1791, simple_loss=0.2566, pruned_loss=0.05084, over 24499.00 frames. ], tot_loss[loss=0.1953, simple_loss=0.2669, pruned_loss=0.06186, over 4702855.62 frames. ], batch size: 63, lr: 8.47e-03, grad_scale: 16.0 2023-09-29 16:37:33,073 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.874e+02 2.092e+02 2.441e+02 3.524e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 16:37:38,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:37:38,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 16:37:38,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 16:37:38,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:38,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:38,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:38,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:37:39,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:39,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:37:40,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:37:40,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 16:37:40,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 16:37:40,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 16:37:40,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 16:37:40,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 16:37:40,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 16:37:40,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:41,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:41,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:41,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:41,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:37:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:42,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:37:42,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:42,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:37:42,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:37:42,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:37:42,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:42,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:37:43,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 16:37:43,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:37:44,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:37:44,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 16:37:44,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 16:37:44,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:37:44,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:37:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 16:37:44,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 16:37:45,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:45,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:37:45,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:37:46,047 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 16:37:46,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 16:37:46,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:37:46,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:37:46,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 16:37:46,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 16:37:46,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 16:37:46,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:37:57,155 INFO [train.py:1039] (3/4) Epoch 13, batch 0, loss[loss=0.2097, simple_loss=0.2953, pruned_loss=0.06206, over 24360.00 frames. ], tot_loss[loss=0.2097, simple_loss=0.2953, pruned_loss=0.06206, over 24360.00 frames. ], batch size: 77, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:37:57,155 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 16:38:10,942 INFO [train.py:1071] (3/4) Epoch 13, validation: loss=0.2695, simple_loss=0.2756, pruned_loss=0.1317, over 1125622.00 frames. 2023-09-29 16:38:10,943 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 16:38:12,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 16:38:13,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=424966.6666666667, ans=0.0 2023-09-29 16:38:13,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=424966.6666666667, ans=0.0 2023-09-29 16:38:14,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:38:15,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:38:20,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:20,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:38:22,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:22,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 16:38:24,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 16:38:27,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:28,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:33,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:38:33,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:35,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:38:35,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:36,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 16:38:38,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:38:43,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=425100.0, ans=0.125 2023-09-29 16:38:45,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:38:45,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:38:48,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 16:38:52,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:38:52,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:38:53,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:38:59,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:38:59,511 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:39:04,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:06,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=425166.6666666667, ans=0.2 2023-09-29 16:39:09,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=6.72 vs. limit=15.0 2023-09-29 16:39:11,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 16:39:13,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=425166.6666666667, ans=0.125 2023-09-29 16:39:14,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 16:39:14,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:14,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:15,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:39:16,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:39:18,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 16:39:22,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:24,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:39:27,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:39:30,521 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 16:39:32,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:39:34,209 INFO [train.py:1039] (3/4) Epoch 13, batch 50, loss[loss=0.1839, simple_loss=0.2598, pruned_loss=0.05399, over 21989.00 frames. ], tot_loss[loss=0.1979, simple_loss=0.2703, pruned_loss=0.06272, over 1059208.28 frames. ], batch size: 48, lr: 8.14e-03, grad_scale: 32.0 2023-09-29 16:39:35,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:37,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:39:37,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 16:39:39,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 16:39:39,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:39:44,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:39:48,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:39:51,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 16:39:51,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:39:58,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:40:00,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 16:40:02,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 16:40:04,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:40:06,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:06,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:06,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:07,397 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.22 vs. limit=22.5 2023-09-29 16:40:08,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:40:10,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 16:40:10,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:40:10,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=425433.3333333333, ans=0.0 2023-09-29 16:40:17,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:18,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:18,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:40:20,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 16:40:23,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:40:24,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:40:24,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 16:40:26,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:26,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=425500.0, ans=0.125 2023-09-29 16:40:27,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 16:40:35,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:40:35,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:40:38,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:40,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:40:40,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:43,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 16:40:43,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 16:40:43,735 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:40:45,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:40:46,809 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.884e+02 2.162e+02 2.621e+02 5.674e+02, threshold=4.324e+02, percent-clipped=3.0 2023-09-29 16:40:46,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:40:47,733 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.45 vs. limit=15.0 2023-09-29 16:40:48,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:40:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:40:48,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 16:40:50,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 16:40:50,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 16:40:52,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:52,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:40:54,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 16:40:54,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 16:40:54,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:40:54,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=425633.3333333333, ans=0.125 2023-09-29 16:40:55,945 INFO [train.py:1039] (3/4) Epoch 13, batch 100, loss[loss=0.174, simple_loss=0.2468, pruned_loss=0.05064, over 24430.00 frames. ], tot_loss[loss=0.1981, simple_loss=0.2705, pruned_loss=0.06287, over 1875664.56 frames. ], batch size: 58, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:40:56,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:40:57,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:40:57,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:41:01,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:41:03,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=425633.3333333333, ans=10.0 2023-09-29 16:41:04,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:41:07,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:07,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=425633.3333333333, ans=0.0 2023-09-29 16:41:09,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 16:41:09,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:41:11,270 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.97 vs. limit=15.0 2023-09-29 16:41:12,874 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.80 vs. limit=22.5 2023-09-29 16:41:13,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:41:13,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:13,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:41:13,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:41:13,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:41:15,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 16:41:15,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:41:15,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:17,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:17,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:41:18,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=425700.0, ans=0.125 2023-09-29 16:41:20,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 16:41:22,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:24,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:41:25,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:41:29,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:41:31,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=425766.6666666667, ans=0.1 2023-09-29 16:41:31,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=425766.6666666667, ans=0.1 2023-09-29 16:41:32,696 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 16:41:32,719 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 16:41:34,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:41:34,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:41:39,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:41:42,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:41:42,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:48,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:41:48,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=425833.3333333333, ans=0.0 2023-09-29 16:41:49,747 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 16:41:51,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:41:56,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:41:56,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=425833.3333333333, ans=0.2 2023-09-29 16:41:57,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:41:59,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=425900.0, ans=0.125 2023-09-29 16:42:01,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:04,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:06,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:08,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:42:10,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:12,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:13,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=425900.0, ans=0.125 2023-09-29 16:42:14,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:14,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:42:14,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:16,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 16:42:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 16:42:17,685 INFO [train.py:1039] (3/4) Epoch 13, batch 150, loss[loss=0.1901, simple_loss=0.2736, pruned_loss=0.05327, over 24027.00 frames. ], tot_loss[loss=0.1982, simple_loss=0.2706, pruned_loss=0.06288, over 2517396.11 frames. ], batch size: 80, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:42:17,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:18,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=425966.6666666667, ans=0.125 2023-09-29 16:42:18,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=425966.6666666667, ans=15.0 2023-09-29 16:42:19,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:42:19,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:19,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:19,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 16:42:19,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:42:19,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:42:19,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:20,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:22,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:24,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:42:24,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:42:27,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:42:28,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=425966.6666666667, ans=0.0 2023-09-29 16:42:29,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:42:29,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:42:31,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:34,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:42:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:37,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:42:38,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:42,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 16:42:42,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 16:42:42,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 16:42:46,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:42:46,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:42:48,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:42:48,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:42:49,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:49,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:51,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:42:52,847 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 16:42:54,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:42:59,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:01,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=15.26 vs. limit=22.5 2023-09-29 16:43:02,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:43:04,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 16:43:09,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:43:09,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:43:09,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:11,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:43:11,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=426166.6666666667, ans=0.0 2023-09-29 16:43:12,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:43:14,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:43:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:15,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 16:43:21,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=426166.6666666667, ans=0.125 2023-09-29 16:43:23,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:24,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:24,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:43:24,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:43:27,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:43:27,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 16:43:30,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:43:32,023 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.958e+02 2.151e+02 2.617e+02 4.145e+02, threshold=4.302e+02, percent-clipped=0.0 2023-09-29 16:43:32,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:43:33,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:35,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:43:35,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 16:43:36,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:43:37,442 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 16:43:40,262 INFO [train.py:1039] (3/4) Epoch 13, batch 200, loss[loss=0.216, simple_loss=0.2906, pruned_loss=0.07069, over 24443.00 frames. ], tot_loss[loss=0.1993, simple_loss=0.2716, pruned_loss=0.06348, over 3009846.60 frames. ], batch size: 77, lr: 8.13e-03, grad_scale: 32.0 2023-09-29 16:43:42,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:43:45,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:43:45,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:43:49,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 16:43:50,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=426300.0, ans=0.125 2023-09-29 16:43:51,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:43:51,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:53,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 16:43:55,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 16:43:56,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:43:58,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:00,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=426366.6666666667, ans=0.125 2023-09-29 16:44:02,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=426366.6666666667, ans=0.2 2023-09-29 16:44:03,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:44:03,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:44:05,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:18,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=426433.3333333333, ans=0.125 2023-09-29 16:44:20,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=426433.3333333333, ans=0.5 2023-09-29 16:44:22,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=426433.3333333333, ans=0.1 2023-09-29 16:44:23,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:44:24,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:44:24,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 16:44:26,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:44:26,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 16:44:26,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:44:28,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:28,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:44:28,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:28,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:44:32,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 16:44:32,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 16:44:32,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:44:39,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:44:44,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:44:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:51,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:44:57,317 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.07 vs. limit=10.0 2023-09-29 16:44:58,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:44:59,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 16:45:01,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:01,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:45:01,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:03,244 INFO [train.py:1039] (3/4) Epoch 13, batch 250, loss[loss=0.1651, simple_loss=0.2256, pruned_loss=0.05225, over 22716.00 frames. ], tot_loss[loss=0.1988, simple_loss=0.2709, pruned_loss=0.06333, over 3394027.57 frames. ], batch size: 322, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:45:03,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 16:45:03,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 16:45:05,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:45:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 16:45:08,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:10,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:45:10,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:15,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:45:16,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.27 vs. limit=12.0 2023-09-29 16:45:17,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:45:17,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:45:18,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:45:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:45:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:36,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:45:36,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:45:42,120 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=426766.6666666667, ans=0.0 2023-09-29 16:45:45,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:45:46,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 16:45:48,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:45:48,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:45:48,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:45:50,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:45:53,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:45:56,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 16:45:56,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:45:58,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:45:58,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:45:58,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:46:00,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:46:01,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:46:04,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:06,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:46:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:08,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=426833.3333333333, ans=0.125 2023-09-29 16:46:09,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:46:13,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:15,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:46:17,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=426900.0, ans=0.0 2023-09-29 16:46:20,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:21,729 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.855e+02 2.073e+02 2.477e+02 4.320e+02, threshold=4.145e+02, percent-clipped=1.0 2023-09-29 16:46:22,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.min_positive, batch_count=426900.0, ans=0.05 2023-09-29 16:46:23,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:46:23,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=426900.0, ans=0.125 2023-09-29 16:46:26,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 16:46:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:46:29,460 INFO [train.py:1039] (3/4) Epoch 13, batch 300, loss[loss=0.195, simple_loss=0.2728, pruned_loss=0.05866, over 23342.00 frames. ], tot_loss[loss=0.1968, simple_loss=0.2686, pruned_loss=0.06243, over 3691359.89 frames. ], batch size: 93, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:46:29,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 16:46:31,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 16:46:32,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 16:46:33,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:46:33,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 16:46:39,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:46:40,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:46:43,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:46:43,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 16:46:45,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:46:45,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 16:46:45,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 16:46:45,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:46:50,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:46:50,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=427033.3333333333, ans=0.125 2023-09-29 16:46:55,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:46:56,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 16:47:01,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 16:47:01,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:02,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:06,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:06,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 16:47:06,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:47:09,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:47:12,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:47:12,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:17,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 16:47:18,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 16:47:18,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:47:22,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:23,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 16:47:25,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:28,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:47:33,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:47:33,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 16:47:37,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:37,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 16:47:40,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:42,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:47:42,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 16:47:42,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:47:43,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:47:45,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 16:47:46,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:47:47,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:48,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:47:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:47:50,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:47:51,850 INFO [train.py:1039] (3/4) Epoch 13, batch 350, loss[loss=0.1944, simple_loss=0.2757, pruned_loss=0.05649, over 24570.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2671, pruned_loss=0.06217, over 3915471.69 frames. ], batch size: 71, lr: 8.12e-03, grad_scale: 32.0 2023-09-29 16:47:55,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:47:55,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 16:47:58,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:05,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:48:07,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:07,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:10,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 16:48:12,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:12,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 16:48:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:15,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 16:48:15,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:18,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 16:48:20,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:48:22,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:48:22,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=427366.6666666667, ans=0.0 2023-09-29 16:48:23,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:48:25,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:48:26,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:48:26,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:27,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:48:27,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=427433.3333333333, ans=0.125 2023-09-29 16:48:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:48:28,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:36,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:48:36,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:48:37,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:48:37,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:44,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 16:48:45,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:48:49,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:48:49,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:48:50,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:48:50,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 16:48:54,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:48:54,706 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 16:48:57,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 16:48:57,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:00,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:49:00,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 16:49:02,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:04,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 16:49:05,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:07,238 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.930e+02 2.101e+02 2.393e+02 3.670e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-29 16:49:07,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:07,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:07,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=427566.6666666667, ans=0.125 2023-09-29 16:49:10,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:49:10,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=427566.6666666667, ans=0.0 2023-09-29 16:49:13,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:49:15,492 INFO [train.py:1039] (3/4) Epoch 13, batch 400, loss[loss=0.17, simple_loss=0.2497, pruned_loss=0.04514, over 24692.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2663, pruned_loss=0.06166, over 4092686.75 frames. ], batch size: 65, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:49:15,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 16:49:17,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 16:49:17,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:17,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:19,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:49:21,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:24,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:26,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:29,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 16:49:30,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 16:49:30,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:32,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 16:49:32,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:49:38,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:38,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 16:49:38,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:49:38,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:49:38,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:49:40,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:49:42,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=427700.0, ans=0.125 2023-09-29 16:49:43,549 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 16:49:43,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 16:49:47,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:47,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=427766.6666666667, ans=0.0 2023-09-29 16:49:48,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:49:50,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:49:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 16:49:52,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 16:49:52,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=427766.6666666667, ans=0.125 2023-09-29 16:49:54,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=427766.6666666667, ans=0.2 2023-09-29 16:49:55,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:50:00,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:07,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 16:50:08,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 16:50:11,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 16:50:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:50:15,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:50:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 16:50:18,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:50:21,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 16:50:23,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:50:25,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:27,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 16:50:27,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=427900.0, ans=0.1 2023-09-29 16:50:28,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 16:50:30,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 16:50:31,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 16:50:31,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:50:33,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 16:50:35,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=427900.0, ans=0.125 2023-09-29 16:50:36,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:50:37,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:50:38,469 INFO [train.py:1039] (3/4) Epoch 13, batch 450, loss[loss=0.1987, simple_loss=0.2673, pruned_loss=0.06503, over 23969.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.2667, pruned_loss=0.06203, over 4235973.36 frames. ], batch size: 196, lr: 8.11e-03, grad_scale: 32.0 2023-09-29 16:50:38,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:50:40,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 16:50:40,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:50:40,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:50:41,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:50:43,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 16:50:43,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:50:43,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 16:50:46,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 16:50:58,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:50:58,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:00,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 16:51:01,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 16:51:03,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:51:06,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.21 vs. limit=15.0 2023-09-29 16:51:08,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:09,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:13,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:13,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:51:14,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-09-29 16:51:18,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 16:51:18,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 16:51:21,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 16:51:21,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:51:22,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:51:22,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:51:25,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 16:51:25,131 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 16:51:26,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:51:26,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:51:27,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=428166.6666666667, ans=0.0 2023-09-29 16:51:28,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 16:51:31,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:51:31,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 16:51:31,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 16:51:33,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 16:51:36,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.93 vs. limit=15.0 2023-09-29 16:51:36,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:39,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 16:51:40,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:51:40,503 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=428166.6666666667, ans=0.0 2023-09-29 16:51:41,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 16:51:45,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 16:51:46,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 16:51:48,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 16:51:49,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 16:51:52,576 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.942e+02 2.289e+02 2.754e+02 3.873e+02, threshold=4.578e+02, percent-clipped=0.0 2023-09-29 16:51:55,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:51:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:00,614 INFO [train.py:1039] (3/4) Epoch 13, batch 500, loss[loss=0.1849, simple_loss=0.2656, pruned_loss=0.05208, over 24637.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2676, pruned_loss=0.06197, over 4339713.73 frames. ], batch size: 68, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:52:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:52:00,781 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 16:52:03,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:05,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:52:05,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:05,463 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 16:52:07,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 16:52:07,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:52:10,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 16:52:15,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 16:52:17,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 16:52:20,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:52:20,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:52:20,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:33,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:33,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:52:34,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 16:52:34,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:34,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 16:52:34,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 16:52:38,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:52:39,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 16:52:39,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 16:52:39,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:52:40,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 16:52:43,546 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 16:52:49,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:52:50,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:50,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=428500.0, ans=0.125 2023-09-29 16:52:51,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:51,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:52:53,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 16:52:54,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 16:52:58,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 16:53:00,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:03,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:04,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=428566.6666666667, ans=0.125 2023-09-29 16:53:05,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=428566.6666666667, ans=0.0 2023-09-29 16:53:06,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:53:13,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:14,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 16:53:14,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:14,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:53:17,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 16:53:19,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:53:20,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:23,477 INFO [train.py:1039] (3/4) Epoch 13, batch 550, loss[loss=0.1757, simple_loss=0.2532, pruned_loss=0.04905, over 24348.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2679, pruned_loss=0.06203, over 4429767.62 frames. ], batch size: 56, lr: 8.11e-03, grad_scale: 16.0 2023-09-29 16:53:26,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 16:53:28,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 16:53:30,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:30,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 16:53:31,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:53:31,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:53:31,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:33,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:53:33,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:53:36,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:53:37,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 16:53:39,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:53:41,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.20 vs. limit=22.5 2023-09-29 16:53:43,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:53:44,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:47,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:53:47,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:53:51,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=428700.0, ans=0.125 2023-09-29 16:53:52,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 16:53:54,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 16:53:54,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:54:03,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:54:03,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:03,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:54:07,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:07,960 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 16:54:08,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:54:09,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 16:54:12,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 16:54:12,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 16:54:12,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:54:14,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 16:54:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 16:54:17,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:19,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:54:19,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:54:19,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:54:20,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:54:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 16:54:25,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:54:25,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:25,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 16:54:27,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:54:29,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:31,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:54:32,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:54:34,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 16:54:34,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 16:54:39,420 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 1.968e+02 2.209e+02 2.597e+02 3.344e+02, threshold=4.418e+02, percent-clipped=0.0 2023-09-29 16:54:39,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 16:54:44,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 16:54:44,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:54:44,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=428966.6666666667, ans=0.0 2023-09-29 16:54:45,779 INFO [train.py:1039] (3/4) Epoch 13, batch 600, loss[loss=0.1955, simple_loss=0.2826, pruned_loss=0.05422, over 24585.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2687, pruned_loss=0.06255, over 4495007.67 frames. ], batch size: 71, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:54:45,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:54:45,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:54:46,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=428966.6666666667, ans=0.0 2023-09-29 16:54:52,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:54:52,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 16:54:54,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 16:54:55,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 16:54:59,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:02,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:05,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 16:55:06,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:55:13,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=429033.3333333333, ans=0.025 2023-09-29 16:55:14,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 16:55:18,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:55:18,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:18,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:55:25,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:55:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:55:27,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:32,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=429100.0, ans=0.0 2023-09-29 16:55:33,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:55:39,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:55:39,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:55:39,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:55:44,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.99 vs. limit=15.0 2023-09-29 16:55:48,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 16:55:54,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 16:55:55,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:55:56,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=429233.3333333333, ans=0.1 2023-09-29 16:55:58,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 16:55:58,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 16:56:00,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=429233.3333333333, ans=0.1 2023-09-29 16:56:01,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 16:56:03,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 16:56:03,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 16:56:05,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=429233.3333333333, ans=0.0 2023-09-29 16:56:08,407 INFO [train.py:1039] (3/4) Epoch 13, batch 650, loss[loss=0.1925, simple_loss=0.2813, pruned_loss=0.05184, over 24282.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2677, pruned_loss=0.06217, over 4555910.56 frames. ], batch size: 74, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:56:08,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 16:56:08,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=429300.0, ans=0.125 2023-09-29 16:56:12,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 16:56:13,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:56:14,061 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 16:56:15,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:56:17,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:17,810 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.88 vs. limit=15.0 2023-09-29 16:56:20,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 16:56:20,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:56:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 16:56:28,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:30,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:34,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 16:56:34,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:56:36,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:56:40,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:56:40,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 16:56:42,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=429433.3333333333, ans=15.0 2023-09-29 16:56:45,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:45,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:45,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:56:46,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:48,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 16:56:49,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 16:56:50,020 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 16:56:50,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:56:50,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:56:55,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:56:55,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.82 vs. limit=15.0 2023-09-29 16:56:56,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:56:56,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:56:58,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 16:56:58,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 16:56:59,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 16:56:59,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 16:57:01,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 16:57:01,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:57:02,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 16:57:03,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 16:57:04,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 16:57:04,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:04,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:57:04,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:57:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:57:07,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:57:14,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:14,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:57:18,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:57:19,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:20,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=429566.6666666667, ans=0.2 2023-09-29 16:57:20,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=429566.6666666667, ans=0.0 2023-09-29 16:57:21,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 16:57:21,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:57:21,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-09-29 16:57:24,849 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.054e+02 2.276e+02 2.735e+02 4.255e+02, threshold=4.551e+02, percent-clipped=0.0 2023-09-29 16:57:28,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 16:57:28,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:28,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:28,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=429566.6666666667, ans=0.0 2023-09-29 16:57:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:57:31,309 INFO [train.py:1039] (3/4) Epoch 13, batch 700, loss[loss=0.1908, simple_loss=0.273, pruned_loss=0.05429, over 24495.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2658, pruned_loss=0.06167, over 4595630.15 frames. ], batch size: 66, lr: 8.10e-03, grad_scale: 16.0 2023-09-29 16:57:34,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 16:57:35,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 16:57:39,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 16:57:40,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:42,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:57:45,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 16:57:49,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:57:50,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:57:53,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:57:53,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=429700.0, ans=0.1 2023-09-29 16:57:56,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 16:57:56,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:57:56,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=429700.0, ans=0.125 2023-09-29 16:57:59,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.72 vs. limit=10.0 2023-09-29 16:58:01,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:58:01,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=429700.0, ans=0.125 2023-09-29 16:58:04,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 16:58:04,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 16:58:04,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=429766.6666666667, ans=0.125 2023-09-29 16:58:05,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 16:58:07,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 16:58:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 16:58:12,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:58:13,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 16:58:18,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 16:58:18,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 16:58:24,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:24,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 16:58:24,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 16:58:29,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 16:58:29,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:58:31,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=429833.3333333333, ans=0.1 2023-09-29 16:58:32,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:58:37,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:38,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 16:58:40,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 16:58:43,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 16:58:43,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 16:58:45,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:45,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=429900.0, ans=0.125 2023-09-29 16:58:50,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:58:50,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:58:53,025 INFO [train.py:1039] (3/4) Epoch 13, batch 750, loss[loss=0.2063, simple_loss=0.2863, pruned_loss=0.06313, over 24416.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2656, pruned_loss=0.06111, over 4621256.21 frames. ], batch size: 69, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 16:58:53,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:58:53,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 16:58:57,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 16:58:57,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 16:58:57,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 16:58:57,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 16:58:57,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 16:58:59,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 16:59:00,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 16:59:02,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:04,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:06,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=429966.6666666667, ans=0.125 2023-09-29 16:59:06,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=429966.6666666667, ans=10.0 2023-09-29 16:59:07,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:08,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:08,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 16:59:08,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:11,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 16:59:13,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 16:59:15,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 16:59:16,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:16,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 16:59:16,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 16:59:18,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 16:59:19,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:20,737 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.59 vs. limit=15.0 2023-09-29 16:59:21,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 16:59:24,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 16:59:26,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 16:59:26,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 16:59:27,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 16:59:27,991 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 16:59:29,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 16:59:29,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 16:59:29,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 16:59:33,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 16:59:43,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 16:59:43,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:43,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 16:59:44,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 16:59:45,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 16:59:46,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 16:59:46,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 16:59:48,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 16:59:49,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 16:59:51,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 16:59:52,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 16:59:54,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 16:59:58,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 16:59:59,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 16:59:59,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:00,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.56 vs. limit=15.0 2023-09-29 17:00:03,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:00:08,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 17:00:08,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:09,478 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.868e+02 2.099e+02 2.383e+02 3.939e+02, threshold=4.199e+02, percent-clipped=0.0 2023-09-29 17:00:09,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:13,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:13,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=430233.3333333333, ans=0.125 2023-09-29 17:00:15,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:15,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:00:16,487 INFO [train.py:1039] (3/4) Epoch 13, batch 800, loss[loss=0.1959, simple_loss=0.2791, pruned_loss=0.05634, over 24343.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2662, pruned_loss=0.06133, over 4643436.79 frames. ], batch size: 77, lr: 8.09e-03, grad_scale: 32.0 2023-09-29 17:00:22,239 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.19 vs. limit=6.0 2023-09-29 17:00:24,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:00:24,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:25,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:00:25,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:26,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:27,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:27,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=430300.0, ans=0.1 2023-09-29 17:00:30,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:31,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=430366.6666666667, ans=0.1 2023-09-29 17:00:35,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:36,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:00:39,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 17:00:41,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:41,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:00:42,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.65 vs. limit=15.0 2023-09-29 17:00:43,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:00:43,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:43,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 17:00:43,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:45,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 17:00:47,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:00:49,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=430433.3333333333, ans=0.125 2023-09-29 17:00:50,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:00:51,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:00:52,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:00:53,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:53,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:00:59,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:00:59,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:00:59,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 17:01:00,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=430433.3333333333, ans=0.0 2023-09-29 17:01:01,561 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 17:01:01,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 17:01:01,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:01:01,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:03,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:03,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:08,541 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 17:01:09,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 17:01:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:01:13,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:01:16,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:01:21,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:22,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 17:01:23,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:01:25,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 17:01:30,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:33,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:01:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 17:01:33,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:01:33,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=430566.6666666667, ans=0.125 2023-09-29 17:01:34,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:36,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 17:01:36,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:37,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=430633.3333333333, ans=0.125 2023-09-29 17:01:38,241 INFO [train.py:1039] (3/4) Epoch 13, batch 850, loss[loss=0.1847, simple_loss=0.2558, pruned_loss=0.05681, over 24428.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2672, pruned_loss=0.06219, over 4642069.35 frames. ], batch size: 58, lr: 8.09e-03, grad_scale: 16.0 2023-09-29 17:01:38,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:01:39,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:41,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:01:42,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:01:45,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 17:01:45,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 17:01:45,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 17:01:47,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:01:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:01:51,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:01:52,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:01:52,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:01:53,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=430633.3333333333, ans=0.125 2023-09-29 17:01:56,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:01:56,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:01:57,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 17:02:02,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 17:02:03,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:02:05,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 17:02:08,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 17:02:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 17:02:13,602 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 17:02:13,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:13,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:02:13,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:02:17,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:18,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:20,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 17:02:20,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=430766.6666666667, ans=0.0 2023-09-29 17:02:21,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:02:21,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:24,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:02:24,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:02:27,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:02:27,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:02:27,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 17:02:31,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:02:31,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:32,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:02:32,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:32,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=430833.3333333333, ans=0.125 2023-09-29 17:02:34,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:02:36,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.27 vs. limit=15.0 2023-09-29 17:02:37,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:02:38,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:02:40,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:02:41,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:02:41,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:02:42,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=430833.3333333333, ans=0.1 2023-09-29 17:02:51,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:02:51,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:02:53,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 17:02:53,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:02:53,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:02:56,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=430900.0, ans=0.125 2023-09-29 17:02:57,236 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.595e+02 2.022e+02 2.303e+02 2.741e+02 5.777e+02, threshold=4.606e+02, percent-clipped=1.0 2023-09-29 17:02:57,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 17:03:02,590 INFO [train.py:1039] (3/4) Epoch 13, batch 900, loss[loss=0.1919, simple_loss=0.2727, pruned_loss=0.05559, over 24384.00 frames. ], tot_loss[loss=0.1965, simple_loss=0.2682, pruned_loss=0.0624, over 4663366.41 frames. ], batch size: 77, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:03:05,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:03:07,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:07,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 17:03:09,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=430966.6666666667, ans=0.0 2023-09-29 17:03:10,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:03:10,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 17:03:11,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:03:13,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:03:13,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:14,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:03:14,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:03:26,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:03:26,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:03:26,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:03:29,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:03:31,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=431033.3333333333, ans=0.1 2023-09-29 17:03:34,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 17:03:37,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:03:42,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:03:42,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:03:43,984 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 17:03:45,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 17:03:53,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:03:53,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:03:53,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:03:59,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:01,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:03,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 17:04:03,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:04:08,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 17:04:09,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:04:11,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:13,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:04:13,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:18,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 17:04:18,237 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 17:04:19,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:04:19,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 17:04:21,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:22,896 INFO [train.py:1039] (3/4) Epoch 13, batch 950, loss[loss=0.2103, simple_loss=0.2642, pruned_loss=0.07824, over 23713.00 frames. ], tot_loss[loss=0.196, simple_loss=0.2676, pruned_loss=0.06217, over 4677243.25 frames. ], batch size: 212, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:04:24,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 17:04:29,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:31,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:33,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:04:35,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=431300.0, ans=0.0 2023-09-29 17:04:38,170 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 17:04:40,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:04:41,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:04:43,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:04:43,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:04:43,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 17:04:45,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:04:47,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:47,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 17:04:47,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=431366.6666666667, ans=0.0 2023-09-29 17:04:48,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:04:53,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:04:53,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:04:54,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 17:04:56,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:04:58,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:05:01,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:05:07,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:07,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:05:09,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=431433.3333333333, ans=0.0 2023-09-29 17:05:11,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 17:05:11,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=431500.0, ans=0.0 2023-09-29 17:05:14,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:05:14,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:05:15,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:16,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:16,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:05:16,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=431500.0, ans=0.0 2023-09-29 17:05:19,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 17:05:19,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:05:24,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:05:25,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:26,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 17:05:26,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:26,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:05:26,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 17:05:26,981 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:05:31,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:05:34,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:05:37,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=431566.6666666667, ans=0.1 2023-09-29 17:05:40,316 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.121e+02 2.375e+02 2.805e+02 4.363e+02, threshold=4.749e+02, percent-clipped=0.0 2023-09-29 17:05:40,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:05:41,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 17:05:41,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 17:05:46,076 INFO [train.py:1039] (3/4) Epoch 13, batch 1000, loss[loss=0.1905, simple_loss=0.2361, pruned_loss=0.07242, over 19573.00 frames. ], tot_loss[loss=0.1959, simple_loss=0.2666, pruned_loss=0.06258, over 4657931.75 frames. ], batch size: 388, lr: 8.08e-03, grad_scale: 16.0 2023-09-29 17:05:46,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:05:48,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 17:05:48,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:05:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:05:55,142 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=431633.3333333333, ans=0.125 2023-09-29 17:05:56,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 17:05:56,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 17:06:01,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:01,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:06:03,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:06,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 17:06:10,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 17:06:11,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 17:06:12,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:14,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 17:06:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 17:06:17,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 17:06:17,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:19,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:22,541 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=431766.6666666667, ans=0.0 2023-09-29 17:06:27,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:28,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:06:28,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:29,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.93 vs. limit=15.0 2023-09-29 17:06:30,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:06:30,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 17:06:31,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:06:32,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:06:33,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:06:33,478 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 17:06:38,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 17:06:39,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 17:06:41,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 17:06:44,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:06:44,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=431833.3333333333, ans=0.09899494936611666 2023-09-29 17:06:51,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:51,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:06:51,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:06:54,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:06:56,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 17:06:57,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:06:58,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 17:07:00,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 17:07:00,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:00,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:07:01,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:07:04,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:07:07,639 INFO [train.py:1039] (3/4) Epoch 13, batch 1050, loss[loss=0.2127, simple_loss=0.2787, pruned_loss=0.07333, over 23843.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2648, pruned_loss=0.06181, over 4668881.13 frames. ], batch size: 195, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:07:07,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:11,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:07:11,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:07:14,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:07:15,783 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.54 vs. limit=10.0 2023-09-29 17:07:16,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:18,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:07:21,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:07:24,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:07:24,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:07:24,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=432033.3333333333, ans=0.125 2023-09-29 17:07:26,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:07:26,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:07:27,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 17:07:28,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:28,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 17:07:33,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:07:33,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 17:07:33,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:07:33,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=432033.3333333333, ans=0.125 2023-09-29 17:07:39,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:07:41,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:07:41,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:07:42,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 17:07:44,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 17:07:44,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:07:48,048 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:07:49,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 17:07:50,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 17:07:52,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:07:55,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:07:57,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:07:57,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:07:57,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:08:02,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:08:05,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 17:08:09,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 17:08:09,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 17:08:09,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:09,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:08:10,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 17:08:15,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:08:18,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:08:18,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:18,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:18,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,668 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.979e+02 2.212e+02 2.486e+02 3.871e+02, threshold=4.425e+02, percent-clipped=0.0 2023-09-29 17:08:24,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:08:24,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 17:08:26,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:08:26,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 17:08:26,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 17:08:27,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:08:29,425 INFO [train.py:1039] (3/4) Epoch 13, batch 1100, loss[loss=0.1978, simple_loss=0.2665, pruned_loss=0.06457, over 23543.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2645, pruned_loss=0.06139, over 4684355.49 frames. ], batch size: 256, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:08:30,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:08:34,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=432300.0, ans=0.1 2023-09-29 17:08:36,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:08:38,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=432300.0, ans=0.125 2023-09-29 17:08:41,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:08:43,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:08:43,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:43,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 17:08:45,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:08:48,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:08:49,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:08:53,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:08:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 17:08:54,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:08:54,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:08:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:08:58,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:08:58,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=432366.6666666667, ans=0.1 2023-09-29 17:08:59,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:09:05,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:09:05,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=432433.3333333333, ans=0.1 2023-09-29 17:09:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 17:09:10,238 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 17:09:10,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:13,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:15,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:09:15,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:09:17,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 17:09:17,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:09:18,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:09:18,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:09:18,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:18,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 17:09:24,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:09:26,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 17:09:28,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:09:31,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:09:36,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 17:09:36,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:09:38,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:09:39,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:09:41,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:41,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 17:09:42,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:09:42,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:09:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 17:09:43,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:09:45,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 17:09:45,714 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.38 vs. limit=22.5 2023-09-29 17:09:48,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:09:48,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:09:50,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:09:53,089 INFO [train.py:1039] (3/4) Epoch 13, batch 1150, loss[loss=0.1816, simple_loss=0.2545, pruned_loss=0.0543, over 24477.00 frames. ], tot_loss[loss=0.1954, simple_loss=0.266, pruned_loss=0.06233, over 4674703.10 frames. ], batch size: 58, lr: 8.07e-03, grad_scale: 16.0 2023-09-29 17:09:54,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:09:55,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=432633.3333333333, ans=15.0 2023-09-29 17:09:57,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:10:01,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:01,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:10:01,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 17:10:02,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:04,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 17:10:07,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:07,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:10:12,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 17:10:15,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:20,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:10:21,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:21,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 17:10:21,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:10:23,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:10:27,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 17:10:28,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:10:29,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=432766.6666666667, ans=0.125 2023-09-29 17:10:30,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:10:36,628 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.56 vs. limit=15.0 2023-09-29 17:10:38,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:45,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:10:46,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 17:10:46,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:46,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:10:53,214 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 17:10:54,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:03,411 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 17:11:06,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:07,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.31 vs. limit=15.0 2023-09-29 17:11:08,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:11:08,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=432900.0, ans=0.125 2023-09-29 17:11:09,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.852e+02 2.092e+02 2.448e+02 3.672e+02, threshold=4.183e+02, percent-clipped=0.0 2023-09-29 17:11:09,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:11:09,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:11:14,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:15,569 INFO [train.py:1039] (3/4) Epoch 13, batch 1200, loss[loss=0.1925, simple_loss=0.272, pruned_loss=0.05654, over 24670.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.267, pruned_loss=0.06259, over 4688149.74 frames. ], batch size: 68, lr: 8.07e-03, grad_scale: 32.0 2023-09-29 17:11:17,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:11:17,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:11:18,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:18,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:20,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:11:21,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:11:22,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=432966.6666666667, ans=0.2 2023-09-29 17:11:23,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:11:24,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:11:25,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:11:26,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=432966.6666666667, ans=0.1 2023-09-29 17:11:29,237 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 17:11:31,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 17:11:36,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:11:39,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:11:40,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=433033.3333333333, ans=0.125 2023-09-29 17:11:41,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:11:43,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:11:43,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 17:11:43,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=433033.3333333333, ans=0.125 2023-09-29 17:11:45,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:11:51,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:11:51,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:11:53,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 17:11:54,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:11:58,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=433100.0, ans=0.125 2023-09-29 17:11:59,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 17:12:01,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=433100.0, ans=0.035 2023-09-29 17:12:03,581 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.91 vs. limit=8.0 2023-09-29 17:12:03,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 17:12:04,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:12:05,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:12:07,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:09,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:12:11,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:12:11,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:12:13,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:12:13,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 17:12:13,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:12:15,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:15,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:12:18,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:18,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:12:22,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:12:24,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:12:27,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 17:12:32,417 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 17:12:34,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:12:36,883 INFO [train.py:1039] (3/4) Epoch 13, batch 1250, loss[loss=0.1964, simple_loss=0.277, pruned_loss=0.05787, over 24333.00 frames. ], tot_loss[loss=0.1973, simple_loss=0.2678, pruned_loss=0.0634, over 4694675.61 frames. ], batch size: 74, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:12:37,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:12:38,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:12:40,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:12:41,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 17:12:47,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:12:47,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:12:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 17:12:50,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:12:52,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:12:57,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=433366.6666666667, ans=0.125 2023-09-29 17:12:59,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:12:59,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:01,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:13:01,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:02,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:13:04,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:13:04,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:13:04,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:06,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:13:06,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:09,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:10,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:13:12,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=433433.3333333333, ans=0.125 2023-09-29 17:13:16,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 17:13:17,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:13:21,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:21,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 17:13:22,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:13:22,810 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 17:13:23,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=433433.3333333333, ans=0.125 2023-09-29 17:13:24,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:24,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:29,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:13:35,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:13:37,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 17:13:37,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 17:13:37,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 17:13:39,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:13:39,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 17:13:39,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:13:42,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:13:44,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:13:45,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 17:13:45,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:13:47,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:13:47,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:13:48,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:13:50,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 17:13:52,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:13:54,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:13:55,653 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.895e+02 2.072e+02 2.279e+02 3.563e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-29 17:13:55,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:13:57,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:14:00,452 INFO [train.py:1039] (3/4) Epoch 13, batch 1300, loss[loss=0.2017, simple_loss=0.2567, pruned_loss=0.0733, over 23786.00 frames. ], tot_loss[loss=0.1969, simple_loss=0.2679, pruned_loss=0.06295, over 4711215.19 frames. ], batch size: 212, lr: 8.06e-03, grad_scale: 32.0 2023-09-29 17:14:02,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:14:02,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 17:14:07,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:08,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=433633.3333333333, ans=0.125 2023-09-29 17:14:10,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:14:11,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:13,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:14:13,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:14:14,260 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.52 vs. limit=15.0 2023-09-29 17:14:15,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 17:14:19,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:14:21,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:14:23,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 17:14:24,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=433700.0, ans=0.125 2023-09-29 17:14:24,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=433700.0, ans=0.125 2023-09-29 17:14:27,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:14:31,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:32,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=433766.6666666667, ans=0.0 2023-09-29 17:14:33,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:33,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:14:33,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=433766.6666666667, ans=0.05 2023-09-29 17:14:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:14:36,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:14:38,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 17:14:38,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 17:14:46,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:14:46,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:14:46,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 17:14:48,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:14:48,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:14:51,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:14:51,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 17:14:52,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 17:14:55,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:14:59,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:14:59,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:15:02,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 17:15:04,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 17:15:06,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 17:15:11,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:15:11,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=433900.0, ans=0.2 2023-09-29 17:15:14,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 17:15:15,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:22,686 INFO [train.py:1039] (3/4) Epoch 13, batch 1350, loss[loss=0.1885, simple_loss=0.2326, pruned_loss=0.07215, over 19439.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2672, pruned_loss=0.06254, over 4709067.05 frames. ], batch size: 388, lr: 8.06e-03, grad_scale: 16.0 2023-09-29 17:15:22,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 17:15:25,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:28,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:33,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:15:33,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:15:36,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:15:36,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:40,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:15:42,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 17:15:44,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:15:44,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:15:46,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 17:15:47,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:15:49,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:15:49,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 17:15:50,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 17:15:53,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 17:15:53,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:15:55,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 17:16:07,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:16:18,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:19,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 17:16:22,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:23,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 17:16:23,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:16:23,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:16:28,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:16:30,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 17:16:31,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten.whitening_limit, batch_count=434233.3333333333, ans=22.5 2023-09-29 17:16:31,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:16:37,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 17:16:39,144 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=15.0 2023-09-29 17:16:40,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 17:16:41,595 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.935e+02 2.126e+02 2.533e+02 4.347e+02, threshold=4.251e+02, percent-clipped=1.0 2023-09-29 17:16:43,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 17:16:43,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=434300.0, ans=0.125 2023-09-29 17:16:44,928 INFO [train.py:1039] (3/4) Epoch 13, batch 1400, loss[loss=0.1699, simple_loss=0.2107, pruned_loss=0.06458, over 19232.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2643, pruned_loss=0.06148, over 4700814.54 frames. ], batch size: 389, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:16:45,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=434300.0, ans=0.125 2023-09-29 17:16:47,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:16:50,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:16:52,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:16:56,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 17:16:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 17:17:10,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:17:11,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:14,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:17:14,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:17:18,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:17:20,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 17:17:20,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=434433.3333333333, ans=0.0 2023-09-29 17:17:30,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:30,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:34,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 17:17:34,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:17:36,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:17:37,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:17:39,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:17:39,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:17:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:17:39,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=434500.0, ans=0.0 2023-09-29 17:17:41,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:17:43,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 17:17:43,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:17:47,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:17:51,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:17:55,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=434566.6666666667, ans=0.07 2023-09-29 17:18:01,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 17:18:03,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 17:18:03,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:18:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 17:18:07,624 INFO [train.py:1039] (3/4) Epoch 13, batch 1450, loss[loss=0.1854, simple_loss=0.2501, pruned_loss=0.06035, over 23445.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2636, pruned_loss=0.06085, over 4703181.27 frames. ], batch size: 134, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:18:07,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:12,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:18:13,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:18:14,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.47 vs. limit=15.0 2023-09-29 17:18:17,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:18:17,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:17,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 17:18:22,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:22,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:18:25,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:18:25,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 17:18:26,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=434700.0, ans=0.0 2023-09-29 17:18:27,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:18:27,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 17:18:29,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:29,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:29,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 17:18:31,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:18:33,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:18:33,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 17:18:33,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:34,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:18:36,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:37,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=434700.0, ans=10.0 2023-09-29 17:18:38,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:42,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:18:44,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:18:45,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:18:45,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:48,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:18:48,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:18:48,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:18:50,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:18:53,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 17:18:55,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:18:55,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=434833.3333333333, ans=0.0 2023-09-29 17:18:58,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=434833.3333333333, ans=0.125 2023-09-29 17:19:00,129 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 17:19:02,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:03,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:19:03,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:05,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 17:19:08,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=434833.3333333333, ans=0.1 2023-09-29 17:19:09,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:11,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 17:19:14,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 17:19:15,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:17,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:18,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:19:19,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=434900.0, ans=0.125 2023-09-29 17:19:20,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 17:19:22,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 17:19:23,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 17:19:25,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:19:25,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:19:27,163 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.823e+02 1.982e+02 2.343e+02 3.097e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-29 17:19:29,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.85 vs. limit=15.0 2023-09-29 17:19:30,425 INFO [train.py:1039] (3/4) Epoch 13, batch 1500, loss[loss=0.1907, simple_loss=0.275, pruned_loss=0.05324, over 24467.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2643, pruned_loss=0.06108, over 4697284.75 frames. ], batch size: 69, lr: 8.05e-03, grad_scale: 16.0 2023-09-29 17:19:36,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.04 vs. limit=10.0 2023-09-29 17:19:39,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 17:19:39,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:19:39,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:19:40,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:19:40,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=434966.6666666667, ans=0.125 2023-09-29 17:19:42,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:42,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:19:44,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 17:19:45,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:19:45,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:19:45,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:19:47,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:19:47,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:19:49,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:19:54,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 17:19:55,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:19:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:19:57,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:01,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 17:20:06,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 17:20:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:20:08,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 17:20:08,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=435100.0, ans=0.1 2023-09-29 17:20:11,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:20:13,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:14,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:20:16,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:20:18,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 17:20:18,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:20:18,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 17:20:20,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:20:20,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=435166.6666666667, ans=0.1 2023-09-29 17:20:26,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:20:26,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 17:20:32,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:20:34,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:20:37,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_na.min_abs, batch_count=435233.3333333333, ans=0.02 2023-09-29 17:20:39,147 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 17:20:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:40,619 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 17:20:40,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:20:42,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:20:42,783 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 17:20:44,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:20:47,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 17:20:49,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:52,182 INFO [train.py:1039] (3/4) Epoch 13, batch 1550, loss[loss=0.1987, simple_loss=0.2748, pruned_loss=0.06134, over 23334.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2656, pruned_loss=0.06141, over 4704082.96 frames. ], batch size: 93, lr: 8.04e-03, grad_scale: 16.0 2023-09-29 17:20:54,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:54,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:54,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:20:56,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:20:56,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:20:57,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 17:20:57,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 17:20:59,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:21:00,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 17:21:00,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 17:21:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:04,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:05,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:05,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:21:07,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:21:10,258 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 17:21:10,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:10,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:21:10,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:21:15,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:21:15,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 17:21:16,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:21:16,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 17:21:19,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 17:21:19,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 17:21:19,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:25,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.46 vs. limit=15.0 2023-09-29 17:21:25,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:21:27,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 17:21:27,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 17:21:27,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=435433.3333333333, ans=0.0 2023-09-29 17:21:35,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:40,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:21:40,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:21:40,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:21:40,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 17:21:45,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.99 vs. limit=10.0 2023-09-29 17:21:46,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:21:49,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:51,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:21:54,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:21:54,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:21:55,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 17:21:55,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:21:57,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:21:58,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:21:58,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:21:58,722 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 17:22:01,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:05,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=435566.6666666667, ans=0.0 2023-09-29 17:22:05,421 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.47 vs. limit=15.0 2023-09-29 17:22:08,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 17:22:09,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=15.0 2023-09-29 17:22:11,511 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 2.000e+02 2.251e+02 2.787e+02 4.721e+02, threshold=4.502e+02, percent-clipped=2.0 2023-09-29 17:22:11,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:13,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:22:14,596 INFO [train.py:1039] (3/4) Epoch 13, batch 1600, loss[loss=0.2034, simple_loss=0.2593, pruned_loss=0.0737, over 23762.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2669, pruned_loss=0.06238, over 4701634.61 frames. ], batch size: 195, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:22:14,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 17:22:16,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:22:16,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:22:16,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:22:16,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:22:18,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:22:18,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=435633.3333333333, ans=0.0 2023-09-29 17:22:22,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:23,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 17:22:24,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 17:22:26,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 17:22:28,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:22:30,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 17:22:31,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:22:34,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:22:38,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=435700.0, ans=0.1 2023-09-29 17:22:39,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:22:40,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=435700.0, ans=0.1 2023-09-29 17:22:44,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 17:22:45,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=435700.0, ans=0.0 2023-09-29 17:22:46,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:22:46,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 17:22:46,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:22:47,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 17:22:51,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 17:22:56,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=435766.6666666667, ans=0.125 2023-09-29 17:22:58,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:00,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 17:23:01,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:23:02,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:02,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:23:06,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 17:23:09,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 17:23:11,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:23:12,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.37 vs. limit=10.0 2023-09-29 17:23:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:13,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=435833.3333333333, ans=0.0 2023-09-29 17:23:14,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:14,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:23:16,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:23:17,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:23:19,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:23:25,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:27,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:23:30,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 17:23:30,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:23:30,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 17:23:35,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:37,175 INFO [train.py:1039] (3/4) Epoch 13, batch 1650, loss[loss=0.1727, simple_loss=0.2507, pruned_loss=0.04738, over 19033.00 frames. ], tot_loss[loss=0.1963, simple_loss=0.2674, pruned_loss=0.06253, over 4687366.17 frames. ], batch size: 41, lr: 8.04e-03, grad_scale: 32.0 2023-09-29 17:23:37,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:23:37,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:23:37,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 17:23:38,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 17:23:38,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 17:23:38,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 17:23:43,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:23:43,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:45,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:23:45,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:23:48,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:23:50,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 17:23:53,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:23:54,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:23:54,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:23:54,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:23:54,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 17:23:54,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 17:23:59,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:24:01,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:24:01,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=436033.3333333333, ans=0.125 2023-09-29 17:24:11,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 17:24:13,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:14,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 17:24:18,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:20,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:24:20,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:24:21,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:23,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:24:23,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:26,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:24:27,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:29,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:29,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:30,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:30,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:24:36,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:24:36,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 17:24:39,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:24:39,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 17:24:40,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=8.62 vs. limit=15.0 2023-09-29 17:24:41,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 17:24:41,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 17:24:43,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:24:43,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:24:43,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:44,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:24:44,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 17:24:50,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:24:51,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:24:51,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:24:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 17:24:56,551 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.665e+02 2.021e+02 2.239e+02 2.775e+02 4.189e+02, threshold=4.478e+02, percent-clipped=0.0 2023-09-29 17:24:59,915 INFO [train.py:1039] (3/4) Epoch 13, batch 1700, loss[loss=0.2051, simple_loss=0.288, pruned_loss=0.06104, over 24369.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2671, pruned_loss=0.06211, over 4696331.90 frames. ], batch size: 77, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:25:00,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:25:00,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:25:00,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 17:25:00,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:00,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:25:00,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:01,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=436300.0, ans=0.2 2023-09-29 17:25:03,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:25:04,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:25:04,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 17:25:07,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:25:16,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:25:18,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:25:22,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=436366.6666666667, ans=0.0 2023-09-29 17:25:24,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:25:24,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:26,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:25:26,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:29,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 17:25:33,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:25:33,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:33,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:25:34,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:25:36,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 17:25:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 17:25:39,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:40,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 17:25:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:25:49,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=436500.0, ans=0.125 2023-09-29 17:25:53,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:25:53,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:25:54,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:25:55,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=436500.0, ans=0.125 2023-09-29 17:25:56,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:25:56,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 17:25:57,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:25:59,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:25:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 17:26:01,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:01,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:01,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:01,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:04,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:26:04,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:04,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:26:05,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:09,148 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.60 vs. limit=15.0 2023-09-29 17:26:09,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:11,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 17:26:14,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:26:16,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:26:17,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 17:26:22,895 INFO [train.py:1039] (3/4) Epoch 13, batch 1750, loss[loss=0.2028, simple_loss=0.2822, pruned_loss=0.06172, over 23588.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.266, pruned_loss=0.06169, over 4711481.34 frames. ], batch size: 94, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:26:24,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:26,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:28,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:26:28,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 17:26:28,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:26:32,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:26:32,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:26:39,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 17:26:40,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:26:43,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 17:26:44,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:26:45,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:26:48,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:26:50,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 17:26:52,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:26:52,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 17:27:00,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:27:00,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=436766.6666666667, ans=0.0 2023-09-29 17:27:04,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:04,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:07,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:27:07,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=436766.6666666667, ans=0.125 2023-09-29 17:27:08,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=5.00 vs. limit=15.0 2023-09-29 17:27:11,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:27:12,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:14,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:14,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:27:15,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 17:27:17,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:27:20,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 17:27:22,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:22,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:27:23,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-09-29 17:27:26,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=436833.3333333333, ans=0.2 2023-09-29 17:27:28,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:27:28,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 17:27:29,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:32,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:27:37,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:27:40,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:27:42,027 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.899e+02 2.041e+02 2.421e+02 3.023e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-29 17:27:43,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:27:45,432 INFO [train.py:1039] (3/4) Epoch 13, batch 1800, loss[loss=0.1819, simple_loss=0.268, pruned_loss=0.04789, over 24442.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2651, pruned_loss=0.06091, over 4716939.77 frames. ], batch size: 69, lr: 8.03e-03, grad_scale: 32.0 2023-09-29 17:27:45,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 17:27:45,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:46,569 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.58 vs. limit=15.0 2023-09-29 17:27:48,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:27:48,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:27:48,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:27:48,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:27:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:27:51,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:27:52,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:27:55,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:27:56,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:27:59,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:28:01,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:28:04,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:08,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:08,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:28:12,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:28:12,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 17:28:13,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:18,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:22,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 17:28:23,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 17:28:23,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 17:28:23,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=437100.0, ans=0.2 2023-09-29 17:28:25,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:28:26,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:28:28,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:28:32,214 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.41 vs. limit=15.0 2023-09-29 17:28:32,924 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 17:28:34,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:28:38,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:28:38,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 17:28:39,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 17:28:41,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:28:43,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:28:45,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:28:46,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.78 vs. limit=22.5 2023-09-29 17:28:48,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 17:28:50,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=437233.3333333333, ans=0.125 2023-09-29 17:28:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:28:56,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 17:28:56,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:28:56,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:28:56,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:28:58,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 17:29:01,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:29:01,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:03,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 17:29:03,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:04,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:04,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:29:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,689 INFO [train.py:1039] (3/4) Epoch 13, batch 1850, loss[loss=0.2154, simple_loss=0.2775, pruned_loss=0.07662, over 23621.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2658, pruned_loss=0.06155, over 4708159.66 frames. ], batch size: 256, lr: 8.03e-03, grad_scale: 16.0 2023-09-29 17:29:07,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:29:07,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:29:09,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:29:10,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:29:14,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:29:14,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:29:22,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:29:23,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 17:29:25,362 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-09-29 17:29:26,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 17:29:29,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 17:29:32,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:29:34,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 17:29:34,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 17:29:36,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.98 vs. limit=12.0 2023-09-29 17:29:41,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.09 vs. limit=10.0 2023-09-29 17:29:43,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:29:47,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 17:29:48,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:29:49,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:29:52,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 17:29:52,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:29:54,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:29:54,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:29:54,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=437433.3333333333, ans=0.07 2023-09-29 17:29:58,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:29:59,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.07 vs. limit=15.0 2023-09-29 17:30:01,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:01,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=437500.0, ans=0.125 2023-09-29 17:30:05,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:30:05,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:05,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:30:05,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:08,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:10,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:30:13,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 17:30:13,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:30:16,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:30:17,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:30:17,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 17:30:17,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 17:30:19,385 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 17:30:21,352 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 17:30:23,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=437566.6666666667, ans=0.125 2023-09-29 17:30:24,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:30:24,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:30:24,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:24,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:25,099 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 17:30:25,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:30:27,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:28,476 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.931e+02 2.222e+02 2.775e+02 3.962e+02, threshold=4.445e+02, percent-clipped=0.0 2023-09-29 17:30:28,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:30:28,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:30:30,312 INFO [train.py:1039] (3/4) Epoch 13, batch 1900, loss[loss=0.1936, simple_loss=0.2744, pruned_loss=0.05642, over 24663.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2665, pruned_loss=0.06165, over 4717067.31 frames. ], batch size: 65, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:30:30,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:30:30,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 17:30:33,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:30:33,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 17:30:33,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:30:35,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:36,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=437633.3333333333, ans=0.125 2023-09-29 17:30:40,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:30:43,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:30:43,744 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 17:30:45,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 17:30:45,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:30:46,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:30:46,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 17:30:48,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 17:30:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 17:30:53,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:30:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 17:31:00,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 17:31:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 17:31:13,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 17:31:14,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:31:14,902 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 17:31:14,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 17:31:14,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 17:31:16,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 17:31:16,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:31:18,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.60 vs. limit=15.0 2023-09-29 17:31:21,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 17:31:25,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:31:27,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:27,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 17:31:28,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:31:35,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 17:31:36,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:41,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:31:41,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:31:41,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:31:42,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:31:46,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:31:46,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:31:46,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:31:46,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=437900.0, ans=0.0 2023-09-29 17:31:49,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:49,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:31:52,147 INFO [train.py:1039] (3/4) Epoch 13, batch 1950, loss[loss=0.2391, simple_loss=0.2889, pruned_loss=0.09469, over 23876.00 frames. ], tot_loss[loss=0.1949, simple_loss=0.2667, pruned_loss=0.06158, over 4728715.11 frames. ], batch size: 195, lr: 8.02e-03, grad_scale: 8.0 2023-09-29 17:31:52,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:31:52,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:31:52,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:31:53,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:31:55,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:00,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:32:00,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:00,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:32:01,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 17:32:04,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:32:04,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:06,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:08,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:32:10,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:10,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:12,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:16,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:32:16,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:32:17,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:32:17,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:20,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:24,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:32:24,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:24,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:32:24,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 17:32:24,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:32:24,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:32:25,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:32:29,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:32:30,037 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.32 vs. limit=22.5 2023-09-29 17:32:30,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:32:35,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:32:38,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:32:38,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:32:40,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 17:32:40,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:32:40,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=438166.6666666667, ans=0.0 2023-09-29 17:32:45,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:32:47,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:32:47,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=438166.6666666667, ans=0.125 2023-09-29 17:32:48,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:32:49,107 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:32:56,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:57,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:32:59,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:00,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:03,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:33:03,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:33:04,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=438233.3333333333, ans=0.05 2023-09-29 17:33:05,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 17:33:05,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:33:06,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:33:07,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=438233.3333333333, ans=0.125 2023-09-29 17:33:08,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 17:33:11,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:13,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=438300.0, ans=0.125 2023-09-29 17:33:14,893 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.616e+02 2.051e+02 2.174e+02 2.503e+02 4.017e+02, threshold=4.347e+02, percent-clipped=0.0 2023-09-29 17:33:14,936 INFO [train.py:1039] (3/4) Epoch 13, batch 2000, loss[loss=0.1972, simple_loss=0.2811, pruned_loss=0.05668, over 24395.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2679, pruned_loss=0.06211, over 4722357.59 frames. ], batch size: 77, lr: 8.02e-03, grad_scale: 16.0 2023-09-29 17:33:15,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:33:17,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:33:17,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:33:18,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:33:21,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:33:24,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 17:33:25,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:33:28,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:33:28,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=438300.0, ans=0.0 2023-09-29 17:33:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 17:33:33,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:33:33,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:33:36,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:33:37,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 17:33:39,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=438366.6666666667, ans=0.125 2023-09-29 17:33:40,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:40,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=438366.6666666667, ans=0.0 2023-09-29 17:33:43,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:43,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:44,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 17:33:45,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:33:46,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 17:33:46,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:33:50,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:33:52,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 17:33:52,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:33:54,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:33:54,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:33:56,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 17:34:00,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 17:34:00,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:34:00,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:04,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=438500.0, ans=0.1 2023-09-29 17:34:06,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:07,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:34:07,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:07,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:34:09,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:10,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:10,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:34:10,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:12,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:15,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:34:15,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 17:34:19,339 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.50 vs. limit=12.0 2023-09-29 17:34:21,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:34:23,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:27,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:34:29,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=438566.6666666667, ans=0.125 2023-09-29 17:34:31,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=438566.6666666667, ans=0.1 2023-09-29 17:34:33,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:35,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:35,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:36,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:34:36,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:34:36,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=438633.3333333333, ans=0.125 2023-09-29 17:34:37,992 INFO [train.py:1039] (3/4) Epoch 13, batch 2050, loss[loss=0.2028, simple_loss=0.2708, pruned_loss=0.06737, over 23606.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2672, pruned_loss=0.06209, over 4711303.68 frames. ], batch size: 106, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:34:38,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:34:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:42,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:34:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:48,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=438633.3333333333, ans=0.2 2023-09-29 17:34:49,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:34:52,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:34:52,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:34:53,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:34:54,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 17:34:54,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:34:55,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:34:55,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:35:03,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=438700.0, ans=0.125 2023-09-29 17:35:09,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:09,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:11,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 17:35:13,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:35:14,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 17:35:14,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:35:19,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:20,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:22,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:35:22,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:35:24,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:35:25,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:35:25,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:35:30,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:35:32,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:35:35,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:35:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:35:39,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:35:41,953 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.53 vs. limit=15.0 2023-09-29 17:35:44,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:35:46,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 17:35:50,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:35:52,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:35:52,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=438900.0, ans=0.125 2023-09-29 17:35:53,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:35:55,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 17:35:59,782 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.857e+02 2.010e+02 2.339e+02 3.458e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-29 17:35:59,845 INFO [train.py:1039] (3/4) Epoch 13, batch 2100, loss[loss=0.1804, simple_loss=0.2375, pruned_loss=0.06161, over 22691.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.2648, pruned_loss=0.06138, over 4691300.30 frames. ], batch size: 322, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:36:00,003 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 17:36:00,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:01,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:01,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:03,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:36:03,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 17:36:03,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 17:36:05,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:36:08,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:36:09,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:36:11,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:11,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:36:11,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 17:36:12,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=438966.6666666667, ans=0.125 2023-09-29 17:36:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:36:13,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 17:36:13,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 17:36:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:17,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:36:17,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 17:36:17,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 17:36:17,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=439033.3333333333, ans=0.125 2023-09-29 17:36:23,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 17:36:23,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:36:27,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:36:27,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:36:30,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:36:31,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 17:36:32,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:32,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 17:36:34,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 17:36:35,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:35,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 17:36:37,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 17:36:37,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 17:36:39,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:36:42,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:36:44,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:44,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 17:36:47,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:49,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:49,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 17:36:49,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:36:51,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:36:52,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:36:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 17:36:53,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 17:36:54,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 17:36:58,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:36:58,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=439166.6666666667, ans=0.125 2023-09-29 17:37:00,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=439166.6666666667, ans=0.125 2023-09-29 17:37:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:37:03,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 17:37:06,000 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-09-29 17:37:07,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:09,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:37:10,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:10,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:10,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 17:37:10,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:37:13,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:13,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:37:14,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:37:14,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:16,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 17:37:18,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 17:37:18,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:21,813 INFO [train.py:1039] (3/4) Epoch 13, batch 2150, loss[loss=0.1791, simple_loss=0.2605, pruned_loss=0.04889, over 24434.00 frames. ], tot_loss[loss=0.1927, simple_loss=0.2636, pruned_loss=0.06093, over 4685970.40 frames. ], batch size: 66, lr: 8.01e-03, grad_scale: 16.0 2023-09-29 17:37:22,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:37:22,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:37:23,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:37:23,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:37:30,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 17:37:31,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:33,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:34,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:37:34,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:36,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:37:39,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:37:41,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:37:41,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:37:42,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:43,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 17:37:48,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:37:49,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=439366.6666666667, ans=0.0 2023-09-29 17:37:50,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:37:52,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:52,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:37:52,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:37:53,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:37:53,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:37:53,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:37:55,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:37:56,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 17:37:58,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:37:58,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=439433.3333333333, ans=0.95 2023-09-29 17:37:59,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:00,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:01,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:38:01,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:38:05,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:06,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:38:06,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:06,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 17:38:06,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 17:38:11,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:12,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:13,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:38:14,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:38:15,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:16,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 17:38:18,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 17:38:19,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:38:20,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 17:38:20,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=439500.0, ans=0.2 2023-09-29 17:38:21,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:21,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:38:23,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 17:38:23,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:38:23,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 17:38:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 17:38:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 17:38:23,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 17:38:25,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:26,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:38:26,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:38:28,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:29,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 17:38:31,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:31,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:38:41,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 17:38:41,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=439566.6666666667, ans=0.0 2023-09-29 17:38:41,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=439566.6666666667, ans=0.1 2023-09-29 17:38:44,268 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.622e+02 1.873e+02 2.053e+02 2.392e+02 4.399e+02, threshold=4.106e+02, percent-clipped=1.0 2023-09-29 17:38:44,310 INFO [train.py:1039] (3/4) Epoch 13, batch 2200, loss[loss=0.1902, simple_loss=0.2756, pruned_loss=0.05247, over 24559.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2641, pruned_loss=0.06092, over 4689886.96 frames. ], batch size: 71, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:38:44,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:38:49,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:38:50,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:38:52,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:38:52,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:38:57,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.23 vs. limit=15.0 2023-09-29 17:38:57,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:38:57,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:38:57,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 17:39:03,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 17:39:03,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:39:08,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 17:39:11,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:12,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:13,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:39:14,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.69 vs. limit=15.0 2023-09-29 17:39:19,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:39:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 17:39:23,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:39:26,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:39:28,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 17:39:31,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:39:32,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=439833.3333333333, ans=0.1 2023-09-29 17:39:33,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:35,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:39:37,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:38,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 17:39:40,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:41,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 17:39:44,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:44,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 17:39:44,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:39:48,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:39:48,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:39:48,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:39:48,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:39:49,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:39:51,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:39:54,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 17:39:54,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:39:58,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:39:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 17:39:58,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=439900.0, ans=0.0 2023-09-29 17:40:00,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=439900.0, ans=0.1 2023-09-29 17:40:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:40:01,926 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 17:40:02,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:40:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 17:40:05,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:05,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:40:07,072 INFO [train.py:1039] (3/4) Epoch 13, batch 2250, loss[loss=0.1862, simple_loss=0.2597, pruned_loss=0.05631, over 23279.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2647, pruned_loss=0.06078, over 4706888.57 frames. ], batch size: 119, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:40:08,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:08,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 17:40:11,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:40:13,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:19,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:40:21,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:40:25,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:27,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:27,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:40:28,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 17:40:30,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:40:30,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:40:33,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 17:40:34,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:40:34,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:37,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:40:43,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:40:45,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 17:40:45,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:40:46,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 17:40:48,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:40:49,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:40:55,021 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.85 vs. limit=15.0 2023-09-29 17:40:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:57,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:40:58,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:40:58,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:41:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:41:02,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:41:02,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=440166.6666666667, ans=0.2 2023-09-29 17:41:06,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:41:07,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 17:41:08,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=440166.6666666667, ans=0.025 2023-09-29 17:41:15,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:41:15,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=440233.3333333333, ans=0.125 2023-09-29 17:41:16,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:41:17,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=440233.3333333333, ans=0.125 2023-09-29 17:41:18,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:41:23,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:41:25,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=440233.3333333333, ans=0.125 2023-09-29 17:41:26,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:41:26,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 17:41:26,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:26,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:41:28,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 17:41:29,316 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.929e+02 2.161e+02 2.428e+02 3.244e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-29 17:41:29,359 INFO [train.py:1039] (3/4) Epoch 13, batch 2300, loss[loss=0.2745, simple_loss=0.3211, pruned_loss=0.1139, over 19399.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2659, pruned_loss=0.06126, over 4695300.79 frames. ], batch size: 388, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:41:32,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:41:33,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:37,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=440300.0, ans=0.5 2023-09-29 17:41:38,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:41:38,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:41:42,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 17:41:43,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:41:52,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:41:54,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:41:54,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:41:54,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 17:41:55,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:41:58,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:41:59,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:42:02,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:42:03,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=440433.3333333333, ans=0.0 2023-09-29 17:42:06,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:42:08,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:42:13,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:42:15,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=440433.3333333333, ans=0.125 2023-09-29 17:42:16,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:42:16,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=440500.0, ans=0.125 2023-09-29 17:42:19,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:42:23,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:42:23,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:42:25,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:42:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 17:42:30,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:42:30,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:31,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:42:32,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:42:33,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:34,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 17:42:34,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 17:42:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 17:42:35,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:42:35,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:42:35,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 17:42:41,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:42:44,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:42:49,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:42:49,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:42:51,352 INFO [train.py:1039] (3/4) Epoch 13, batch 2350, loss[loss=0.176, simple_loss=0.2464, pruned_loss=0.05282, over 21994.00 frames. ], tot_loss[loss=0.1957, simple_loss=0.2668, pruned_loss=0.06229, over 4675159.00 frames. ], batch size: 48, lr: 8.00e-03, grad_scale: 16.0 2023-09-29 17:42:51,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:42:51,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:42:51,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:42:53,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:42:53,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 17:43:00,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:00,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 17:43:07,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 17:43:08,392 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.14 vs. limit=10.0 2023-09-29 17:43:10,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:43:13,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:14,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:14,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:15,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 17:43:18,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:43:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 17:43:26,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:43:30,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:43:30,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:43:33,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:43:35,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 17:43:36,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:43:38,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:43:39,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:43:39,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:43:42,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:43:43,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 17:43:45,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:43:45,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=440833.3333333333, ans=0.125 2023-09-29 17:43:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:43:46,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:43:47,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=440833.3333333333, ans=0.125 2023-09-29 17:43:49,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 17:43:49,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:43:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 17:43:54,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:43:55,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=440900.0, ans=0.125 2023-09-29 17:43:59,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 17:44:04,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 17:44:04,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:44:04,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 17:44:05,815 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 17:44:05,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 17:44:07,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 17:44:10,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:44:14,661 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.807e+02 2.059e+02 2.357e+02 3.650e+02, threshold=4.118e+02, percent-clipped=0.0 2023-09-29 17:44:14,730 INFO [train.py:1039] (3/4) Epoch 13, batch 2400, loss[loss=0.1853, simple_loss=0.2679, pruned_loss=0.05136, over 24541.00 frames. ], tot_loss[loss=0.195, simple_loss=0.2665, pruned_loss=0.06177, over 4677649.54 frames. ], batch size: 66, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:44:14,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:44:17,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:44:20,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:44:21,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 17:44:21,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 17:44:24,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=440966.6666666667, ans=0.1 2023-09-29 17:44:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:44:30,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:44:32,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 17:44:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:44:32,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:33,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 17:44:40,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:42,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 17:44:47,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:44:50,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 17:44:51,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=441100.0, ans=0.125 2023-09-29 17:44:53,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:44:54,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=441100.0, ans=0.125 2023-09-29 17:44:55,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:44:59,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:44:59,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 17:44:59,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 17:45:08,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:12,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:12,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=441166.6666666667, ans=0.0 2023-09-29 17:45:14,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:16,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:45:17,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 17:45:17,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:45:17,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:17,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:19,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 17:45:19,899 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.81 vs. limit=15.0 2023-09-29 17:45:23,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:45:24,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 17:45:24,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 17:45:25,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 17:45:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:45:26,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:45:26,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 17:45:28,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 17:45:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 17:45:29,858 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 17:45:31,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 17:45:32,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:45:34,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:34,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:35,885 INFO [train.py:1039] (3/4) Epoch 13, batch 2450, loss[loss=0.1814, simple_loss=0.2645, pruned_loss=0.04911, over 24488.00 frames. ], tot_loss[loss=0.1942, simple_loss=0.2655, pruned_loss=0.0615, over 4698910.76 frames. ], batch size: 66, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:45:36,019 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 17:45:36,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:45:37,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:45:42,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:45:42,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:45:48,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:48,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:45:50,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 17:45:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:45:56,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:45:56,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=441366.6666666667, ans=0.1 2023-09-29 17:45:59,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:45:59,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:45:59,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:45:59,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 17:46:00,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=441366.6666666667, ans=0.125 2023-09-29 17:46:04,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:05,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:46:07,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:46:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 17:46:10,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:11,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys.whitening_limit, batch_count=441433.3333333333, ans=6.0 2023-09-29 17:46:11,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:12,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:46:13,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 17:46:15,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:46:17,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:20,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=441433.3333333333, ans=0.125 2023-09-29 17:46:23,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:24,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:46:25,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:25,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:46:25,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:26,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:46:28,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 17:46:29,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=441500.0, ans=0.0 2023-09-29 17:46:31,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 17:46:31,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:46:35,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:46:35,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:46:40,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:46:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 17:46:42,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:46:42,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=441566.6666666667, ans=0.125 2023-09-29 17:46:43,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:46:45,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 17:46:45,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:46:46,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:46:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:46:53,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:46:53,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:46:58,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 17:46:59,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.565e+02 1.935e+02 2.161e+02 2.595e+02 3.888e+02, threshold=4.322e+02, percent-clipped=0.0 2023-09-29 17:46:59,992 INFO [train.py:1039] (3/4) Epoch 13, batch 2500, loss[loss=0.1784, simple_loss=0.2502, pruned_loss=0.05335, over 24387.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2641, pruned_loss=0.06098, over 4698531.98 frames. ], batch size: 58, lr: 7.99e-03, grad_scale: 32.0 2023-09-29 17:47:00,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 17:47:06,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:12,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=441633.3333333333, ans=0.125 2023-09-29 17:47:15,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:47:15,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:47:17,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:47:17,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 17:47:18,344 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.08 vs. limit=12.0 2023-09-29 17:47:25,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:47:25,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:47:27,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:47:29,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 17:47:29,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 17:47:29,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:32,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 17:47:32,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:32,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 17:47:33,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:37,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=441766.6666666667, ans=0.025 2023-09-29 17:47:38,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:47:39,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:47:41,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:47:41,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 17:47:41,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:47:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:47:47,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:51,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:47:53,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=441833.3333333333, ans=0.125 2023-09-29 17:47:54,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:00,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 17:48:05,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 17:48:05,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:05,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:06,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:48:06,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:48:08,449 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 17:48:08,449 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 17:48:08,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 17:48:12,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:15,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 17:48:17,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 17:48:17,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:48:18,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 17:48:21,496 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.46 vs. limit=15.0 2023-09-29 17:48:22,041 INFO [train.py:1039] (3/4) Epoch 13, batch 2550, loss[loss=0.2109, simple_loss=0.2842, pruned_loss=0.06874, over 24330.00 frames. ], tot_loss[loss=0.1929, simple_loss=0.2643, pruned_loss=0.0608, over 4698626.34 frames. ], batch size: 77, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:48:22,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 17:48:25,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:26,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:48:26,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:48:28,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:48:30,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 17:48:30,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:48:36,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 17:48:39,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:48:43,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:43,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:48:43,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 17:48:44,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:48:44,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:48:44,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:48:47,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:48:48,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 17:48:48,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 17:48:48,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:48:48,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 17:48:51,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=442033.3333333333, ans=0.125 2023-09-29 17:49:00,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:49:04,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=442100.0, ans=0.125 2023-09-29 17:49:05,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:05,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:05,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:49:07,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 17:49:09,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=442100.0, ans=0.125 2023-09-29 17:49:12,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:49:12,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=442166.6666666667, ans=0.05 2023-09-29 17:49:16,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 17:49:17,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:49:17,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:49:17,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 17:49:19,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:49:19,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=442166.6666666667, ans=0.5 2023-09-29 17:49:22,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:23,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:25,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=442166.6666666667, ans=0.125 2023-09-29 17:49:26,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:49:26,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 17:49:26,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:49:28,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:49:29,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 17:49:31,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:49:32,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:38,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:49:41,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:49:43,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=442300.0, ans=0.0 2023-09-29 17:49:44,605 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.935e+02 2.262e+02 2.614e+02 3.523e+02, threshold=4.524e+02, percent-clipped=0.0 2023-09-29 17:49:44,658 INFO [train.py:1039] (3/4) Epoch 13, batch 2600, loss[loss=0.2061, simple_loss=0.271, pruned_loss=0.07059, over 23452.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.265, pruned_loss=0.06093, over 4711776.72 frames. ], batch size: 285, lr: 7.98e-03, grad_scale: 32.0 2023-09-29 17:49:46,283 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 17:49:46,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=442300.0, ans=0.125 2023-09-29 17:49:49,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 17:49:50,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:49:50,831 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 17:49:50,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 17:49:50,976 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 17:49:54,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:49:56,155 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 17:49:56,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 17:49:57,738 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 17:49:59,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:50:02,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 17:50:02,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 17:50:05,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 17:50:05,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 17:50:08,502 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 17:50:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 17:50:17,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:17,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:19,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:19,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 17:50:21,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 17:50:24,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=442433.3333333333, ans=0.0 2023-09-29 17:50:27,107 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 17:50:32,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:50:35,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 17:50:35,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=442500.0, ans=0.2 2023-09-29 17:50:36,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:36,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:50:36,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 17:50:37,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=442500.0, ans=0.125 2023-09-29 17:50:38,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:50:38,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:50:41,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,878 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 17:50:45,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:50:45,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 17:50:50,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:50:52,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 17:50:52,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 17:50:54,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:50:56,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:50:57,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:03,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 17:51:04,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:07,434 INFO [train.py:1039] (3/4) Epoch 13, batch 2650, loss[loss=0.2249, simple_loss=0.2855, pruned_loss=0.0822, over 23705.00 frames. ], tot_loss[loss=0.1951, simple_loss=0.2664, pruned_loss=0.06185, over 4715960.37 frames. ], batch size: 232, lr: 7.98e-03, grad_scale: 16.0 2023-09-29 17:51:07,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:51:10,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 17:51:10,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:11,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=442633.3333333333, ans=0.015 2023-09-29 17:51:12,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 17:51:12,554 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 17:51:12,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:15,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:51:19,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:51:20,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:51:24,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:51:25,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 17:51:25,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:51:25,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:51:27,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 17:51:29,846 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 17:51:32,180 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.44 vs. limit=15.0 2023-09-29 17:51:32,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:51:35,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 17:51:35,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:51:37,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 17:51:42,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 17:51:42,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:42,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:47,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 17:51:47,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 17:51:49,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=442766.6666666667, ans=0.0 2023-09-29 17:51:50,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:51:55,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 17:51:55,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:51:55,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:51:57,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:51:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:51:59,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:00,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:52:01,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=442833.3333333333, ans=0.1 2023-09-29 17:52:03,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:04,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:52:06,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:52:07,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:52:07,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:09,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:52:11,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:52:12,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 17:52:16,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:16,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:52:16,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:18,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 17:52:19,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:52:20,193 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:52:22,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:24,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:25,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 17:52:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:29,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:52:29,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 17:52:31,152 INFO [train.py:1039] (3/4) Epoch 13, batch 2700, loss[loss=0.1966, simple_loss=0.2743, pruned_loss=0.05947, over 24001.00 frames. ], tot_loss[loss=0.1956, simple_loss=0.2668, pruned_loss=0.0622, over 4707302.90 frames. ], batch size: 80, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:52:32,545 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.954e+02 2.253e+02 2.566e+02 4.959e+02, threshold=4.505e+02, percent-clipped=1.0 2023-09-29 17:52:32,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:52:33,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.57 vs. limit=10.0 2023-09-29 17:52:36,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 17:52:38,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:52:38,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:52:39,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:52:39,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:52:39,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 17:52:40,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 17:52:40,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 17:52:41,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:52:41,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:52:43,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:52:44,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:52:48,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 17:52:50,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 17:52:50,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:52:51,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=443033.3333333333, ans=0.1 2023-09-29 17:52:56,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 17:52:56,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:52:59,965 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 17:53:01,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:53:01,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:53:01,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:53:01,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:53:03,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:07,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:07,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:53:09,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:53:10,324 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.03 vs. limit=6.0 2023-09-29 17:53:12,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:12,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 17:53:19,039 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=443100.0, ans=0.95 2023-09-29 17:53:23,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:53:23,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:53:27,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 17:53:27,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:31,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:33,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:33,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:53:34,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:36,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:53:36,473 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=443233.3333333333, ans=0.0 2023-09-29 17:53:38,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:53:39,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:53:42,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:42,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:53:46,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 17:53:46,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:48,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:53:48,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 17:53:49,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 17:53:51,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:53:53,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:53:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:53:54,523 INFO [train.py:1039] (3/4) Epoch 13, batch 2750, loss[loss=0.2019, simple_loss=0.2907, pruned_loss=0.0566, over 24373.00 frames. ], tot_loss[loss=0.1941, simple_loss=0.2657, pruned_loss=0.06129, over 4718941.80 frames. ], batch size: 77, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:53:57,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:53:57,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 17:53:57,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:01,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 17:54:01,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:54:01,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:01,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 17:54:01,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:54:02,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:54:08,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 17:54:11,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:54:11,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=443366.6666666667, ans=0.0 2023-09-29 17:54:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:13,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:14,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 17:54:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:54:16,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:54:17,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:18,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:21,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 17:54:21,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 17:54:23,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 17:54:23,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:23,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=443366.6666666667, ans=0.0 2023-09-29 17:54:25,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=443433.3333333333, ans=0.125 2023-09-29 17:54:26,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:54:29,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=443433.3333333333, ans=0.0 2023-09-29 17:54:35,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:54:37,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 17:54:37,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:37,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=443433.3333333333, ans=0.1 2023-09-29 17:54:42,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:54:42,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:54:43,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 17:54:51,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 17:54:51,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 17:54:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 17:54:57,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:54:58,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 17:54:59,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=443566.6666666667, ans=0.07 2023-09-29 17:54:59,754 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.62 vs. limit=15.0 2023-09-29 17:55:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 17:55:06,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:55:06,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 17:55:07,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:09,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 17:55:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 17:55:10,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:55:14,471 INFO [train.py:1039] (3/4) Epoch 13, batch 2800, loss[loss=0.1952, simple_loss=0.2683, pruned_loss=0.06104, over 23313.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2647, pruned_loss=0.06102, over 4714430.94 frames. ], batch size: 93, lr: 7.97e-03, grad_scale: 32.0 2023-09-29 17:55:14,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 17:55:15,771 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.948e+02 2.222e+02 2.625e+02 4.530e+02, threshold=4.443e+02, percent-clipped=1.0 2023-09-29 17:55:15,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:15,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:55:17,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 17:55:17,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:17,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:17,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=443633.3333333333, ans=0.0 2023-09-29 17:55:21,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:21,085 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 17:55:21,086 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 17:55:23,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:55:26,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:55:30,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:55:33,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 17:55:35,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 17:55:36,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 17:55:36,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:38,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:55:38,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:55:43,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:55:43,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:55:43,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:55:44,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:55:52,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=7.16 vs. limit=15.0 2023-09-29 17:55:53,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:55:54,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=443766.6666666667, ans=0.125 2023-09-29 17:55:56,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:55:58,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:55:59,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:55:59,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:05,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:06,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 17:56:06,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:08,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:08,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 17:56:11,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:12,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:17,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:56:17,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=443833.3333333333, ans=0.1 2023-09-29 17:56:18,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:56:18,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:18,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 17:56:19,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=443900.0, ans=0.0 2023-09-29 17:56:20,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 17:56:20,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 17:56:22,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:56:22,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 17:56:22,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:24,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:56:24,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:56:25,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 17:56:27,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:27,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:56:29,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:56:29,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=443900.0, ans=0.125 2023-09-29 17:56:30,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 17:56:33,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=443900.0, ans=0.125 2023-09-29 17:56:34,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=443900.0, ans=10.0 2023-09-29 17:56:37,555 INFO [train.py:1039] (3/4) Epoch 13, batch 2850, loss[loss=0.1906, simple_loss=0.254, pruned_loss=0.06359, over 23693.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2642, pruned_loss=0.06088, over 4715937.96 frames. ], batch size: 232, lr: 7.97e-03, grad_scale: 16.0 2023-09-29 17:56:37,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 17:56:37,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 17:56:39,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:56:41,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:56:44,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:56:45,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:56:46,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:56:50,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:56:50,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:56:52,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:56:52,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 17:57:00,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 17:57:00,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:00,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=444033.3333333333, ans=0.2 2023-09-29 17:57:01,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.36 vs. limit=10.0 2023-09-29 17:57:02,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 17:57:02,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:02,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=444033.3333333333, ans=0.125 2023-09-29 17:57:04,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 17:57:05,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 17:57:07,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:20,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:22,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:22,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 17:57:23,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 17:57:23,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 17:57:23,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 17:57:25,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 17:57:25,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 17:57:27,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 17:57:27,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:57:28,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:57:28,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=444166.6666666667, ans=0.125 2023-09-29 17:57:28,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=444166.6666666667, ans=0.05 2023-09-29 17:57:30,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:33,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:33,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:57:33,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=444166.6666666667, ans=0.125 2023-09-29 17:57:37,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:38,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 17:57:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:57:40,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:57:42,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:45,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 17:57:48,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:57:50,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 17:57:50,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 17:57:52,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 17:57:53,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:53,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 17:57:53,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=444233.3333333333, ans=0.125 2023-09-29 17:57:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 17:57:55,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:57:55,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:57:55,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 17:57:55,183 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 17:57:56,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 17:57:56,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:57:56,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:57:59,679 INFO [train.py:1039] (3/4) Epoch 13, batch 2900, loss[loss=0.2035, simple_loss=0.2622, pruned_loss=0.07241, over 23573.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2644, pruned_loss=0.0609, over 4710959.08 frames. ], batch size: 256, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:58:01,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:02,696 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.578e+02 1.932e+02 2.253e+02 2.547e+02 3.848e+02, threshold=4.506e+02, percent-clipped=0.0 2023-09-29 17:58:02,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:02,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:04,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 17:58:09,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:10,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 17:58:11,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 17:58:13,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 17:58:13,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:58:16,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:17,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 17:58:20,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 17:58:21,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:58:24,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 17:58:24,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 17:58:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 17:58:26,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:29,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 17:58:29,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=444366.6666666667, ans=0.125 2023-09-29 17:58:30,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 17:58:33,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:58:33,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 17:58:34,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:58:37,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:58:37,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 17:58:42,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 17:58:42,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:58:46,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:58:47,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:48,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=444500.0, ans=0.0 2023-09-29 17:58:49,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:58:49,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:53,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 17:58:53,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 17:58:53,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 17:58:55,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=444500.0, ans=0.125 2023-09-29 17:58:56,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 17:58:59,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 17:58:59,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 17:59:04,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.63 vs. limit=22.5 2023-09-29 17:59:05,095 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.90 vs. limit=6.0 2023-09-29 17:59:05,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 17:59:13,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 17:59:13,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 17:59:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 17:59:18,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:18,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 17:59:20,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:21,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=444633.3333333333, ans=10.0 2023-09-29 17:59:21,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=444633.3333333333, ans=0.04949747468305833 2023-09-29 17:59:22,119 INFO [train.py:1039] (3/4) Epoch 13, batch 2950, loss[loss=0.2035, simple_loss=0.2825, pruned_loss=0.06226, over 24364.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2647, pruned_loss=0.06048, over 4724093.24 frames. ], batch size: 77, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 17:59:22,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 17:59:29,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 17:59:30,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 17:59:31,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:31,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:33,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 17:59:34,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 17:59:35,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 17:59:37,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 17:59:37,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 17:59:37,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 17:59:42,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 17:59:43,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 17:59:45,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 17:59:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 17:59:49,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 17:59:49,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 17:59:50,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 17:59:52,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 17:59:54,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 18:00:01,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 18:00:01,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 18:00:02,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:00:05,778 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 18:00:05,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 18:00:07,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:00:07,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:00:07,577 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 18:00:07,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:00:10,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 18:00:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:00:12,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:00:16,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:00:18,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:19,646 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 18:00:19,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:00:19,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 18:00:26,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:28,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:00:28,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 18:00:29,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:00:31,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 18:00:35,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:36,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=444900.0, ans=0.2 2023-09-29 18:00:36,615 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.41 vs. limit=12.0 2023-09-29 18:00:37,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:00:37,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=444900.0, ans=0.125 2023-09-29 18:00:38,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:00:38,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:00:39,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:00:40,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:00:40,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:00:40,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:00:42,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:00:42,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:00:43,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:00:45,246 INFO [train.py:1039] (3/4) Epoch 13, batch 3000, loss[loss=0.1806, simple_loss=0.2687, pruned_loss=0.04629, over 24645.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2654, pruned_loss=0.06081, over 4727758.89 frames. ], batch size: 73, lr: 7.96e-03, grad_scale: 16.0 2023-09-29 18:00:45,247 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 18:01:00,607 INFO [train.py:1071] (3/4) Epoch 13, validation: loss=0.3476, simple_loss=0.2869, pruned_loss=0.2041, over 1125622.00 frames. 2023-09-29 18:01:00,608 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 18:01:00,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:00,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 18:01:02,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:01:04,369 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.886e+02 2.154e+02 2.482e+02 3.380e+02, threshold=4.309e+02, percent-clipped=0.0 2023-09-29 18:01:04,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:06,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:01:09,640 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 18:01:09,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 18:01:11,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:01:11,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:01:12,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 18:01:14,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:20,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=445033.3333333333, ans=0.125 2023-09-29 18:01:21,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:01:25,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-09-29 18:01:30,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:01:40,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 18:01:41,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:01:44,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:01:45,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:01:45,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:01:47,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:01:47,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 18:01:49,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 18:01:50,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:01:50,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:01:53,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:01:53,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:01:55,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:01:55,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:01:59,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:01:59,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=445166.6666666667, ans=0.0 2023-09-29 18:02:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:02:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:02:02,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:02:03,166 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:02:04,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 18:02:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:02:05,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:07,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:02:09,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:09,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:11,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:02:11,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 18:02:12,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:02:12,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 18:02:13,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:02:16,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 18:02:19,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:02:20,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:02:21,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 18:02:22,432 INFO [train.py:1039] (3/4) Epoch 13, batch 3050, loss[loss=0.1831, simple_loss=0.2655, pruned_loss=0.05031, over 24656.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2661, pruned_loss=0.06088, over 4729076.80 frames. ], batch size: 68, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:02:23,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 18:02:23,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:02:24,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:02:25,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:02:25,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:02:26,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:27,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:02:28,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 18:02:30,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:02:33,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:33,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:02:37,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.64 vs. limit=15.0 2023-09-29 18:02:38,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:39,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 18:02:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 18:02:45,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 18:02:47,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:02:52,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:02:55,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:02:55,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:02:56,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:02:59,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:01,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:03:01,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:02,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:03:02,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:03,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:09,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:09,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 18:03:09,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:03:09,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:03:12,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:03:13,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:03:14,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:14,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:19,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:03:20,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:26,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:28,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:03:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:03:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:29,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:03:31,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:03:31,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 18:03:32,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:03:34,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:34,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 18:03:36,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:42,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:03:43,545 INFO [train.py:1039] (3/4) Epoch 13, batch 3100, loss[loss=0.1861, simple_loss=0.2667, pruned_loss=0.05276, over 24466.00 frames. ], tot_loss[loss=0.1955, simple_loss=0.2674, pruned_loss=0.06185, over 4718330.96 frames. ], batch size: 66, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:03:43,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:03:45,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:03:46,665 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.826e+02 2.024e+02 2.314e+02 3.606e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-29 18:03:47,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 18:03:50,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 18:03:51,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 18:03:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:03:56,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:03:57,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:03:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:04:02,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:07,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 18:04:07,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=445700.0, ans=0.0 2023-09-29 18:04:10,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=445700.0, ans=0.125 2023-09-29 18:04:13,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:04:13,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:13,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=445700.0, ans=0.0 2023-09-29 18:04:14,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:14,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:04:15,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:04:16,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:04:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 18:04:16,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:04:20,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:20,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 18:04:22,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:04:22,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=445766.6666666667, ans=0.125 2023-09-29 18:04:24,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=445766.6666666667, ans=0.0 2023-09-29 18:04:25,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:04:25,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=445766.6666666667, ans=0.125 2023-09-29 18:04:27,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 18:04:29,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 18:04:29,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:30,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:04:32,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=445833.3333333333, ans=0.0 2023-09-29 18:04:33,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:33,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:33,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:04:35,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:04:35,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:04:37,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:04:37,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:04:37,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:37,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:04:41,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:04:42,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 18:04:43,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:04:45,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 18:04:45,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:04:45,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:04:46,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 18:04:59,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 18:05:02,194 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:05:03,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:04,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:06,264 INFO [train.py:1039] (3/4) Epoch 13, batch 3150, loss[loss=0.2037, simple_loss=0.2677, pruned_loss=0.06987, over 23425.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2649, pruned_loss=0.06099, over 4714129.95 frames. ], batch size: 105, lr: 7.95e-03, grad_scale: 16.0 2023-09-29 18:05:06,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:05:06,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:05:08,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 18:05:09,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:09,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:05:11,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 18:05:14,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:15,778 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 18:05:16,561 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.57 vs. limit=15.0 2023-09-29 18:05:18,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 18:05:18,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:05:20,451 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 18:05:23,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:05:25,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 18:05:25,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 18:05:25,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 18:05:25,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:25,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:27,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:05:28,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 18:05:30,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:05:30,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:34,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:05:39,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 18:05:39,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:05:41,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:05:43,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:05:44,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 18:05:47,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 18:05:48,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:05:49,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:05:49,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:05:49,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:05:49,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:05:51,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:05:51,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:05:52,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 18:05:52,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:05:52,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:05:55,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:05:55,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:05:57,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 18:05:58,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:00,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 18:06:00,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:02,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 18:06:02,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=446166.6666666667, ans=0.125 2023-09-29 18:06:03,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 18:06:06,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:06:06,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:06,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 18:06:07,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 18:06:09,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:06:12,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:06:14,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:14,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:06:18,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:06:19,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:21,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 18:06:23,294 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=446233.3333333333, ans=0.1 2023-09-29 18:06:24,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:06:24,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 18:06:24,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=446233.3333333333, ans=0.0 2023-09-29 18:06:24,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=446233.3333333333, ans=0.0 2023-09-29 18:06:29,132 INFO [train.py:1039] (3/4) Epoch 13, batch 3200, loss[loss=0.1815, simple_loss=0.2479, pruned_loss=0.05755, over 23654.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2632, pruned_loss=0.06031, over 4712329.20 frames. ], batch size: 232, lr: 7.95e-03, grad_scale: 32.0 2023-09-29 18:06:29,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:30,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:06:30,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 18:06:32,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.906e+02 2.221e+02 2.638e+02 3.823e+02, threshold=4.442e+02, percent-clipped=0.0 2023-09-29 18:06:34,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:06:39,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:06:44,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:06:44,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=446366.6666666667, ans=0.0 2023-09-29 18:06:54,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=446366.6666666667, ans=0.125 2023-09-29 18:06:55,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:06:59,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=446366.6666666667, ans=0.1 2023-09-29 18:07:05,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 18:07:05,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:07:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 18:07:10,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:07:14,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:07:14,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:07:15,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:07:20,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 18:07:20,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 18:07:23,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 18:07:26,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 18:07:28,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=446500.0, ans=0.1 2023-09-29 18:07:29,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:07:36,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:36,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:07:36,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:07:37,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.55 vs. limit=12.0 2023-09-29 18:07:38,308 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 18:07:38,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:07:41,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:07:41,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=446566.6666666667, ans=0.125 2023-09-29 18:07:43,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 18:07:43,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 18:07:45,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 18:07:47,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 18:07:49,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:07:52,565 INFO [train.py:1039] (3/4) Epoch 13, batch 3250, loss[loss=0.1997, simple_loss=0.261, pruned_loss=0.06919, over 23814.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.264, pruned_loss=0.06085, over 4717906.30 frames. ], batch size: 212, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:07:52,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:07:52,738 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 18:07:52,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:07:52,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:07:54,247 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 18:07:57,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=446633.3333333333, ans=0.125 2023-09-29 18:07:58,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:08:02,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:09,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:08:09,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 18:08:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:12,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:08:12,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:13,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:14,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:08:17,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:08:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:17,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:19,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:08:23,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:24,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:08:26,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:26,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:08:28,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:08:28,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:08:28,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:08:33,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 18:08:34,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:08:34,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:08:36,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:08:36,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:08:38,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=446766.6666666667, ans=0.1 2023-09-29 18:08:39,165 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.53 vs. limit=15.0 2023-09-29 18:08:44,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:08:53,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:08:54,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:54,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 18:08:54,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:08:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:08:54,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:08:59,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 18:09:00,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 18:09:00,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:09:02,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:02,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=446900.0, ans=0.0 2023-09-29 18:09:03,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:03,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 18:09:03,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:09:06,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:06,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:08,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 18:09:08,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:11,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:09:11,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 18:09:14,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:09:14,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 18:09:16,072 INFO [train.py:1039] (3/4) Epoch 13, batch 3300, loss[loss=0.2094, simple_loss=0.2729, pruned_loss=0.07292, over 23896.00 frames. ], tot_loss[loss=0.194, simple_loss=0.2654, pruned_loss=0.06131, over 4718024.44 frames. ], batch size: 195, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:09:16,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 18:09:18,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 18:09:18,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:21,247 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.922e+02 2.153e+02 2.771e+02 4.428e+02, threshold=4.306e+02, percent-clipped=0.0 2023-09-29 18:09:22,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:09:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:09:24,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:26,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:09:27,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:09:29,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:31,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:09:35,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 18:09:37,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:09:37,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:09:40,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:40,310 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 18:09:41,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:09:43,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:09:43,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:09:43,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:09:43,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 18:09:47,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:09:49,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:09:52,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:52,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 18:09:54,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:09:54,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:09:55,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:09:57,542 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 18:09:59,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 18:09:59,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:09:59,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=447100.0, ans=0.2 2023-09-29 18:10:02,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 18:10:05,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:09,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:10:09,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:11,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:12,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:12,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:10:12,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:10:14,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=447166.6666666667, ans=0.0 2023-09-29 18:10:15,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:10:16,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:17,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:10:19,004 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 18:10:20,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 18:10:22,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:10:23,111 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.36 vs. limit=6.0 2023-09-29 18:10:23,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:10:23,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:10:25,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:25,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:10:25,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=447233.3333333333, ans=0.07 2023-09-29 18:10:27,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:27,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:10:28,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:10:31,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:10:34,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 18:10:34,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:36,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:36,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:10:36,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:10:38,619 INFO [train.py:1039] (3/4) Epoch 13, batch 3350, loss[loss=0.168, simple_loss=0.2493, pruned_loss=0.04335, over 24368.00 frames. ], tot_loss[loss=0.1935, simple_loss=0.2654, pruned_loss=0.06082, over 4725671.97 frames. ], batch size: 61, lr: 7.94e-03, grad_scale: 16.0 2023-09-29 18:10:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:41,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:10:41,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:46,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:10:50,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:10:51,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:10:53,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:10:55,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:10:56,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:10:58,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:10:59,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 18:11:01,190 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 18:11:02,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:11:03,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=447366.6666666667, ans=0.0 2023-09-29 18:11:04,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 18:11:04,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 18:11:06,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:11:06,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:11:06,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=447366.6666666667, ans=0.0 2023-09-29 18:11:07,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:08,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 18:11:08,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:09,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:11:09,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=447433.3333333333, ans=0.1 2023-09-29 18:11:11,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:12,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:14,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:14,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:11:18,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:21,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:21,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:26,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:11:28,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:11:29,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:31,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:33,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 18:11:33,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:11:33,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 18:11:34,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:11:34,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 18:11:36,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:11:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:11:44,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:11:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 18:11:46,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:11:48,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:11:50,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:11:52,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=447566.6666666667, ans=0.125 2023-09-29 18:11:57,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:11:58,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 18:12:00,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:12:01,679 INFO [train.py:1039] (3/4) Epoch 13, batch 3400, loss[loss=0.1718, simple_loss=0.2478, pruned_loss=0.04789, over 24594.00 frames. ], tot_loss[loss=0.1961, simple_loss=0.2675, pruned_loss=0.06235, over 4707675.87 frames. ], batch size: 60, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:12:01,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:12:03,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:03,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 18:12:03,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:03,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 18:12:06,368 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.641e+02 1.928e+02 2.132e+02 2.448e+02 3.305e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 18:12:06,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:06,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:12:08,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:12:08,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:12:08,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 18:12:13,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 18:12:13,312 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 18:12:13,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:18,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:12:18,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:12:20,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:20,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:12:25,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:28,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 18:12:32,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=447700.0, ans=0.0 2023-09-29 18:12:34,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:12:37,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:12:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:38,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 18:12:43,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:12:49,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 18:12:54,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:54,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:12:56,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 18:12:56,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:12:57,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:12:58,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:12:59,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:13:02,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:13:06,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:13:06,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:13:11,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:14,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 18:13:19,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:13:23,689 INFO [train.py:1039] (3/4) Epoch 13, batch 3450, loss[loss=0.2158, simple_loss=0.2882, pruned_loss=0.07168, over 23469.00 frames. ], tot_loss[loss=0.1958, simple_loss=0.2672, pruned_loss=0.06223, over 4710321.67 frames. ], batch size: 93, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:13:23,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 18:13:28,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 18:13:28,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:13:30,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:13:31,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 18:13:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:13:33,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=447966.6666666667, ans=22.5 2023-09-29 18:13:37,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:13:40,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:13:41,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:13:42,264 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:13:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:13:43,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:43,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=448033.3333333333, ans=0.0 2023-09-29 18:13:45,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:13:52,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 18:13:52,994 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:13:58,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 18:13:58,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:14:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:14:00,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:09,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 18:14:09,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:14:12,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:12,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:14:15,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:14:16,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:14:17,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=448166.6666666667, ans=0.2 2023-09-29 18:14:17,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=448166.6666666667, ans=0.125 2023-09-29 18:14:18,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=448166.6666666667, ans=0.035 2023-09-29 18:14:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 18:14:19,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:19,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:14:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:14:25,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 18:14:28,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:14:29,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=448233.3333333333, ans=0.125 2023-09-29 18:14:33,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:14:34,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:37,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:39,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=448233.3333333333, ans=0.125 2023-09-29 18:14:42,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:14:42,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:14:44,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:14:44,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:14:48,036 INFO [train.py:1039] (3/4) Epoch 13, batch 3500, loss[loss=0.205, simple_loss=0.2849, pruned_loss=0.06253, over 24065.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2655, pruned_loss=0.06204, over 4699568.16 frames. ], batch size: 80, lr: 7.93e-03, grad_scale: 16.0 2023-09-29 18:14:49,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:14:52,616 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.904e+02 2.170e+02 2.519e+02 3.488e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-29 18:14:52,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:14:54,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 18:14:55,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:14:59,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:15:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:15:02,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 18:15:08,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:15:09,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:15:10,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:15:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:10,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:15:11,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:12,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 18:15:15,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:15,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:15:17,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:20,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:20,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=448433.3333333333, ans=0.0 2023-09-29 18:15:22,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 18:15:22,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:15:25,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:15:27,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:15:28,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:30,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:15:31,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:33,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 18:15:33,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 18:15:34,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 18:15:34,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:15:37,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:37,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:15:37,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:15:41,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:15:41,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:15:45,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=448500.0, ans=0.0 2023-09-29 18:15:47,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:15:49,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 18:15:49,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 18:15:49,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:15:52,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:15:52,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:15:54,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:15:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 18:15:57,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:16:00,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:16:01,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 18:16:03,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 18:16:05,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:06,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:16:06,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:06,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:09,913 INFO [train.py:1039] (3/4) Epoch 13, batch 3550, loss[loss=0.2058, simple_loss=0.2697, pruned_loss=0.07099, over 23807.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2647, pruned_loss=0.06096, over 4716143.94 frames. ], batch size: 164, lr: 7.92e-03, grad_scale: 16.0 2023-09-29 18:16:10,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:16:20,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:22,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 18:16:26,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:16:27,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:16:29,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:29,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=448700.0, ans=0.95 2023-09-29 18:16:31,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:16:31,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:16:35,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:35,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:16:35,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:35,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:16:37,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:16:37,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=448700.0, ans=0.0 2023-09-29 18:16:37,942 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-09-29 18:16:43,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:16:43,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:16:45,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:16:45,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:16:45,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:16:46,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 18:16:46,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:49,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:16:50,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=448766.6666666667, ans=0.125 2023-09-29 18:16:51,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 18:16:57,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:16:58,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:17:00,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 18:17:02,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:17:02,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 18:17:03,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:17:05,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:17:05,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:17:07,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=448833.3333333333, ans=0.125 2023-09-29 18:17:08,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 18:17:10,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:17:16,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 18:17:18,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:21,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:17:23,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 18:17:30,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 18:17:30,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:17:30,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=448900.0, ans=0.0 2023-09-29 18:17:31,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:17:33,875 INFO [train.py:1039] (3/4) Epoch 13, batch 3600, loss[loss=0.2009, simple_loss=0.2647, pruned_loss=0.06854, over 23409.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.2634, pruned_loss=0.06077, over 4709813.82 frames. ], batch size: 285, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:17:35,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:17:37,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:17:39,123 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.817e+02 2.056e+02 2.414e+02 4.361e+02, threshold=4.112e+02, percent-clipped=1.0 2023-09-29 18:17:40,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:43,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:44,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:17:45,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:17:45,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:45,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 18:17:48,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:17:48,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:17:49,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=449033.3333333333, ans=0.0 2023-09-29 18:17:49,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=9.73 vs. limit=15.0 2023-09-29 18:17:51,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:17:55,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:17:56,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:17:56,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:17:58,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 18:17:58,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:18:01,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:18:01,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:18:03,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:03,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=449033.3333333333, ans=0.125 2023-09-29 18:18:07,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:18:07,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:08,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 18:18:16,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:18,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:18:18,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 18:18:23,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:18:25,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.61 vs. limit=15.0 2023-09-29 18:18:28,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:30,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=449166.6666666667, ans=0.125 2023-09-29 18:18:30,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=449166.6666666667, ans=0.0 2023-09-29 18:18:31,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:37,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:18:37,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:18:37,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 18:18:38,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 18:18:40,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 18:18:42,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:18:44,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:18:45,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 18:18:45,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:18:46,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=449233.3333333333, ans=0.125 2023-09-29 18:18:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:18:47,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:18:47,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 18:18:49,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 18:18:52,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:18:53,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 18:18:56,399 INFO [train.py:1039] (3/4) Epoch 13, batch 3650, loss[loss=0.2051, simple_loss=0.2687, pruned_loss=0.0707, over 23804.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2639, pruned_loss=0.06039, over 4723779.24 frames. ], batch size: 212, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:18:58,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 18:18:59,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:19:03,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 18:19:05,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 18:19:11,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:11,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:19:13,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:19:15,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=449366.6666666667, ans=0.2 2023-09-29 18:19:16,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 18:19:16,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:19:16,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 18:19:18,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:19:18,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:20,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 18:19:20,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:19:20,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:19:22,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:23,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:19:28,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 18:19:28,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 18:19:30,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:19:31,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 18:19:32,321 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.81 vs. limit=6.0 2023-09-29 18:19:33,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:33,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:19:39,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:19:41,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:41,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:19:43,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:19:43,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:19:44,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=449433.3333333333, ans=0.0 2023-09-29 18:19:45,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:19:48,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:19:49,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:19:51,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:19:53,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:19:53,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=449500.0, ans=0.0 2023-09-29 18:19:54,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:19:56,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:02,027 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 18:20:05,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:05,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:06,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:20:06,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:08,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:20:09,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:11,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 18:20:11,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:15,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:20:17,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:20:17,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:20:20,471 INFO [train.py:1039] (3/4) Epoch 13, batch 3700, loss[loss=0.19, simple_loss=0.2791, pruned_loss=0.05048, over 24330.00 frames. ], tot_loss[loss=0.1934, simple_loss=0.2652, pruned_loss=0.06084, over 4726532.88 frames. ], batch size: 74, lr: 7.92e-03, grad_scale: 32.0 2023-09-29 18:20:20,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:20,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 18:20:20,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:20:22,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:20:22,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:20:25,577 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.943e+02 2.154e+02 2.473e+02 4.046e+02, threshold=4.307e+02, percent-clipped=0.0 2023-09-29 18:20:25,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:20:30,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:20:32,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:33,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:20:33,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:20:34,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.33 vs. limit=15.0 2023-09-29 18:20:35,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:20:38,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:20:39,771 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 18:20:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:20:49,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:20:50,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:20:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 18:20:51,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:20:54,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:56,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 18:20:57,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:20:58,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:21:01,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:02,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:21:04,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:21:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:21:08,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 18:21:09,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:21:09,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 18:21:13,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=449833.3333333333, ans=0.125 2023-09-29 18:21:16,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:21:17,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:21:20,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:20,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 18:21:23,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:21:23,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:21:23,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:23,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:21:25,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=449900.0, ans=0.2 2023-09-29 18:21:27,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:21:29,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 18:21:31,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 18:21:31,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:21:31,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:32,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:21:34,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:21:37,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:21:39,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:21:40,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:21:41,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.02 vs. limit=15.0 2023-09-29 18:21:42,145 INFO [train.py:1039] (3/4) Epoch 13, batch 3750, loss[loss=0.1725, simple_loss=0.246, pruned_loss=0.04944, over 24433.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2659, pruned_loss=0.06093, over 4725734.15 frames. ], batch size: 58, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:21:42,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 18:21:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:21:45,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:21:47,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 18:21:47,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:21:49,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:50,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:21:53,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:21:53,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=449966.6666666667, ans=0.0 2023-09-29 18:21:57,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:01,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:22:01,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:22:04,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:22:08,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:08,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 18:22:10,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:12,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:12,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:22:15,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 18:22:18,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 18:22:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:22:21,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:22:23,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:22:25,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=450100.0, ans=0.125 2023-09-29 18:22:26,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=450100.0, ans=0.125 2023-09-29 18:22:29,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:31,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:22:34,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 18:22:36,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=450166.6666666667, ans=0.125 2023-09-29 18:22:39,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:22:42,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.78 vs. limit=15.0 2023-09-29 18:22:43,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:22:44,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:22:47,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:22:51,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 18:22:52,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:22:55,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:22:57,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:22:59,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:22:59,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=450233.3333333333, ans=0.1 2023-09-29 18:23:03,487 INFO [train.py:1039] (3/4) Epoch 13, batch 3800, loss[loss=0.2027, simple_loss=0.2799, pruned_loss=0.0628, over 24389.00 frames. ], tot_loss[loss=0.1938, simple_loss=0.266, pruned_loss=0.0608, over 4722229.52 frames. ], batch size: 77, lr: 7.91e-03, grad_scale: 32.0 2023-09-29 18:23:06,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:23:08,366 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.674e+02 1.938e+02 2.125e+02 2.387e+02 3.006e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 18:23:12,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:12,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 18:23:13,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 18:23:13,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:17,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:19,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:23:22,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:23:22,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:23,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:23:25,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:23:25,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:23:25,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:26,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 18:23:29,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 18:23:31,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:23:36,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:23:39,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:23:39,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:23:41,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:23:41,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:41,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=450433.3333333333, ans=0.125 2023-09-29 18:23:42,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:23:43,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:23:45,057 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:23:48,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:23:48,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 18:23:50,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=450433.3333333333, ans=0.0 2023-09-29 18:23:51,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:23:57,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:24:03,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:05,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 18:24:05,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 18:24:07,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:08,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:24:10,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:10,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 18:24:13,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 18:24:13,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 18:24:13,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=450566.6666666667, ans=0.125 2023-09-29 18:24:15,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:15,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:24:19,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=450566.6666666667, ans=0.125 2023-09-29 18:24:23,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:24:24,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:24:25,918 INFO [train.py:1039] (3/4) Epoch 13, batch 3850, loss[loss=0.1902, simple_loss=0.2694, pruned_loss=0.0555, over 23670.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2645, pruned_loss=0.06035, over 4705446.01 frames. ], batch size: 85, lr: 7.91e-03, grad_scale: 16.0 2023-09-29 18:24:27,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:24:28,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=450633.3333333333, ans=0.1 2023-09-29 18:24:29,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 18:24:29,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:24:30,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:35,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:24:37,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:40,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:24:42,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 18:24:48,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:49,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=450700.0, ans=0.125 2023-09-29 18:24:50,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:24:52,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:24:54,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:24:55,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:24:57,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:24:57,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:24:57,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:24:58,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:03,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:03,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:25:03,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 18:25:03,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 18:25:05,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:05,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:09,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:09,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 18:25:09,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=450766.6666666667, ans=0.125 2023-09-29 18:25:11,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 18:25:13,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:15,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=450833.3333333333, ans=0.125 2023-09-29 18:25:16,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 18:25:19,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 18:25:22,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:24,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:25:29,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:29,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 18:25:29,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=450900.0, ans=0.125 2023-09-29 18:25:32,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 18:25:35,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:35,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:36,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=450900.0, ans=0.125 2023-09-29 18:25:38,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:25:38,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:25:40,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:40,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:25:40,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 18:25:41,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.15 vs. limit=15.0 2023-09-29 18:25:42,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:25:43,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 18:25:43,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:43,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:47,077 INFO [train.py:1039] (3/4) Epoch 13, batch 3900, loss[loss=0.2028, simple_loss=0.2743, pruned_loss=0.06563, over 23456.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2631, pruned_loss=0.05946, over 4710086.68 frames. ], batch size: 93, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:25:47,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:25:47,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:48,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:25:50,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:25:50,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:25:51,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:25:51,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 18:25:53,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:25:54,604 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.889e+02 2.168e+02 2.543e+02 3.582e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 18:25:56,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:25:57,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:25:57,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:25:57,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:03,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:26:04,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:06,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:26:07,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 18:26:07,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:09,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 18:26:11,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:26:11,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 18:26:12,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 18:26:17,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=15.0 2023-09-29 18:26:18,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:20,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:26:20,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:26:22,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:26:23,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=451100.0, ans=0.2 2023-09-29 18:26:25,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:26:26,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:26:29,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:26:29,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:26:30,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451100.0, ans=0.1 2023-09-29 18:26:31,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:26:33,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=451166.6666666667, ans=0.125 2023-09-29 18:26:35,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=451166.6666666667, ans=0.125 2023-09-29 18:26:36,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:26:36,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:26:42,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:26:44,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:26:49,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=451233.3333333333, ans=0.2 2023-09-29 18:26:51,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=451233.3333333333, ans=0.0 2023-09-29 18:26:51,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=451233.3333333333, ans=0.1 2023-09-29 18:26:57,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:26:57,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451233.3333333333, ans=0.1 2023-09-29 18:27:00,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:00,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 18:27:01,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 18:27:01,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:27:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 18:27:03,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:27:05,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=451300.0, ans=0.125 2023-09-29 18:27:06,305 INFO [train.py:1039] (3/4) Epoch 13, batch 3950, loss[loss=0.2227, simple_loss=0.2894, pruned_loss=0.07796, over 23859.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2643, pruned_loss=0.0598, over 4720074.95 frames. ], batch size: 212, lr: 7.90e-03, grad_scale: 8.0 2023-09-29 18:27:06,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 18:27:12,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:27:14,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 18:27:15,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:27:17,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:27:18,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:27:20,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=451300.0, ans=0.125 2023-09-29 18:27:23,671 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 18:27:25,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 18:27:25,194 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 18:27:25,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:27:28,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:29,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:27:29,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:27:32,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 18:27:35,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:27:35,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:27:35,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:27:36,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:27:36,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:27:50,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:27:50,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:27:57,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 18:27:58,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=451500.0, ans=0.125 2023-09-29 18:28:05,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 18:28:05,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 18:28:05,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:28:05,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:28:11,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=451566.6666666667, ans=0.125 2023-09-29 18:28:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:28:13,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:28:13,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:28:13,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:28:14,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 18:28:20,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:28:22,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:28:26,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 18:28:29,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=451633.3333333333, ans=0.125 2023-09-29 18:28:29,455 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.68 vs. limit=15.0 2023-09-29 18:28:30,236 INFO [train.py:1039] (3/4) Epoch 13, batch 4000, loss[loss=0.2076, simple_loss=0.2823, pruned_loss=0.06646, over 23461.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2648, pruned_loss=0.05988, over 4732188.43 frames. ], batch size: 93, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:28:37,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:38,698 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.910e+02 2.120e+02 2.727e+02 3.930e+02, threshold=4.239e+02, percent-clipped=0.0 2023-09-29 18:28:43,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:48,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:49,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:28:49,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:28:51,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 18:28:51,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:28:52,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=451700.0, ans=0.125 2023-09-29 18:28:53,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 18:28:53,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:28:53,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 18:28:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:28:58,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:28:58,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:28:58,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:29:00,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:00,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:29:01,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:29:03,411 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 18:29:04,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:29:04,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:07,966 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 18:29:09,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:29:09,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:12,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=451766.6666666667, ans=0.1 2023-09-29 18:29:16,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 18:29:17,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:29:19,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:29:20,985 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 18:29:24,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:29:24,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 18:29:25,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:29:27,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:27,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:29:28,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:29:30,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:29:30,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:29:32,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 18:29:33,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:29:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 18:29:37,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=451900.0, ans=0.0 2023-09-29 18:29:40,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:29:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 18:29:45,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=451900.0, ans=0.125 2023-09-29 18:29:46,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:29:46,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:46,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=451900.0, ans=0.025 2023-09-29 18:29:48,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:29:48,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:29:50,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=451966.6666666667, ans=0.05 2023-09-29 18:29:51,466 INFO [train.py:1039] (3/4) Epoch 13, batch 4050, loss[loss=0.1882, simple_loss=0.266, pruned_loss=0.05519, over 24644.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2652, pruned_loss=0.05995, over 4727995.23 frames. ], batch size: 68, lr: 7.90e-03, grad_scale: 16.0 2023-09-29 18:29:54,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:29:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:29:56,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 18:30:00,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:30:00,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:01,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:30:03,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:04,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:04,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=451966.6666666667, ans=0.125 2023-09-29 18:30:08,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:30:11,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:13,215 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 18:30:14,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:30:16,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:30:19,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:20,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:30:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 18:30:24,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=452100.0, ans=0.125 2023-09-29 18:30:25,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 18:30:25,578 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 18:30:28,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:30:35,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 18:30:37,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:30:40,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:43,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:30:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:30:44,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:30:46,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:30:50,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 18:30:50,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:30:52,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:30:54,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 18:30:58,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:31:07,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 18:31:07,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:07,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:31:10,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 18:31:10,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=452233.3333333333, ans=0.1 2023-09-29 18:31:12,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 18:31:12,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:14,164 INFO [train.py:1039] (3/4) Epoch 13, batch 4100, loss[loss=0.2124, simple_loss=0.2758, pruned_loss=0.07446, over 23422.00 frames. ], tot_loss[loss=0.1944, simple_loss=0.2666, pruned_loss=0.06111, over 4709926.60 frames. ], batch size: 285, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:31:14,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:14,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:31:18,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.20 vs. limit=15.0 2023-09-29 18:31:21,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=452300.0, ans=0.2 2023-09-29 18:31:22,213 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.980e+02 2.231e+02 2.743e+02 3.910e+02, threshold=4.461e+02, percent-clipped=0.0 2023-09-29 18:31:22,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 18:31:24,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:24,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=452300.0, ans=0.2 2023-09-29 18:31:25,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 18:31:27,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 18:31:27,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=452300.0, ans=0.125 2023-09-29 18:31:28,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 18:31:28,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:29,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:30,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:31:30,179 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 18:31:33,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:34,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:31:34,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:31:38,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:31:41,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:31:43,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:31:43,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:31:43,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 18:31:43,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:31:43,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:31:43,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:45,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:31:45,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 18:31:48,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:31:50,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 18:31:51,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=452433.3333333333, ans=0.125 2023-09-29 18:31:52,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:31:55,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:31:55,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 18:31:56,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:31:57,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:31:58,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:32:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 18:32:00,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:32:01,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:32:04,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 18:32:04,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:06,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:08,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=452500.0, ans=0.05 2023-09-29 18:32:09,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:09,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=452500.0, ans=0.125 2023-09-29 18:32:15,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:16,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:19,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:32:27,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:27,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:32:31,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:32:34,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:32:36,417 INFO [train.py:1039] (3/4) Epoch 13, batch 4150, loss[loss=0.1693, simple_loss=0.2385, pruned_loss=0.05004, over 24293.00 frames. ], tot_loss[loss=0.1948, simple_loss=0.2665, pruned_loss=0.06151, over 4692126.54 frames. ], batch size: 56, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:32:38,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:32:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:32:39,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:32:39,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:32:41,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 18:32:43,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:44,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 18:32:46,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 18:32:46,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 18:32:48,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:32:53,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:32:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:32:57,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:32:59,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:00,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:33:02,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:33:02,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:33:03,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:33:08,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:13,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:14,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 18:33:16,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 18:33:16,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:33:18,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 18:33:18,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:33:18,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:21,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:21,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:25,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 18:33:29,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:31,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:33:31,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 18:33:33,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:33:33,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 18:33:37,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:33:39,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:33:40,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:42,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 18:33:42,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:33:42,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 18:33:44,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 18:33:45,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 18:33:45,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:45,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:33:47,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:33:48,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 18:33:48,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:33:48,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 18:33:50,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:33:51,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:33:51,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 18:33:52,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 18:33:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:34:00,833 INFO [train.py:1039] (3/4) Epoch 13, batch 4200, loss[loss=0.1948, simple_loss=0.2775, pruned_loss=0.05601, over 24436.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2654, pruned_loss=0.06098, over 4697111.45 frames. ], batch size: 69, lr: 7.89e-03, grad_scale: 16.0 2023-09-29 18:34:01,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 18:34:04,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:34:06,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:07,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:34:09,034 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.933e+02 2.202e+02 2.518e+02 4.955e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-29 18:34:09,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:09,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:34:09,897 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.83 vs. limit=15.0 2023-09-29 18:34:10,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 18:34:15,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 18:34:15,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:17,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:21,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:34:24,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:34:26,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:34:27,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:28,072 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.69 vs. limit=15.0 2023-09-29 18:34:28,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 18:34:28,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:34:28,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:30,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:34:30,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:34:31,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:34:35,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 18:34:35,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:34:36,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=453100.0, ans=0.1 2023-09-29 18:34:39,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 18:34:41,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:34:44,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:34:44,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:34:44,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=453100.0, ans=0.1 2023-09-29 18:34:47,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:34:47,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 18:34:47,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:34:49,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:34:55,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:34:56,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:00,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=453166.6666666667, ans=0.0 2023-09-29 18:35:02,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:35:05,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 18:35:07,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:35:12,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:35:14,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:17,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 18:35:19,782 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.01 vs. limit=10.0 2023-09-29 18:35:22,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 18:35:23,579 INFO [train.py:1039] (3/4) Epoch 13, batch 4250, loss[loss=0.1992, simple_loss=0.2586, pruned_loss=0.06988, over 23749.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2641, pruned_loss=0.06054, over 4699427.61 frames. ], batch size: 164, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:35:25,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:35:25,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 18:35:27,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:37,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:35:37,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 18:35:37,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:35:40,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:35:43,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:35:45,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=453366.6666666667, ans=0.0 2023-09-29 18:35:49,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:49,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:51,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:35:51,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:35:52,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:52,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:35:54,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:35:57,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:35:58,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:00,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 18:36:03,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 18:36:03,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:04,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:04,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:36:06,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:36:06,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:06,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:36:11,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:36:11,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:36:12,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=453433.3333333333, ans=0.1 2023-09-29 18:36:18,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:18,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:20,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 18:36:20,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:36:20,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=453500.0, ans=0.125 2023-09-29 18:36:22,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 18:36:24,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:36:25,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:36:28,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:28,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:36:28,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 18:36:30,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:36:31,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:36:34,322 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.14 vs. limit=6.0 2023-09-29 18:36:36,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:36:38,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:36:39,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:36:42,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:36:42,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:43,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:36:44,151 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:36:45,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:36:45,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 18:36:46,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:50,450 INFO [train.py:1039] (3/4) Epoch 13, batch 4300, loss[loss=0.1887, simple_loss=0.2762, pruned_loss=0.05059, over 24546.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.264, pruned_loss=0.06048, over 4709280.67 frames. ], batch size: 71, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:36:52,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:36:52,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:36:55,371 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.66 vs. limit=15.0 2023-09-29 18:36:57,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:36:58,705 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 2.045e+02 2.305e+02 2.959e+02 4.581e+02, threshold=4.610e+02, percent-clipped=2.0 2023-09-29 18:36:59,183 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:37:00,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=453633.3333333333, ans=0.0 2023-09-29 18:37:02,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=453633.3333333333, ans=0.125 2023-09-29 18:37:02,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=453633.3333333333, ans=0.125 2023-09-29 18:37:04,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:37:04,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 18:37:06,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:37:08,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:37:08,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:37:08,190 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 18:37:11,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:37:14,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:19,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 18:37:19,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:37:20,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 18:37:22,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:37:23,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.97 vs. limit=12.0 2023-09-29 18:37:23,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:37:29,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:37:29,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:37:31,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:37:31,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:33,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:37:33,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 18:37:34,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 18:37:37,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:37:40,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:40,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:37:40,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:41,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:37:41,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 18:37:41,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 18:37:42,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 18:37:44,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:37:44,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 18:37:44,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 18:37:46,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=453833.3333333333, ans=0.0 2023-09-29 18:37:47,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:47,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=453833.3333333333, ans=0.0 2023-09-29 18:37:49,092 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 18:37:50,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:37:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:37:52,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:37:54,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 18:37:55,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:37:55,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:37:55,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:37:55,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:37:57,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:37:58,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:38:02,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:04,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:04,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:38:09,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 18:38:10,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 18:38:11,801 INFO [train.py:1039] (3/4) Epoch 13, batch 4350, loss[loss=0.1959, simple_loss=0.2778, pruned_loss=0.05702, over 24178.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2658, pruned_loss=0.06099, over 4710799.78 frames. ], batch size: 80, lr: 7.88e-03, grad_scale: 16.0 2023-09-29 18:38:15,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:18,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:21,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:38:21,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:38:25,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:38:31,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:38:32,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:38:32,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:38:36,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:38:39,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:38:40,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:38:46,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 18:38:48,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:38:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:53,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=454100.0, ans=0.125 2023-09-29 18:38:53,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.83 vs. limit=15.0 2023-09-29 18:38:54,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:38:56,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=454100.0, ans=0.125 2023-09-29 18:38:57,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 18:39:00,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:02,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:39:07,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=454166.6666666667, ans=10.0 2023-09-29 18:39:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 18:39:08,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:10,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:39:10,479 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 18:39:12,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 18:39:12,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:14,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:16,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:39:16,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:39:16,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=454233.3333333333, ans=0.125 2023-09-29 18:39:19,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:39:19,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:39:22,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 18:39:22,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:22,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:22,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:23,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 18:39:25,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 18:39:25,293 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 18:39:25,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 18:39:28,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:39:29,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:39:29,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:31,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:39:32,925 INFO [train.py:1039] (3/4) Epoch 13, batch 4400, loss[loss=0.199, simple_loss=0.2622, pruned_loss=0.06788, over 23837.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2664, pruned_loss=0.06106, over 4712746.94 frames. ], batch size: 195, lr: 7.88e-03, grad_scale: 32.0 2023-09-29 18:39:33,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 18:39:34,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=454300.0, ans=0.125 2023-09-29 18:39:36,021 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 18:39:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:39,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:39,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:41,141 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.590e+02 1.889e+02 2.108e+02 2.568e+02 3.749e+02, threshold=4.217e+02, percent-clipped=0.0 2023-09-29 18:39:42,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:39:44,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 18:39:44,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 18:39:44,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 18:39:46,074 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 18:39:46,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 18:39:46,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:39:50,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 18:39:51,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:39:51,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:39:51,796 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 18:39:55,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:39:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 18:39:55,377 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 18:39:58,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 18:39:59,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 18:39:59,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 18:39:59,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:01,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:40:02,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:03,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.01 vs. limit=15.0 2023-09-29 18:40:04,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 18:40:04,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 18:40:04,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:04,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=454433.3333333333, ans=0.0 2023-09-29 18:40:07,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:40:07,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:08,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=454433.3333333333, ans=0.0 2023-09-29 18:40:09,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:09,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:40:09,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 18:40:10,774 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 18:40:11,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=454433.3333333333, ans=0.0 2023-09-29 18:40:14,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:24,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:40:28,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 18:40:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:40:34,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:37,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:40:37,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 18:40:37,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:40:37,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:40:37,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:40:39,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:40:41,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 18:40:44,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 18:40:45,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 18:40:45,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:40:45,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 18:40:47,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:40:51,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:40:54,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 18:40:56,116 INFO [train.py:1039] (3/4) Epoch 13, batch 4450, loss[loss=0.1883, simple_loss=0.2599, pruned_loss=0.05836, over 23514.00 frames. ], tot_loss[loss=0.1943, simple_loss=0.2666, pruned_loss=0.061, over 4711937.49 frames. ], batch size: 134, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:40:56,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:40:58,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:40:58,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:41:05,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=454633.3333333333, ans=0.125 2023-09-29 18:41:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:06,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:41:12,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:13,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:41:17,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:41:17,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 18:41:18,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:18,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:18,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:41:18,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 18:41:22,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:41:24,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=454700.0, ans=0.0 2023-09-29 18:41:28,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:28,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:29,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:41:31,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:41:31,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:41:35,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 18:41:37,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 18:41:37,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 18:41:37,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:41:41,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:42,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 18:41:45,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:41:50,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:50,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 18:41:50,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:41:50,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:41:50,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:41:50,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:41:53,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:41:55,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 18:41:57,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 18:41:59,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:42:02,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:04,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:42:05,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:07,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:42:07,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:42:10,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 18:42:13,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:42:16,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=454900.0, ans=10.0 2023-09-29 18:42:17,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:18,815 INFO [train.py:1039] (3/4) Epoch 13, batch 4500, loss[loss=0.2004, simple_loss=0.2816, pruned_loss=0.05955, over 24361.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2668, pruned_loss=0.06128, over 4703618.88 frames. ], batch size: 74, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:42:19,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 18:42:19,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 18:42:20,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:23,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=454966.6666666667, ans=0.125 2023-09-29 18:42:25,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:42:25,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:42:26,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-09-29 18:42:26,532 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 2.036e+02 2.219e+02 2.497e+02 4.181e+02, threshold=4.438e+02, percent-clipped=0.0 2023-09-29 18:42:28,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 18:42:28,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:42:30,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:30,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:42:43,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:42:43,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:42:48,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:42:49,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:42:51,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:42:57,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:43:02,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:43:07,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:43:11,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:43:12,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 18:43:13,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:13,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:14,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:43:17,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:43:17,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 18:43:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 18:43:17,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:22,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:43:22,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:43:26,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:43:29,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:43:29,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:43:31,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 18:43:32,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 18:43:32,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 18:43:32,983 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:43:38,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 18:43:40,820 INFO [train.py:1039] (3/4) Epoch 13, batch 4550, loss[loss=0.1974, simple_loss=0.278, pruned_loss=0.05842, over 24406.00 frames. ], tot_loss[loss=0.1932, simple_loss=0.2653, pruned_loss=0.06059, over 4715815.75 frames. ], batch size: 77, lr: 7.87e-03, grad_scale: 32.0 2023-09-29 18:43:41,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 18:43:44,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:43:46,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:47,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:43:49,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:43:54,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:43:55,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:43:58,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:43:58,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:43:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:00,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:02,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:44:04,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:07,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 18:44:07,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=455366.6666666667, ans=0.0 2023-09-29 18:44:08,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 18:44:10,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:44:11,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 18:44:15,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=455433.3333333333, ans=0.125 2023-09-29 18:44:16,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 18:44:17,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:17,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=455433.3333333333, ans=0.1 2023-09-29 18:44:20,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 18:44:20,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=455433.3333333333, ans=0.07 2023-09-29 18:44:20,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=455433.3333333333, ans=0.2 2023-09-29 18:44:23,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:44:25,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:25,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:44:28,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 18:44:31,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:34,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:34,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:44:35,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=455500.0, ans=0.05 2023-09-29 18:44:36,979 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.59 vs. limit=15.0 2023-09-29 18:44:37,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:37,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 18:44:39,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 18:44:39,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:44:39,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 18:44:43,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 18:44:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:44:45,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:44:45,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:44:46,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:44:47,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:44:48,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:44:50,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 18:44:52,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:44:54,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:44:54,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 18:44:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:44:54,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 18:44:57,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:44:57,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:45:00,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:45:00,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:45:01,775 INFO [train.py:1039] (3/4) Epoch 13, batch 4600, loss[loss=0.1992, simple_loss=0.2661, pruned_loss=0.06615, over 23723.00 frames. ], tot_loss[loss=0.1922, simple_loss=0.2641, pruned_loss=0.06013, over 4719558.68 frames. ], batch size: 179, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:45:01,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 18:45:03,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:45:05,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:45:05,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=455633.3333333333, ans=0.2 2023-09-29 18:45:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:07,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:45:09,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=455633.3333333333, ans=0.2 2023-09-29 18:45:09,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=455633.3333333333, ans=0.1 2023-09-29 18:45:10,374 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.841e+02 2.065e+02 2.321e+02 3.867e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-29 18:45:10,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:45:10,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:45:10,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=455633.3333333333, ans=0.0 2023-09-29 18:45:12,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:12,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=12.0 2023-09-29 18:45:13,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 18:45:15,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:45:19,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:45:21,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:22,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:30,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 18:45:31,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:38,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:45:38,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:45:44,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 18:45:44,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:45:44,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:45:49,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:45:49,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:45:51,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:45:54,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 18:45:58,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 18:46:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:03,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:03,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=455833.3333333333, ans=0.07 2023-09-29 18:46:05,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:05,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 18:46:06,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:06,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 18:46:06,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:08,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:09,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.63 vs. limit=10.0 2023-09-29 18:46:09,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:09,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:46:11,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:11,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 18:46:13,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 18:46:13,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 18:46:13,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:16,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:16,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:46:23,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=455966.6666666667, ans=0.125 2023-09-29 18:46:24,453 INFO [train.py:1039] (3/4) Epoch 13, batch 4650, loss[loss=0.1988, simple_loss=0.2765, pruned_loss=0.06058, over 24640.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2638, pruned_loss=0.05975, over 4728968.09 frames. ], batch size: 68, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:46:24,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:46:28,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:28,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:28,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:46:28,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:46:30,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:46:30,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:46:31,327 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.79 vs. limit=12.0 2023-09-29 18:46:35,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 18:46:39,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:46:43,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 18:46:43,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:46:43,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 18:46:43,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:46:44,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 18:46:44,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 18:46:46,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:46,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:46:49,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:46:50,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:52,448 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 18:46:54,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:46:55,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 18:46:57,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=456100.0, ans=0.1 2023-09-29 18:46:58,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:46:58,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:47:00,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 18:47:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:04,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:47:09,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:10,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.94 vs. limit=15.0 2023-09-29 18:47:13,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:15,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:15,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=456166.6666666667, ans=0.125 2023-09-29 18:47:17,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:47:18,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:47:20,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 18:47:21,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 18:47:22,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 18:47:22,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 18:47:23,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:29,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:47:29,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:31,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 18:47:31,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:31,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:31,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:47:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:47:37,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:47:37,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:47:39,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:47:40,060 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.64 vs. limit=15.0 2023-09-29 18:47:40,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:41,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.82 vs. limit=6.0 2023-09-29 18:47:42,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:47:42,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:47:42,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 18:47:43,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 18:47:45,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 18:47:46,879 INFO [train.py:1039] (3/4) Epoch 13, batch 4700, loss[loss=0.1831, simple_loss=0.2534, pruned_loss=0.05636, over 23534.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2645, pruned_loss=0.05985, over 4721921.63 frames. ], batch size: 134, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:47:52,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:47:53,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:47:54,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:47:55,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 2.032e+02 2.349e+02 2.827e+02 4.344e+02, threshold=4.699e+02, percent-clipped=1.0 2023-09-29 18:47:55,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:47:56,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 18:48:02,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 18:48:02,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 18:48:06,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:08,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:48:08,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:48:08,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.70 vs. limit=15.0 2023-09-29 18:48:12,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:12,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=456366.6666666667, ans=0.125 2023-09-29 18:48:20,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:48:20,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 18:48:21,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=456433.3333333333, ans=0.0 2023-09-29 18:48:23,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:48:31,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 18:48:31,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=456433.3333333333, ans=0.1 2023-09-29 18:48:33,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:48:36,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:39,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 18:48:41,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:48:46,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:48:46,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 18:48:48,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:49,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:52,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:48:52,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:48:54,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 18:48:55,787 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 18:48:55,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:48:57,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=456566.6666666667, ans=0.1 2023-09-29 18:48:59,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:48:59,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 18:48:59,939 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.76 vs. limit=12.0 2023-09-29 18:49:00,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:49:04,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 18:49:07,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:49:07,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:10,435 INFO [train.py:1039] (3/4) Epoch 13, batch 4750, loss[loss=0.1562, simple_loss=0.2365, pruned_loss=0.03794, over 24607.00 frames. ], tot_loss[loss=0.192, simple_loss=0.265, pruned_loss=0.05954, over 4728998.56 frames. ], batch size: 60, lr: 7.86e-03, grad_scale: 32.0 2023-09-29 18:49:11,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=456633.3333333333, ans=0.125 2023-09-29 18:49:12,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:13,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:49:15,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 18:49:15,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:18,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 18:49:20,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:49:20,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:49:22,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:29,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 18:49:33,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:49:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 18:49:36,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:49:39,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:49:39,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:49:40,961 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 18:49:40,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 18:49:44,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.92 vs. limit=15.0 2023-09-29 18:49:45,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=456766.6666666667, ans=0.0 2023-09-29 18:49:48,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 18:49:50,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.67 vs. limit=22.5 2023-09-29 18:49:51,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:49:53,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:49:56,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:49:56,686 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 18:49:56,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:49:59,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:50:02,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 18:50:03,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=456833.3333333333, ans=0.1 2023-09-29 18:50:04,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 18:50:05,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 18:50:06,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:06,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:50:07,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:07,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:50:07,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 18:50:11,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 18:50:14,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:17,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:50:17,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 18:50:17,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:18,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:19,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=456900.0, ans=0.125 2023-09-29 18:50:21,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 18:50:22,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:23,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 18:50:25,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:25,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 18:50:27,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 18:50:28,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 18:50:31,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:50:31,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:50:33,039 INFO [train.py:1039] (3/4) Epoch 13, batch 4800, loss[loss=0.2185, simple_loss=0.2813, pruned_loss=0.07781, over 23322.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2655, pruned_loss=0.05987, over 4739394.60 frames. ], batch size: 285, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:50:33,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 18:50:33,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=456966.6666666667, ans=0.0 2023-09-29 18:50:39,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=456966.6666666667, ans=0.125 2023-09-29 18:50:40,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:40,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:50:43,542 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.667e+02 1.965e+02 2.180e+02 2.463e+02 4.053e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 18:50:47,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 18:50:48,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:50:49,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:50:50,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 18:50:50,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:50:51,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:50:53,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 18:50:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:50:59,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 18:50:59,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:50:59,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 18:50:59,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:03,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:03,695 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:51:06,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:51:08,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:51:09,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 18:51:12,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:12,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 18:51:14,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 18:51:14,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:15,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:51:15,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:51:15,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:15,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:51:20,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:51:20,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:51:25,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:25,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=457166.6666666667, ans=0.125 2023-09-29 18:51:28,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:31,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:36,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 18:51:36,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:36,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:38,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:51:39,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:51:42,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:51:44,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:51:44,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:46,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:51:46,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 18:51:48,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:51:50,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:51:51,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:51,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:51:52,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=457233.3333333333, ans=0.0 2023-09-29 18:51:53,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 18:51:54,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 18:51:54,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:54,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:51:56,739 INFO [train.py:1039] (3/4) Epoch 13, batch 4850, loss[loss=0.1787, simple_loss=0.2566, pruned_loss=0.05039, over 24495.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2662, pruned_loss=0.05993, over 4742224.74 frames. ], batch size: 66, lr: 7.85e-03, grad_scale: 32.0 2023-09-29 18:51:56,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:51:56,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:51:57,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=457300.0, ans=0.0 2023-09-29 18:52:00,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=457300.0, ans=0.2 2023-09-29 18:52:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:52:07,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 18:52:08,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:13,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:14,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 18:52:14,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:20,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:52:21,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:52:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:52:23,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 18:52:27,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:52:29,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:52:29,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 18:52:30,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 18:52:30,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 18:52:35,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 18:52:35,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:39,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 18:52:39,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 18:52:40,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:52:44,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=457500.0, ans=0.0 2023-09-29 18:52:49,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:52:49,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 18:52:50,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:52:50,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 18:52:52,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:52:54,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 18:52:54,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:52:56,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 18:52:56,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:52:58,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:52:58,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 18:53:07,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:14,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:53:14,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:18,647 INFO [train.py:1039] (3/4) Epoch 13, batch 4900, loss[loss=0.1835, simple_loss=0.2347, pruned_loss=0.06609, over 23386.00 frames. ], tot_loss[loss=0.1921, simple_loss=0.2648, pruned_loss=0.05968, over 4732695.20 frames. ], batch size: 285, lr: 7.85e-03, grad_scale: 16.0 2023-09-29 18:53:22,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 18:53:22,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:53:24,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.whiten.whitening_limit, batch_count=457633.3333333333, ans=12.0 2023-09-29 18:53:28,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:28,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=457633.3333333333, ans=0.04949747468305833 2023-09-29 18:53:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:29,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:53:32,645 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 2.076e+02 2.390e+02 2.815e+02 4.365e+02, threshold=4.780e+02, percent-clipped=1.0 2023-09-29 18:53:32,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 18:53:38,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 18:53:39,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.37 vs. limit=12.0 2023-09-29 18:53:43,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 18:53:43,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 18:53:45,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:45,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:53:45,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:53:46,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:53:46,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 18:53:46,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 18:53:50,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 18:53:51,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:53:52,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 18:53:53,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 18:53:55,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:53:56,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:53:58,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:53:58,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 18:54:00,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:54:01,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:54:01,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 18:54:01,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 18:54:06,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 18:54:06,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 18:54:08,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:08,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:54:09,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:09,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 18:54:09,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:54:11,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 18:54:14,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:15,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:54:17,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:54:20,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 18:54:21,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:54:23,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 18:54:23,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 18:54:31,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:33,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:54:35,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 18:54:35,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:35,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:54:37,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:40,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:54:40,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 18:54:40,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:54:41,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 18:54:41,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 18:54:43,210 INFO [train.py:1039] (3/4) Epoch 13, batch 4950, loss[loss=0.1926, simple_loss=0.2595, pruned_loss=0.06282, over 23806.00 frames. ], tot_loss[loss=0.191, simple_loss=0.2623, pruned_loss=0.05982, over 4699683.76 frames. ], batch size: 212, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:54:46,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:54:46,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 18:54:48,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=457966.6666666667, ans=0.125 2023-09-29 18:54:49,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 18:54:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 18:54:50,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 18:54:52,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 18:54:52,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:54:52,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:54:52,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 18:54:52,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:54:55,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:54:57,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:54:57,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:54:58,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:55:03,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:04,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:55:07,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=458033.3333333333, ans=0.0 2023-09-29 18:55:08,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 18:55:09,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=458033.3333333333, ans=0.035 2023-09-29 18:55:12,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=458033.3333333333, ans=0.125 2023-09-29 18:55:13,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:13,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 18:55:15,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:19,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:55:19,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 18:55:21,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 18:55:23,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:26,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 18:55:26,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:55:27,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:55:27,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:55:29,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 18:55:30,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:33,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 18:55:34,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:55:35,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:55:35,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:38,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 18:55:38,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:55:40,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 18:55:40,830 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:55:43,252 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.16 vs. limit=15.0 2023-09-29 18:55:44,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:55:45,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 18:55:45,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:55:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:55:47,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 18:55:47,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 18:55:48,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:55:50,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 18:55:50,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=458233.3333333333, ans=0.125 2023-09-29 18:55:51,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:55:51,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 18:55:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:55:58,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=458233.3333333333, ans=0.0 2023-09-29 18:56:01,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 18:56:02,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 18:56:05,427 INFO [train.py:1039] (3/4) Epoch 13, batch 5000, loss[loss=0.1767, simple_loss=0.2467, pruned_loss=0.05341, over 23477.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2621, pruned_loss=0.05966, over 4700456.86 frames. ], batch size: 134, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:56:07,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:07,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:09,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 18:56:10,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 18:56:13,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:56:14,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 18:56:14,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 18:56:14,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 18:56:16,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 18:56:16,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:17,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:19,080 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.601e+02 1.946e+02 2.294e+02 2.903e+02 4.132e+02, threshold=4.587e+02, percent-clipped=0.0 2023-09-29 18:56:19,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 18:56:19,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:19,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:20,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 18:56:20,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 18:56:22,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 18:56:23,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 18:56:23,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 18:56:23,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:25,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:56:25,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 18:56:25,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 18:56:25,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=458366.6666666667, ans=0.05 2023-09-29 18:56:27,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 18:56:28,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:28,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:30,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 18:56:30,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:56:30,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=458366.6666666667, ans=0.1 2023-09-29 18:56:31,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:31,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:56:33,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 18:56:36,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 18:56:36,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:56:39,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:56:39,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=458433.3333333333, ans=0.95 2023-09-29 18:56:43,416 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 18:56:47,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 18:56:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:56:49,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:56:52,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 18:56:52,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:56:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:56:53,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:56:55,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 18:56:56,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:56:59,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 18:57:01,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:07,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 18:57:09,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=458566.6666666667, ans=0.125 2023-09-29 18:57:11,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:57:23,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:23,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 18:57:25,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:25,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 18:57:25,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 18:57:25,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:28,277 INFO [train.py:1039] (3/4) Epoch 13, batch 5050, loss[loss=0.1938, simple_loss=0.2662, pruned_loss=0.06066, over 23371.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2633, pruned_loss=0.05993, over 4713492.76 frames. ], batch size: 93, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:57:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:57:29,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 18:57:31,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 18:57:32,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:57:34,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 18:57:34,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 18:57:36,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:38,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:57:39,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 18:57:41,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 18:57:42,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 18:57:44,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=458700.0, ans=0.0 2023-09-29 18:57:49,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=458700.0, ans=0.125 2023-09-29 18:57:53,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=458700.0, ans=0.0 2023-09-29 18:57:54,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 18:57:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 18:57:54,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:57:54,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 18:57:55,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=458700.0, ans=0.125 2023-09-29 18:57:57,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:57:58,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:57:58,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:57:58,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:57:58,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 18:58:00,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 18:58:01,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:03,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:07,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 18:58:07,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 18:58:10,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:12,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 18:58:12,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 18:58:13,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 18:58:14,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:15,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 18:58:17,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:58:20,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 18:58:20,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:20,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:58:21,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 18:58:21,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 18:58:23,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 18:58:26,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 18:58:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 18:58:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 18:58:33,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 18:58:33,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:58:33,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:35,830 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 18:58:39,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=458900.0, ans=0.0 2023-09-29 18:58:40,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:40,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 18:58:40,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:43,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:58:43,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:58:43,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=458900.0, ans=0.1 2023-09-29 18:58:44,458 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.93 vs. limit=15.0 2023-09-29 18:58:44,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 18:58:46,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 18:58:48,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:58:48,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:58:49,636 INFO [train.py:1039] (3/4) Epoch 13, batch 5100, loss[loss=0.2113, simple_loss=0.2886, pruned_loss=0.06696, over 23433.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2639, pruned_loss=0.0597, over 4717507.44 frames. ], batch size: 93, lr: 7.84e-03, grad_scale: 8.0 2023-09-29 18:58:49,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 18:58:52,916 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 18:58:54,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 18:58:54,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=458966.6666666667, ans=0.1 2023-09-29 18:58:57,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 18:58:57,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 18:58:59,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:01,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 18:59:02,847 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.978e+02 2.231e+02 2.583e+02 5.581e+02, threshold=4.463e+02, percent-clipped=1.0 2023-09-29 18:59:03,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 18:59:04,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 18:59:04,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 18:59:09,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 18:59:11,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 18:59:15,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 18:59:19,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 18:59:19,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:22,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 18:59:22,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 18:59:24,372 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 18:59:25,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:25,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:26,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 18:59:27,733 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=6.85 vs. limit=15.0 2023-09-29 18:59:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 18:59:28,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:28,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=459100.0, ans=0.125 2023-09-29 18:59:30,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 18:59:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 18:59:34,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 18:59:43,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 18:59:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 18:59:47,417 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 18:59:47,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 18:59:47,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=459166.6666666667, ans=0.125 2023-09-29 18:59:49,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 18:59:49,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 18:59:53,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 18:59:56,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 18:59:57,301 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.49 vs. limit=22.5 2023-09-29 18:59:59,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 18:59:59,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:00:02,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 19:00:04,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:00:04,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 19:00:10,222 INFO [train.py:1039] (3/4) Epoch 13, batch 5150, loss[loss=0.1845, simple_loss=0.2613, pruned_loss=0.05378, over 24654.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.2652, pruned_loss=0.06015, over 4727402.22 frames. ], batch size: 65, lr: 7.83e-03, grad_scale: 8.0 2023-09-29 19:00:11,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:00:11,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:00:11,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:00:11,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:00:11,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:00:12,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:00:14,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 19:00:14,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 19:00:16,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 19:00:16,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:00:16,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 19:00:17,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:17,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:00:20,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:23,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:00:27,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:00:27,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 19:00:29,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:00:30,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:00:32,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:00:32,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:32,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:32,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:00:32,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:00:33,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 19:00:34,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:00:34,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:00:37,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:00:38,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 19:00:40,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:00:42,243 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=459433.3333333333, ans=0.0 2023-09-29 19:00:47,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:00:49,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 19:00:51,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:00:59,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:00:59,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:04,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:05,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:07,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 19:01:12,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:01:12,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=459500.0, ans=0.5 2023-09-29 19:01:12,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=459500.0, ans=0.125 2023-09-29 19:01:13,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:01:13,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:01:15,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=459566.6666666667, ans=0.125 2023-09-29 19:01:16,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:18,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:01:18,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 19:01:22,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:01:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:01:28,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:01:28,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:01:30,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:01:30,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:01:30,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:01:32,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:01:33,947 INFO [train.py:1039] (3/4) Epoch 13, batch 5200, loss[loss=0.1803, simple_loss=0.2542, pruned_loss=0.05316, over 24295.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2657, pruned_loss=0.06022, over 4733848.68 frames. ], batch size: 61, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:01:35,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:01:35,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=459633.3333333333, ans=0.0 2023-09-29 19:01:37,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:01:40,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:42,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=459633.3333333333, ans=0.0 2023-09-29 19:01:43,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 19:01:46,059 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.634e+02 2.009e+02 2.232e+02 2.701e+02 3.997e+02, threshold=4.463e+02, percent-clipped=0.0 2023-09-29 19:01:46,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:01:46,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:49,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:01:49,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=459700.0, ans=0.125 2023-09-29 19:01:50,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:01:51,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:01:53,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 19:01:55,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:01:57,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:01:59,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 19:02:00,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:02:02,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:02:02,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 19:02:03,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 19:02:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 19:02:05,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:05,605 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 19:02:05,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:02:09,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:02:10,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 19:02:10,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:02:12,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:12,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=459766.6666666667, ans=0.2 2023-09-29 19:02:16,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 19:02:16,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 19:02:18,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 19:02:23,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 19:02:23,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:02:30,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:02:30,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:33,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 19:02:33,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:02:33,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=459833.3333333333, ans=0.1 2023-09-29 19:02:34,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.50 vs. limit=12.0 2023-09-29 19:02:34,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:02:34,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:34,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:02:39,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:40,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:02:44,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:02:44,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:02:44,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:45,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=459900.0, ans=0.0 2023-09-29 19:02:49,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:02:50,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 19:02:51,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:02:52,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:02:52,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:02:54,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:02:55,516 INFO [train.py:1039] (3/4) Epoch 13, batch 5250, loss[loss=0.1754, simple_loss=0.2498, pruned_loss=0.05053, over 24498.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2652, pruned_loss=0.06041, over 4723499.85 frames. ], batch size: 63, lr: 7.83e-03, grad_scale: 16.0 2023-09-29 19:02:55,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:02:56,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=459966.6666666667, ans=0.125 2023-09-29 19:02:57,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:03:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:01,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:03:01,795 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.01 vs. limit=15.0 2023-09-29 19:03:03,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:03:03,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=459966.6666666667, ans=0.2 2023-09-29 19:03:08,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=459966.6666666667, ans=0.125 2023-09-29 19:03:08,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=459966.6666666667, ans=0.2 2023-09-29 19:03:10,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:03:11,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:03:14,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:03:14,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:03:18,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 19:03:18,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:03:19,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:03:38,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=460100.0, ans=0.2 2023-09-29 19:03:40,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=460100.0, ans=0.125 2023-09-29 19:03:56,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=460233.3333333333, ans=0.5 2023-09-29 19:04:11,483 INFO [train.py:1039] (3/4) Epoch 13, batch 5300, loss[loss=0.1735, simple_loss=0.2468, pruned_loss=0.05006, over 24324.00 frames. ], tot_loss[loss=0.1918, simple_loss=0.2631, pruned_loss=0.06029, over 4717323.10 frames. ], batch size: 61, lr: 7.82e-03, grad_scale: 16.0 2023-09-29 19:04:19,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=460300.0, ans=0.125 2023-09-29 19:04:22,459 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.900e+02 2.152e+02 2.840e+02 4.256e+02, threshold=4.304e+02, percent-clipped=0.0 2023-09-29 19:04:26,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:04:26,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 19:04:26,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 19:04:26,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:26,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:27,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:27,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:27,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:27,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:04:27,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:27,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:04:28,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:04:28,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 19:04:28,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 19:04:28,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 19:04:28,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:04:28,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 19:04:28,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 19:04:29,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:29,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:29,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:29,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:30,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:04:30,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:30,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:04:30,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:30,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:04:30,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:04:30,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:04:30,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:30,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:04:32,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 19:04:32,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:04:32,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:04:32,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 19:04:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 19:04:33,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:04:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:04:33,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 19:04:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 19:04:33,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:34,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:04:34,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:04:34,590 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 19:04:34,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 19:04:34,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:04:34,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:04:35,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 19:04:35,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 19:04:35,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 19:04:35,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:04:43,509 INFO [train.py:1039] (3/4) Epoch 14, batch 0, loss[loss=0.182, simple_loss=0.2516, pruned_loss=0.05621, over 20672.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2516, pruned_loss=0.05621, over 20672.00 frames. ], batch size: 45, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:04:43,510 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 19:04:58,067 INFO [train.py:1071] (3/4) Epoch 14, validation: loss=0.2893, simple_loss=0.2709, pruned_loss=0.1538, over 1125622.00 frames. 2023-09-29 19:04:58,068 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 19:05:00,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 19:05:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:05:03,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:05:09,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:09,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:05:10,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:10,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 19:05:12,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=460446.6666666667, ans=0.0 2023-09-29 19:05:12,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=460446.6666666667, ans=0.0 2023-09-29 19:05:13,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 19:05:17,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:18,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:22,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:05:22,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=460446.6666666667, ans=0.2 2023-09-29 19:05:24,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:24,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:05:24,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:26,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 19:05:28,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:05:37,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:05:37,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:05:40,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 19:05:42,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=460513.3333333333, ans=0.125 2023-09-29 19:05:45,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:05:45,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:05:46,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:05:50,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:05:54,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=460580.0, ans=0.125 2023-09-29 19:05:55,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:00,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 19:06:05,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 19:06:05,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:05,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:07,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:06:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:10,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 19:06:13,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:06:14,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=460646.6666666667, ans=0.125 2023-09-29 19:06:17,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:06:20,126 INFO [train.py:1039] (3/4) Epoch 14, batch 50, loss[loss=0.2107, simple_loss=0.28, pruned_loss=0.07075, over 23706.00 frames. ], tot_loss[loss=0.1937, simple_loss=0.2668, pruned_loss=0.06029, over 1069340.12 frames. ], batch size: 85, lr: 7.54e-03, grad_scale: 32.0 2023-09-29 19:06:20,403 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 19:06:21,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:06:24,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:26,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:06:26,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 19:06:28,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:06:28,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:06:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:33,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:06:35,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:06:35,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=460780.0, ans=0.2 2023-09-29 19:06:38,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 19:06:38,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:38,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:40,301 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:06:45,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:06:47,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=460780.0, ans=0.125 2023-09-29 19:06:48,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 19:06:50,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 19:06:50,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:06:50,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=460780.0, ans=0.0 2023-09-29 19:06:52,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:06:52,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:06:52,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:06:53,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:06:55,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:06:55,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:07:01,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:04,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:04,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:07:05,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=460846.6666666667, ans=15.0 2023-09-29 19:07:06,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 19:07:06,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=460846.6666666667, ans=0.125 2023-09-29 19:07:09,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:07:11,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:07:11,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 19:07:11,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:12,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 19:07:20,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:07:20,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:07:20,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:22,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:25,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 19:07:26,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 19:07:26,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:26,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:07:28,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=460980.0, ans=0.1 2023-09-29 19:07:29,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:07:29,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:07:30,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 19:07:31,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 19:07:33,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 19:07:34,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.50 vs. limit=12.0 2023-09-29 19:07:35,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:35,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:07:35,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 19:07:35,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 19:07:37,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:07:38,452 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.985e+02 2.220e+02 2.670e+02 4.594e+02, threshold=4.441e+02, percent-clipped=1.0 2023-09-29 19:07:38,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:40,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:07:40,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:07:43,617 INFO [train.py:1039] (3/4) Epoch 14, batch 100, loss[loss=0.1866, simple_loss=0.2706, pruned_loss=0.05126, over 24283.00 frames. ], tot_loss[loss=0.1952, simple_loss=0.2679, pruned_loss=0.06125, over 1878784.45 frames. ], batch size: 74, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:07:43,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:07:45,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:07:50,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:07:51,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.30 vs. limit=10.0 2023-09-29 19:07:52,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 19:07:52,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:07:57,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:07:58,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:07:58,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:07:58,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:07:58,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:08:00,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 19:08:00,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:08:01,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:01,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:01,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:08:05,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 19:08:07,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:07,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:08,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:08:10,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:08:12,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=461113.3333333333, ans=0.125 2023-09-29 19:08:13,326 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 19:08:15,241 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 19:08:16,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:08:16,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:08:19,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:08:22,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:08:25,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:30,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:31,789 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 19:08:32,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=461246.6666666667, ans=0.2 2023-09-29 19:08:33,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:08:37,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:08:40,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:08:41,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:44,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:47,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:08:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:08:50,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=461313.3333333333, ans=0.0 2023-09-29 19:08:51,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.82 vs. limit=22.5 2023-09-29 19:08:51,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:53,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:08:54,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:54,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:08:54,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:08:56,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 19:08:56,508 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 19:08:57,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:08:58,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:09:00,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:00,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:00,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 19:09:00,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:09:01,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:09:01,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:03,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:05,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:05,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:09:05,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:09:06,631 INFO [train.py:1039] (3/4) Epoch 14, batch 150, loss[loss=0.1921, simple_loss=0.2781, pruned_loss=0.05304, over 24330.00 frames. ], tot_loss[loss=0.1947, simple_loss=0.2675, pruned_loss=0.06097, over 2517594.40 frames. ], batch size: 74, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:09:08,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:08,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=461380.0, ans=0.035 2023-09-29 19:09:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:09:10,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:11,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:13,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=461380.0, ans=0.125 2023-09-29 19:09:15,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:15,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=461380.0, ans=0.07 2023-09-29 19:09:16,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:19,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:09:19,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:24,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 19:09:24,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 19:09:24,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 19:09:27,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:09:27,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:09:29,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:09:29,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:09:29,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:30,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:31,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:09:32,679 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 19:09:34,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:09:40,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:40,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=461513.3333333333, ans=15.0 2023-09-29 19:09:43,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:09:43,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=461513.3333333333, ans=0.2 2023-09-29 19:09:44,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 19:09:47,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:09:47,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:09:47,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:09:51,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:09:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:09:54,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:09:56,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:09:56,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 19:10:03,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:04,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:10:04,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:10:07,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:08,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 19:10:12,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:10:13,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:10:13,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:15,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:10:15,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 19:10:15,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:10:15,731 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 19:10:15,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=461646.6666666667, ans=0.0 2023-09-29 19:10:18,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=461646.6666666667, ans=0.125 2023-09-29 19:10:21,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:23,770 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.853e+02 2.115e+02 2.469e+02 4.470e+02, threshold=4.229e+02, percent-clipped=1.0 2023-09-29 19:10:26,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:10:26,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:10:29,881 INFO [train.py:1039] (3/4) Epoch 14, batch 200, loss[loss=0.1922, simple_loss=0.2723, pruned_loss=0.05605, over 24435.00 frames. ], tot_loss[loss=0.1946, simple_loss=0.2673, pruned_loss=0.0609, over 3004358.65 frames. ], batch size: 69, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:10:30,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 19:10:30,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:10:30,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:31,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=461713.3333333333, ans=0.0 2023-09-29 19:10:34,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 19:10:36,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:10:37,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:38,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:10:40,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.43 vs. limit=22.5 2023-09-29 19:10:43,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:10:43,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:10:43,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:10:55,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=461780.0, ans=0.125 2023-09-29 19:10:58,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=461780.0, ans=0.2 2023-09-29 19:11:02,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:11:03,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:11:04,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:11:05,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:11:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:11:05,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:11:06,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:08,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:11:08,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:09,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:11,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 19:11:12,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 19:11:12,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:16,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:11:23,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:11:33,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:34,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=461980.0, ans=0.1 2023-09-29 19:11:35,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:11:42,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:42,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=461980.0, ans=0.2 2023-09-29 19:11:45,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 19:11:46,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:46,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:11:47,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:11:47,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:11:49,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 19:11:49,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:11:49,535 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 19:11:50,903 INFO [train.py:1039] (3/4) Epoch 14, batch 250, loss[loss=0.1796, simple_loss=0.2655, pruned_loss=0.04688, over 24449.00 frames. ], tot_loss[loss=0.1936, simple_loss=0.2665, pruned_loss=0.06041, over 3401172.62 frames. ], batch size: 69, lr: 7.53e-03, grad_scale: 16.0 2023-09-29 19:11:52,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:54,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:11:56,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:56,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:11:57,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:11:57,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:11:59,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:11:59,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=462046.6666666667, ans=0.1 2023-09-29 19:12:03,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:12:17,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:19,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:12:20,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:12:26,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:12:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:12:28,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:12:28,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:30,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:12:30,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:12:30,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:12:32,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:12:35,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 19:12:35,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:12:36,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:12:36,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:12:36,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:12:38,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:12:38,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:12:38,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:12:42,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:42,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:12:44,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:12:48,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:12:51,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:12:55,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:13:00,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:01,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:13:04,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.06 vs. limit=15.0 2023-09-29 19:13:05,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 19:13:07,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:07,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:13:10,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.958e+02 2.110e+02 2.520e+02 4.183e+02, threshold=4.220e+02, percent-clipped=0.0 2023-09-29 19:13:10,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 19:13:10,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:13:11,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:13:11,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 19:13:13,256 INFO [train.py:1039] (3/4) Epoch 14, batch 300, loss[loss=0.2055, simple_loss=0.2827, pruned_loss=0.06419, over 24061.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2645, pruned_loss=0.06013, over 3672604.28 frames. ], batch size: 80, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:13:17,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=462380.0, ans=6.0 2023-09-29 19:13:19,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:13:19,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:13:22,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:13:24,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 19:13:26,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:13:26,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=462380.0, ans=0.1 2023-09-29 19:13:27,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:13:27,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 19:13:27,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:31,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:13:37,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:13:37,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 19:13:42,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 19:13:43,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:13:47,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:13:47,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 19:13:47,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:13:50,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:13:53,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:13:53,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:13:58,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:13:58,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 19:13:58,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:14:01,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=462513.3333333333, ans=0.0 2023-09-29 19:14:02,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:04,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 19:14:05,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:08,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:14:12,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:14:12,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 19:14:17,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.15 vs. limit=10.0 2023-09-29 19:14:18,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:18,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:14:21,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:22,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:14:22,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 19:14:22,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=462646.6666666667, ans=0.0 2023-09-29 19:14:23,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:14:23,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:14:26,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 19:14:27,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:14:27,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:29,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:29,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:29,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:29,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=462646.6666666667, ans=0.0 2023-09-29 19:14:35,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:37,256 INFO [train.py:1039] (3/4) Epoch 14, batch 350, loss[loss=0.1712, simple_loss=0.2477, pruned_loss=0.04741, over 24339.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2643, pruned_loss=0.05923, over 3918181.97 frames. ], batch size: 61, lr: 7.52e-03, grad_scale: 8.0 2023-09-29 19:14:37,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 19:14:39,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:46,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:14:50,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:14:50,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:53,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 19:14:55,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:14:55,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 19:14:57,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=462780.0, ans=0.2 2023-09-29 19:14:58,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:14:58,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 19:15:00,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:02,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 19:15:03,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:15:04,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.78 vs. limit=15.0 2023-09-29 19:15:07,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:15:07,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:15:09,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:09,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:09,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:10,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:15:13,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:15:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:16,945 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.95 vs. limit=15.0 2023-09-29 19:15:23,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:15:23,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:15:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:15:24,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:27,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=462913.3333333333, ans=0.125 2023-09-29 19:15:30,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 19:15:30,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:15:35,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:15:35,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:35,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:15:38,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 19:15:41,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:43,178 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 19:15:45,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 19:15:45,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=462980.0, ans=0.125 2023-09-29 19:15:46,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:48,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:15:48,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 19:15:49,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:52,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:15:53,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:15:57,093 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.848e+02 2.063e+02 2.317e+02 4.440e+02, threshold=4.125e+02, percent-clipped=1.0 2023-09-29 19:15:57,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:15:57,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:15:58,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:16:00,119 INFO [train.py:1039] (3/4) Epoch 14, batch 400, loss[loss=0.199, simple_loss=0.2699, pruned_loss=0.06408, over 23359.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2647, pruned_loss=0.05923, over 4108204.62 frames. ], batch size: 105, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:16:02,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:16:05,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:16:05,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 19:16:05,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:06,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:08,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:16:08,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:10,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:13,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:17,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 19:16:18,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 19:16:18,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:20,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 19:16:20,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:23,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:16:24,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:24,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 19:16:24,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=463113.3333333333, ans=0.0 2023-09-29 19:16:26,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:16:26,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:16:26,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:16:28,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:16:31,384 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 19:16:31,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 19:16:34,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:16:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:16:36,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=463180.0, ans=0.125 2023-09-29 19:16:37,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 19:16:39,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 19:16:42,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:16:45,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:16:52,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 19:16:55,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:16:57,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 19:16:57,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=463246.6666666667, ans=0.0 2023-09-29 19:16:59,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=463246.6666666667, ans=0.04949747468305833 2023-09-29 19:17:00,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:17:01,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:17:01,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 19:17:05,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:17:07,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:17:08,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:17:11,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:13,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 19:17:14,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:17:15,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 19:17:17,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:17:18,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:17:20,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=463380.0, ans=0.0 2023-09-29 19:17:20,951 INFO [train.py:1039] (3/4) Epoch 14, batch 450, loss[loss=0.19, simple_loss=0.2685, pruned_loss=0.05578, over 24629.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2645, pruned_loss=0.05864, over 4243311.02 frames. ], batch size: 68, lr: 7.52e-03, grad_scale: 16.0 2023-09-29 19:17:21,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 19:17:25,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:17:25,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:17:25,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:17:26,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 19:17:26,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:17:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:17:28,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:17:28,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 19:17:29,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:17:30,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:17:31,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:17:35,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=463380.0, ans=0.125 2023-09-29 19:17:42,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:17:45,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 19:17:46,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 19:17:48,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:17:50,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:17:52,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:17:57,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:17:57,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:18:00,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 19:18:00,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 19:18:01,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 19:18:01,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:03,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:04,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:18:07,123 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 19:18:07,137 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 19:18:07,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:18:10,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:18:10,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:18:15,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:18:15,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:18:16,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:18:18,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 19:18:21,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:24,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:18:24,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:18:25,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 19:18:27,575 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:18:30,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:18:32,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 19:18:32,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 19:18:34,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:18:39,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:18:40,721 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.869e+02 2.102e+02 2.617e+02 3.390e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-29 19:18:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:18:43,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:18:43,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 19:18:44,442 INFO [train.py:1039] (3/4) Epoch 14, batch 500, loss[loss=0.1775, simple_loss=0.2497, pruned_loss=0.05266, over 24566.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2654, pruned_loss=0.05918, over 4344347.16 frames. ], batch size: 60, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:18:45,395 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.84 vs. limit=12.0 2023-09-29 19:18:46,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:18:47,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=463713.3333333333, ans=0.125 2023-09-29 19:18:48,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:18:48,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:48,465 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 19:18:51,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 19:18:51,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:18:55,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=463713.3333333333, ans=15.0 2023-09-29 19:18:55,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:18:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 19:19:00,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:19:03,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:19:03,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:19:05,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:16,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:19:16,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:19:18,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:18,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 19:19:18,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:19:22,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:19:23,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:19:23,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:19:25,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:19:26,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 19:19:29,661 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 19:19:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:34,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:34,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:36,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:19:39,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 19:19:42,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:19:44,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:47,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:19:50,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:19:52,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=463980.0, ans=0.125 2023-09-29 19:19:57,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:19:59,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 19:19:59,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:19:59,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:20:04,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 19:20:05,350 INFO [train.py:1039] (3/4) Epoch 14, batch 550, loss[loss=0.1876, simple_loss=0.2676, pruned_loss=0.05378, over 23981.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2656, pruned_loss=0.05877, over 4434052.74 frames. ], batch size: 86, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:20:05,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:20:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:10,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=464046.6666666667, ans=0.125 2023-09-29 19:20:13,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 19:20:14,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 19:20:14,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:14,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 19:20:15,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:20:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:20:17,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:17,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=464046.6666666667, ans=0.125 2023-09-29 19:20:19,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:19,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:20:20,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:20:22,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:20:23,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 19:20:23,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:20:28,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:28,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:30,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:20:30,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:30,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=464113.3333333333, ans=0.09899494936611666 2023-09-29 19:20:37,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 19:20:38,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 19:20:40,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:20:45,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:20:45,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:46,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:20:50,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:50,481 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 19:20:50,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=464180.0, ans=0.1 2023-09-29 19:20:50,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer_na.min_abs, batch_count=464180.0, ans=0.02 2023-09-29 19:20:52,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:20:52,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:20:55,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:20:57,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:20:57,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:20:57,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:20:58,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 19:21:00,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 19:21:01,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:01,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:21:01,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:01,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:21:05,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:21:07,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:21:10,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:21:10,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:12,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 19:21:13,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:21:15,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:15,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:21:16,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:17,872 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.55 vs. limit=6.0 2023-09-29 19:21:18,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:21:18,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 19:21:24,674 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.62 vs. limit=6.0 2023-09-29 19:21:25,257 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.884e+02 2.077e+02 2.403e+02 3.738e+02, threshold=4.154e+02, percent-clipped=0.0 2023-09-29 19:21:25,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 19:21:28,500 INFO [train.py:1039] (3/4) Epoch 14, batch 600, loss[loss=0.186, simple_loss=0.2696, pruned_loss=0.05117, over 24618.00 frames. ], tot_loss[loss=0.1933, simple_loss=0.2668, pruned_loss=0.05989, over 4491898.57 frames. ], batch size: 68, lr: 7.51e-03, grad_scale: 16.0 2023-09-29 19:21:30,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 19:21:31,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:21:31,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:21:31,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:21:40,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:21:40,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:21:42,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 19:21:44,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=464446.6666666667, ans=0.0 2023-09-29 19:21:44,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=464446.6666666667, ans=0.125 2023-09-29 19:21:45,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:21:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:21:49,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:21:51,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 19:21:51,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:21:58,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 19:21:59,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=464446.6666666667, ans=0.125 2023-09-29 19:22:01,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:22:01,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:22:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:22:08,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:22:10,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:16,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:22:20,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=464580.0, ans=0.0 2023-09-29 19:22:21,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:22:21,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:22:21,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:22:32,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 19:22:35,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464646.6666666667, ans=0.1 2023-09-29 19:22:37,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:22:37,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:22:40,131 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.19 vs. limit=12.0 2023-09-29 19:22:42,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 19:22:42,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:22:42,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=464646.6666666667, ans=0.125 2023-09-29 19:22:43,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=464646.6666666667, ans=0.1 2023-09-29 19:22:44,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 19:22:44,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:22:46,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:22:50,975 INFO [train.py:1039] (3/4) Epoch 14, batch 650, loss[loss=0.181, simple_loss=0.2627, pruned_loss=0.04968, over 24626.00 frames. ], tot_loss[loss=0.1931, simple_loss=0.2659, pruned_loss=0.06018, over 4540937.49 frames. ], batch size: 68, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:22:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:22:53,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:22:56,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:22:56,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:22:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:22:58,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 19:23:00,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:23:06,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:23:06,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:10,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:13,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 19:23:15,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:15,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:20,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:20,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:23:23,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:23,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:23,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:23:25,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:26,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:23:29,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:23:29,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 19:23:29,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:23:29,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:33,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:34,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:34,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:23:34,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:23:36,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 19:23:36,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:23:38,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:23:39,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:23:39,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:23:41,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:23:41,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 19:23:44,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 19:23:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:44,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:23:44,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:23:44,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:23:48,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:23:54,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:23:54,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:23:55,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.29 vs. limit=12.0 2023-09-29 19:23:56,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:24:00,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:00,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:24:02,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:24:09,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:24:09,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:09,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:10,964 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.710e+02 2.056e+02 2.592e+02 3.186e+02 5.109e+02, threshold=5.184e+02, percent-clipped=6.0 2023-09-29 19:24:11,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:24:14,035 INFO [train.py:1039] (3/4) Epoch 14, batch 700, loss[loss=0.1969, simple_loss=0.2783, pruned_loss=0.05775, over 24374.00 frames. ], tot_loss[loss=0.192, simple_loss=0.2647, pruned_loss=0.05967, over 4586505.55 frames. ], batch size: 77, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:24:17,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 19:24:17,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 19:24:20,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 19:24:20,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=465046.6666666667, ans=0.125 2023-09-29 19:24:22,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:23,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:24:25,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 19:24:30,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:24:33,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:24:34,167 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=465113.3333333333, ans=0.125 2023-09-29 19:24:35,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:38,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:24:38,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:24:40,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:24:43,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:24:43,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:24:47,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 19:24:50,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 19:24:51,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=465180.0, ans=0.0 2023-09-29 19:24:53,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:24:53,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:24:55,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:24:55,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=465180.0, ans=0.0 2023-09-29 19:25:00,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:25:00,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 19:25:06,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:06,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:25:06,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 19:25:10,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:25:11,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=465246.6666666667, ans=0.0 2023-09-29 19:25:12,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:14,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=465246.6666666667, ans=0.1 2023-09-29 19:25:17,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:25:22,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=465313.3333333333, ans=0.125 2023-09-29 19:25:24,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:25:24,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 19:25:26,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=465313.3333333333, ans=0.125 2023-09-29 19:25:27,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 19:25:27,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 19:25:27,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=465313.3333333333, ans=0.2 2023-09-29 19:25:30,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:32,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:32,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:25:34,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:34,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 19:25:37,465 INFO [train.py:1039] (3/4) Epoch 14, batch 750, loss[loss=0.2136, simple_loss=0.2927, pruned_loss=0.0673, over 23353.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2632, pruned_loss=0.05869, over 4611663.01 frames. ], batch size: 93, lr: 7.50e-03, grad_scale: 8.0 2023-09-29 19:25:37,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=465380.0, ans=0.09899494936611666 2023-09-29 19:25:39,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 19:25:39,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 19:25:39,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 19:25:39,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.48 vs. limit=12.0 2023-09-29 19:25:42,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 19:25:42,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 19:25:42,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:25:44,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 19:25:45,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:25:45,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:25:47,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:50,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:25:50,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:25:52,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:25:52,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:25:56,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:25:57,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:25:59,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:25:59,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 19:26:00,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:26:02,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:02,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=465446.6666666667, ans=0.1 2023-09-29 19:26:04,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:26:05,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:26:05,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 19:26:05,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:09,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 19:26:09,647 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 19:26:11,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 19:26:11,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:26:12,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 19:26:14,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:26:21,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:26:21,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:21,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:26:24,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:26:26,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:26:26,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 19:26:28,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:26:28,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 19:26:29,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:26:31,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:26:32,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 19:26:33,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:33,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=465580.0, ans=0.2 2023-09-29 19:26:36,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=465580.0, ans=0.0 2023-09-29 19:26:37,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=465580.0, ans=0.1 2023-09-29 19:26:39,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:26:40,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:26:42,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:26:44,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:26:47,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 19:26:47,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:26:49,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:51,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:26:52,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:26:54,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:26:54,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:26:59,100 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 2.067e+02 2.400e+02 2.919e+02 4.074e+02, threshold=4.801e+02, percent-clipped=0.0 2023-09-29 19:27:01,188 INFO [train.py:1039] (3/4) Epoch 14, batch 800, loss[loss=0.278, simple_loss=0.3212, pruned_loss=0.1175, over 19315.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2642, pruned_loss=0.05904, over 4634870.06 frames. ], batch size: 388, lr: 7.50e-03, grad_scale: 16.0 2023-09-29 19:27:02,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:02,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:04,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:27:04,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:05,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:06,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:07,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:10,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:12,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:27:16,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 19:27:17,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:18,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:27:18,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:27:18,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:21,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 19:27:21,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:21,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 19:27:24,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:25,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:27:28,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:27:28,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:27:32,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:32,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:27:35,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:27:37,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:27:37,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 19:27:38,854 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 19:27:40,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 19:27:40,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:27:40,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:27:43,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:27:43,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:27:46,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 19:27:46,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 19:27:48,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:27:50,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:27:54,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:27:57,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:00,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 19:28:00,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:28:03,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 19:28:06,362 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=11.84 vs. limit=15.0 2023-09-29 19:28:10,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:14,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:28:14,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 19:28:16,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:28:17,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:17,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 19:28:19,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:19,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=465980.0, ans=0.125 2023-09-29 19:28:20,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:28:20,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:22,346 INFO [train.py:1039] (3/4) Epoch 14, batch 850, loss[loss=0.1905, simple_loss=0.2649, pruned_loss=0.05801, over 23338.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2641, pruned_loss=0.05883, over 4668184.32 frames. ], batch size: 93, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:28:22,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:28:23,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:28:26,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 19:28:26,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 19:28:26,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 19:28:27,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:28:27,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:28:30,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:28:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:28:30,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:28:36,389 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.35 vs. limit=6.0 2023-09-29 19:28:37,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:37,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:28:39,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 19:28:43,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 19:28:46,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:28:47,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 19:28:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 19:28:52,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 19:28:55,749 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 19:28:55,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:28:55,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:28:55,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:28:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:01,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 19:29:03,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:29:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:06,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:29:06,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:29:08,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=466180.0, ans=0.2 2023-09-29 19:29:09,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:29:10,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 19:29:10,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 19:29:11,225 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:29:16,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:29:16,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:16,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:29:17,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:19,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:29:22,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:29:24,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:29:27,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:29:27,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:28,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:29:38,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:29:39,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:29:39,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 19:29:39,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:40,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:29:42,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 19:29:43,736 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.955e+02 2.156e+02 2.451e+02 4.149e+02, threshold=4.312e+02, percent-clipped=0.0 2023-09-29 19:29:45,196 INFO [train.py:1039] (3/4) Epoch 14, batch 900, loss[loss=0.2266, simple_loss=0.2854, pruned_loss=0.08389, over 22648.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2651, pruned_loss=0.06004, over 4676214.29 frames. ], batch size: 322, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:29:46,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:29:48,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:29:50,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 19:29:53,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:29:53,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 19:29:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 19:29:57,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:29:57,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:29:57,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:29:59,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:30:12,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:12,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:30:12,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:30:17,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:22,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 19:30:25,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:30:25,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=466513.3333333333, ans=0.0 2023-09-29 19:30:27,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-09-29 19:30:30,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:30:30,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=466513.3333333333, ans=0.0 2023-09-29 19:30:31,207 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.98 vs. limit=15.0 2023-09-29 19:30:31,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:30:33,392 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 19:30:34,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 19:30:41,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:30:41,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:30:42,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:30:49,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:49,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:30:53,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 19:30:53,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:30:56,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 19:30:57,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:30:57,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:30:59,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:30:59,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:04,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 19:31:04,499 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 19:31:06,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:31:06,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 19:31:08,191 INFO [train.py:1039] (3/4) Epoch 14, batch 950, loss[loss=0.247, simple_loss=0.307, pruned_loss=0.0935, over 19710.00 frames. ], tot_loss[loss=0.1928, simple_loss=0.265, pruned_loss=0.06034, over 4684283.80 frames. ], batch size: 388, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:31:08,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:12,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 19:31:18,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:19,963 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.71 vs. limit=10.0 2023-09-29 19:31:20,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:22,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:31:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 19:31:31,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:31:33,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:33,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:34,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:31:34,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 19:31:34,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:31:36,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:38,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 19:31:39,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:39,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=466846.6666666667, ans=0.0 2023-09-29 19:31:42,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=466846.6666666667, ans=0.1 2023-09-29 19:31:44,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:31:44,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:31:44,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:31:44,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 19:31:45,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=466846.6666666667, ans=0.125 2023-09-29 19:31:46,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 19:31:46,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:31:50,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:31:54,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=466846.6666666667, ans=0.0 2023-09-29 19:31:56,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:31:56,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:31:58,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 19:32:02,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:32:02,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:32:03,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:04,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:04,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:32:10,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 19:32:10,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:32:10,504 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=466913.3333333333, ans=0.125 2023-09-29 19:32:11,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:13,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:13,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 19:32:13,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:13,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:32:14,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 19:32:18,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:32:20,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=466980.0, ans=0.125 2023-09-29 19:32:23,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:32:28,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:30,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.850e+02 2.095e+02 2.342e+02 3.294e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-29 19:32:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 19:32:30,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 19:32:31,857 INFO [train.py:1039] (3/4) Epoch 14, batch 1000, loss[loss=0.1767, simple_loss=0.2385, pruned_loss=0.05745, over 23562.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.264, pruned_loss=0.06029, over 4677225.63 frames. ], batch size: 256, lr: 7.49e-03, grad_scale: 16.0 2023-09-29 19:32:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:32:33,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=467046.6666666667, ans=0.1 2023-09-29 19:32:37,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 19:32:37,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:32:42,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=467046.6666666667, ans=0.125 2023-09-29 19:32:43,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:32:45,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 19:32:45,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 19:32:45,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=467046.6666666667, ans=0.125 2023-09-29 19:32:47,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=13.17 vs. limit=15.0 2023-09-29 19:32:49,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:32:49,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:32:51,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:32:54,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 19:32:59,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 19:33:00,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 19:33:02,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:05,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 19:33:06,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 19:33:06,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 19:33:08,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:10,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:18,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:18,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:33:18,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:18,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=467180.0, ans=0.0 2023-09-29 19:33:18,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=467180.0, ans=0.0 2023-09-29 19:33:19,012 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.42 vs. limit=10.0 2023-09-29 19:33:19,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:19,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 19:33:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:21,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:33:21,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:33:22,934 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 19:33:24,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 19:33:26,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 19:33:29,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 19:33:33,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:33:38,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:38,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:33:38,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:33:39,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:33:43,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 19:33:43,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:33:43,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=467313.3333333333, ans=0.04949747468305833 2023-09-29 19:33:44,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 19:33:45,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 19:33:45,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=467313.3333333333, ans=0.2 2023-09-29 19:33:46,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:33:46,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:33:48,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:33:51,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:33:51,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:33:54,424 INFO [train.py:1039] (3/4) Epoch 14, batch 1050, loss[loss=0.1861, simple_loss=0.2532, pruned_loss=0.05948, over 23297.00 frames. ], tot_loss[loss=0.1914, simple_loss=0.2628, pruned_loss=0.05997, over 4682959.20 frames. ], batch size: 105, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:33:56,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:33:56,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:33:59,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:34:01,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:04,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:06,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:34:08,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:34:08,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=467380.0, ans=0.0 2023-09-29 19:34:10,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:34:10,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:34:10,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:34:10,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=467446.6666666667, ans=0.125 2023-09-29 19:34:12,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:34:12,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 19:34:13,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:15,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 19:34:17,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:34:17,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 19:34:17,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:34:25,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:34:25,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:34:27,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:34:27,796 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=467513.3333333333, ans=0.125 2023-09-29 19:34:29,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 19:34:29,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 19:34:29,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=467513.3333333333, ans=0.0 2023-09-29 19:34:30,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:34:32,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 19:34:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 19:34:36,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:34:38,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 19:34:42,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:34:44,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:34:45,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:34:47,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:34:51,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 19:34:53,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 19:34:53,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 19:34:54,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:34:55,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:34:56,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 19:35:01,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:35:02,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:35:02,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:02,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:04,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:06,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=467646.6666666667, ans=0.125 2023-09-29 19:35:08,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:35:08,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 19:35:09,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:35:09,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 19:35:09,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 19:35:11,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:35:15,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:35:17,479 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.954e+02 2.210e+02 2.556e+02 3.990e+02, threshold=4.421e+02, percent-clipped=0.0 2023-09-29 19:35:19,001 INFO [train.py:1039] (3/4) Epoch 14, batch 1100, loss[loss=0.1773, simple_loss=0.2605, pruned_loss=0.04705, over 23976.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2628, pruned_loss=0.05929, over 4693996.03 frames. ], batch size: 80, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:35:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:35:26,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:35:28,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:35:28,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:28,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 19:35:30,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:35:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 19:35:32,334 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:35:33,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:35:33,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=467780.0, ans=0.2 2023-09-29 19:35:37,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:35:37,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=467780.0, ans=0.125 2023-09-29 19:35:38,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 19:35:38,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:35:40,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:35:40,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:35:43,750 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.02 vs. limit=10.0 2023-09-29 19:35:44,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:35:46,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 19:35:52,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:35:52,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=467846.6666666667, ans=0.125 2023-09-29 19:35:54,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 19:35:55,758 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 19:35:57,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:00,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:36:01,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:36:03,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 19:36:03,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:36:03,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:36:03,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:36:04,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 19:36:12,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:36:12,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 19:36:14,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:36:19,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:36:22,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 19:36:22,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 19:36:24,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:36:28,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:28,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:29,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 19:36:29,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:36:31,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:36:31,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 19:36:31,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:36:31,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 19:36:33,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.21 vs. limit=15.0 2023-09-29 19:36:34,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:36:34,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:36:34,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=467980.0, ans=0.2 2023-09-29 19:36:35,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:36:40,200 INFO [train.py:1039] (3/4) Epoch 14, batch 1150, loss[loss=0.198, simple_loss=0.2672, pruned_loss=0.06439, over 23700.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2636, pruned_loss=0.05938, over 4702089.41 frames. ], batch size: 232, lr: 7.48e-03, grad_scale: 16.0 2023-09-29 19:36:41,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:45,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:36:46,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:36:48,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:36:48,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 19:36:48,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:36:50,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=468046.6666666667, ans=0.125 2023-09-29 19:36:51,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 19:36:52,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:36:52,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:36:58,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 19:36:59,053 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.78 vs. limit=22.5 2023-09-29 19:37:02,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:37:07,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:08,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 19:37:08,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:37:08,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:37:11,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 19:37:11,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:37:13,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:37:23,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:23,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer_ff2.min_abs, batch_count=468180.0, ans=0.1 2023-09-29 19:37:28,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=468246.6666666667, ans=0.2 2023-09-29 19:37:31,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:37:31,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 19:37:31,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:33,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:37,566 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 19:37:39,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:37:46,576 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 19:37:49,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:37:50,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=468313.3333333333, ans=0.1 2023-09-29 19:37:51,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:37:51,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:37:51,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:37:56,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:01,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.871e+02 2.319e+02 2.937e+02 5.340e+02, threshold=4.639e+02, percent-clipped=1.0 2023-09-29 19:38:01,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:38:01,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:38:03,176 INFO [train.py:1039] (3/4) Epoch 14, batch 1200, loss[loss=0.1941, simple_loss=0.2582, pruned_loss=0.06497, over 23721.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2639, pruned_loss=0.05976, over 4699629.11 frames. ], batch size: 212, lr: 7.48e-03, grad_scale: 32.0 2023-09-29 19:38:03,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:03,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:04,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:38:08,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:38:10,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:38:11,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:38:13,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:15,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 19:38:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 19:38:16,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=468380.0, ans=0.125 2023-09-29 19:38:16,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=468380.0, ans=0.5 2023-09-29 19:38:21,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:38:22,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:38:26,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:27,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:38:27,937 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 19:38:28,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:37,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 19:38:37,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:38:37,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 19:38:39,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:38:44,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 19:38:47,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 19:38:47,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:38:48,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=468513.3333333333, ans=0.05 2023-09-29 19:38:49,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:38:49,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:38:50,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:38:53,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:38:53,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:38:53,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:38:55,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 19:38:56,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:38:56,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:38:56,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:39:00,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:00,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:02,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:39:03,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:39:06,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 19:39:10,653 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 19:39:14,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:17,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:39:18,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:39:21,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:39:23,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 19:39:25,443 INFO [train.py:1039] (3/4) Epoch 14, batch 1250, loss[loss=0.1932, simple_loss=0.2696, pruned_loss=0.05841, over 23351.00 frames. ], tot_loss[loss=0.1925, simple_loss=0.265, pruned_loss=0.06001, over 4703429.98 frames. ], batch size: 93, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:39:27,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=468713.3333333333, ans=0.0 2023-09-29 19:39:28,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:39:29,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:30,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 19:39:33,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:39:34,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:39:39,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 19:39:40,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:39:41,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:39:41,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:44,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.16 vs. limit=6.0 2023-09-29 19:39:44,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:39:48,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 19:39:49,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:39:49,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:39:50,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:39:50,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:39:51,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=468780.0, ans=0.125 2023-09-29 19:39:54,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.48 vs. limit=15.0 2023-09-29 19:39:54,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:39:56,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:40:00,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.06 vs. limit=22.5 2023-09-29 19:40:01,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 19:40:02,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:40:05,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:06,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 19:40:06,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:40:07,629 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 19:40:07,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:07,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:12,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:15,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:40:17,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:40:19,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 19:40:19,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 19:40:19,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 19:40:22,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:24,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 19:40:24,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:29,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:40:29,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:40:32,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 19:40:32,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 19:40:34,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:40:34,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 19:40:34,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:40:37,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 19:40:38,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:40,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:40:40,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:40:45,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 19:40:46,416 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.815e+02 2.067e+02 2.359e+02 2.982e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-29 19:40:46,462 INFO [train.py:1039] (3/4) Epoch 14, batch 1300, loss[loss=0.1788, simple_loss=0.2519, pruned_loss=0.0528, over 23453.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2651, pruned_loss=0.06005, over 4716631.31 frames. ], batch size: 93, lr: 7.47e-03, grad_scale: 16.0 2023-09-29 19:40:48,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:40:48,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 19:40:52,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=469046.6666666667, ans=0.1 2023-09-29 19:40:54,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:40:56,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 19:40:58,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:40:58,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:40:58,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=469046.6666666667, ans=0.125 2023-09-29 19:41:00,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:41:01,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 19:41:06,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:41:08,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:41:09,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 19:41:14,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:41:18,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:19,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:21,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.06 vs. limit=15.0 2023-09-29 19:41:22,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:41:24,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:41:24,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:41:25,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 19:41:27,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 19:41:30,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:41:32,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:41:34,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 19:41:34,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=469246.6666666667, ans=0.0 2023-09-29 19:41:35,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 19:41:37,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:41:40,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:41:40,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 19:41:40,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:40,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 19:41:43,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:41:47,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:41:47,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:41:49,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.16 vs. limit=10.0 2023-09-29 19:41:50,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 19:41:50,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 19:41:52,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 19:41:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:42:00,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 19:42:02,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:02,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=469313.3333333333, ans=0.04949747468305833 2023-09-29 19:42:08,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:09,251 INFO [train.py:1039] (3/4) Epoch 14, batch 1350, loss[loss=0.1834, simple_loss=0.2296, pruned_loss=0.06859, over 19624.00 frames. ], tot_loss[loss=0.1919, simple_loss=0.2638, pruned_loss=0.05998, over 4714677.22 frames. ], batch size: 389, lr: 7.47e-03, grad_scale: 8.0 2023-09-29 19:42:09,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=469380.0, ans=0.2 2023-09-29 19:42:11,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 19:42:14,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:17,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:17,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:18,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:18,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=469380.0, ans=0.125 2023-09-29 19:42:20,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:42:20,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:42:22,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:42:24,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:27,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:42:29,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 19:42:30,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:42:30,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:42:31,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-09-29 19:42:33,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.99 vs. limit=15.0 2023-09-29 19:42:35,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 19:42:35,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:42:36,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:42:36,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 19:42:37,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=469446.6666666667, ans=0.125 2023-09-29 19:42:38,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=469446.6666666667, ans=0.1 2023-09-29 19:42:39,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 19:42:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 19:42:43,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:42:43,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 19:42:45,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=469513.3333333333, ans=0.125 2023-09-29 19:42:56,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:43:06,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:43:06,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:06,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 19:43:11,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:11,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 19:43:12,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 19:43:12,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:43:14,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:43:17,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 19:43:19,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:43:21,433 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=469646.6666666667, ans=0.2 2023-09-29 19:43:24,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 19:43:25,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 19:43:31,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 19:43:32,804 INFO [train.py:1039] (3/4) Epoch 14, batch 1400, loss[loss=0.1831, simple_loss=0.2569, pruned_loss=0.0546, over 21496.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2622, pruned_loss=0.0595, over 4716377.51 frames. ], batch size: 47, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:43:32,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:43:34,254 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.861e+02 2.134e+02 2.363e+02 3.336e+02, threshold=4.269e+02, percent-clipped=0.0 2023-09-29 19:43:34,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=469713.3333333333, ans=0.125 2023-09-29 19:43:36,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:43:36,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=469713.3333333333, ans=0.025 2023-09-29 19:43:38,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:43:40,529 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.75 vs. limit=15.0 2023-09-29 19:43:43,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 19:43:44,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 19:43:48,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=469780.0, ans=0.125 2023-09-29 19:43:49,901 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=469780.0, ans=0.5 2023-09-29 19:43:50,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.01 vs. limit=15.0 2023-09-29 19:43:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:43:56,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:43:58,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:43:58,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 19:44:02,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=469780.0, ans=0.125 2023-09-29 19:44:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:44:05,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 19:44:16,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:16,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:18,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=469846.6666666667, ans=0.125 2023-09-29 19:44:21,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 19:44:21,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:44:21,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:44:22,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:44:23,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=469913.3333333333, ans=0.125 2023-09-29 19:44:24,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:44:24,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:44:25,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:44:25,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:44:27,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 19:44:27,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:44:31,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:34,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:44:41,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 19:44:42,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 19:44:43,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.84 vs. limit=15.0 2023-09-29 19:44:44,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:44:46,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 19:44:48,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:44:49,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:44:52,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:44:53,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:44:53,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:44:54,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 19:44:56,256 INFO [train.py:1039] (3/4) Epoch 14, batch 1450, loss[loss=0.1802, simple_loss=0.2481, pruned_loss=0.05614, over 23457.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2616, pruned_loss=0.05942, over 4712190.16 frames. ], batch size: 285, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:44:59,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:00,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 19:45:01,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:45:01,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 19:45:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:45:02,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=470046.6666666667, ans=0.2 2023-09-29 19:45:05,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 19:45:05,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:07,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:07,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 19:45:10,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:10,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:45:12,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 19:45:12,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:13,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:45:15,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:16,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:17,952 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-09-29 19:45:22,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:45:22,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:45:24,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:45:24,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:25,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:45:27,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:45:27,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:45:28,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:31,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 19:45:34,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:45:37,560 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 19:45:40,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:45:41,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:45:43,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:44,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 19:45:45,865 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.96 vs. limit=15.0 2023-09-29 19:45:50,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:45:51,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 19:45:51,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 19:45:54,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:45:54,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=470246.6666666667, ans=0.1 2023-09-29 19:45:57,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:45:59,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:00,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 19:46:04,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.76 vs. limit=5.0 2023-09-29 19:46:04,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 19:46:05,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 19:46:07,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:09,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 19:46:10,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:14,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=470313.3333333333, ans=0.125 2023-09-29 19:46:18,288 INFO [train.py:1039] (3/4) Epoch 14, batch 1500, loss[loss=0.1898, simple_loss=0.2605, pruned_loss=0.05949, over 23594.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2616, pruned_loss=0.05837, over 4724534.11 frames. ], batch size: 149, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:46:19,676 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.877e+02 2.089e+02 2.456e+02 3.474e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 19:46:21,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 19:46:21,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:46:21,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:46:22,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:23,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:25,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:46:26,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 19:46:28,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 19:46:28,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:46:28,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:46:30,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:46:30,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:46:32,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:36,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.58 vs. limit=12.0 2023-09-29 19:46:38,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:46:39,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 19:46:39,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:46:39,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:46:41,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:44,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 19:46:48,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 19:46:50,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:46:51,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 19:46:53,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:46:56,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:46:57,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:46:57,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:46:59,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 19:47:00,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:47:00,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:01,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 19:47:01,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:47:07,444 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.69 vs. limit=15.0 2023-09-29 19:47:07,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.02 vs. limit=6.0 2023-09-29 19:47:08,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:47:08,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 19:47:08,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=470580.0, ans=0.125 2023-09-29 19:47:15,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 19:47:16,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:47:21,562 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 19:47:22,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:22,955 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 19:47:23,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:25,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:47:26,046 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 19:47:27,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 19:47:29,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 19:47:31,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:32,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:32,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:34,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:47:34,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:47:36,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:47:37,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 19:47:39,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 19:47:39,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:47:40,998 INFO [train.py:1039] (3/4) Epoch 14, batch 1550, loss[loss=0.1795, simple_loss=0.2574, pruned_loss=0.05082, over 24462.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2633, pruned_loss=0.05913, over 4723260.99 frames. ], batch size: 66, lr: 7.46e-03, grad_scale: 8.0 2023-09-29 19:47:41,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 19:47:42,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 19:47:44,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:47:46,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:46,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:47:46,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:47:46,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=470713.3333333333, ans=0.1 2023-09-29 19:47:48,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:50,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:47:53,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 19:47:53,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:47:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:47:54,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 19:47:56,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=470780.0, ans=0.2 2023-09-29 19:47:58,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:47:58,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 19:48:00,059 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.18 vs. limit=22.5 2023-09-29 19:48:00,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:48:00,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 19:48:02,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 19:48:02,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 19:48:02,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:04,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:08,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:48:10,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 19:48:10,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 19:48:15,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=470846.6666666667, ans=0.015 2023-09-29 19:48:19,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=470846.6666666667, ans=0.1 2023-09-29 19:48:21,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:26,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:48:26,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 19:48:26,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:48:27,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 19:48:27,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=470846.6666666667, ans=0.0 2023-09-29 19:48:30,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 19:48:33,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:35,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:48:38,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:48:38,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:48:39,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 19:48:39,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:48:41,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:48:41,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=470913.3333333333, ans=0.0 2023-09-29 19:48:42,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 19:48:44,357 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 19:48:47,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:48:51,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 19:48:57,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:48:57,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:48:58,512 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.11 vs. limit=22.5 2023-09-29 19:48:58,672 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.98 vs. limit=15.0 2023-09-29 19:48:59,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 19:49:01,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:49:02,597 INFO [train.py:1039] (3/4) Epoch 14, batch 1600, loss[loss=0.2052, simple_loss=0.2768, pruned_loss=0.06678, over 23338.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2647, pruned_loss=0.06021, over 4718455.77 frames. ], batch size: 93, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:49:02,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:02,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:49:02,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:49:02,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:49:04,177 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.865e+02 2.125e+02 2.416e+02 3.474e+02, threshold=4.250e+02, percent-clipped=0.0 2023-09-29 19:49:05,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:07,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 19:49:08,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 19:49:10,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 19:49:10,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=471046.6666666667, ans=0.125 2023-09-29 19:49:12,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:12,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-09-29 19:49:13,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 19:49:13,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:49:16,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:49:24,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:49:27,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 19:49:29,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=471113.3333333333, ans=0.1 2023-09-29 19:49:30,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:49:32,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 19:49:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:32,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 19:49:34,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=471180.0, ans=0.0 2023-09-29 19:49:37,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 19:49:44,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:45,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 19:49:45,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:49:46,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:49:46,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:49:48,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 19:49:55,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 19:49:56,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:49:58,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:49:58,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:00,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 19:50:01,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:50:03,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:50:06,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:50:08,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=471313.3333333333, ans=0.125 2023-09-29 19:50:11,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:12,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:50:16,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 19:50:16,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:50:17,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 19:50:22,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=471380.0, ans=0.0 2023-09-29 19:50:23,427 INFO [train.py:1039] (3/4) Epoch 14, batch 1650, loss[loss=0.1964, simple_loss=0.2715, pruned_loss=0.06065, over 23329.00 frames. ], tot_loss[loss=0.1939, simple_loss=0.2661, pruned_loss=0.06082, over 4710256.67 frames. ], batch size: 93, lr: 7.45e-03, grad_scale: 16.0 2023-09-29 19:50:23,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:25,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:50:26,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:50:26,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 19:50:26,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 19:50:26,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 19:50:28,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 19:50:30,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:50:32,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:32,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:50:32,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:50:33,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:50:37,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 19:50:40,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:50:40,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:50:40,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:50:40,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:50:42,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 19:50:42,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 19:50:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:50:49,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 19:50:56,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 19:50:58,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:02,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 19:51:02,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=471513.3333333333, ans=0.125 2023-09-29 19:51:03,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:07,260 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 19:51:08,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:51:10,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:51:10,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:11,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:51:11,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=471580.0, ans=0.0 2023-09-29 19:51:13,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:14,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:16,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:16,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:16,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:18,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:19,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 19:51:19,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=471580.0, ans=0.125 2023-09-29 19:51:24,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 19:51:25,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 19:51:25,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:51:27,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 19:51:28,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 19:51:28,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 19:51:28,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:51:28,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:51:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:30,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:51:30,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 19:51:32,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:51:34,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:51:36,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:41,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 19:51:44,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:51:44,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:51:44,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 19:51:46,308 INFO [train.py:1039] (3/4) Epoch 14, batch 1700, loss[loss=0.1966, simple_loss=0.2814, pruned_loss=0.05593, over 24461.00 frames. ], tot_loss[loss=0.193, simple_loss=0.2651, pruned_loss=0.06048, over 4706409.20 frames. ], batch size: 69, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:51:46,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:51:46,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:51:46,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:51:49,270 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.869e+02 2.042e+02 2.278e+02 4.402e+02, threshold=4.084e+02, percent-clipped=1.0 2023-09-29 19:51:49,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:51:49,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:51:49,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 19:51:54,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 19:52:04,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:52:06,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:52:11,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:52:13,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:52:14,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:17,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 19:52:18,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:52:18,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:18,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=471846.6666666667, ans=0.0 2023-09-29 19:52:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 19:52:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 19:52:23,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 19:52:24,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 19:52:26,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=471846.6666666667, ans=0.125 2023-09-29 19:52:28,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:29,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 19:52:31,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:52:39,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:40,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:40,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:52:44,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 19:52:44,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 19:52:44,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:52:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:47,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 19:52:49,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:52:49,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:49,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:52:49,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:52:50,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:52:50,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:52:51,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=471980.0, ans=0.125 2023-09-29 19:52:53,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:52:53,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:52:54,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:52:57,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:00,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 19:53:01,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:03,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:04,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 19:53:08,008 INFO [train.py:1039] (3/4) Epoch 14, batch 1750, loss[loss=0.1944, simple_loss=0.2515, pruned_loss=0.06865, over 23408.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2635, pruned_loss=0.05982, over 4702149.15 frames. ], batch size: 285, lr: 7.45e-03, grad_scale: 8.0 2023-09-29 19:53:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:14,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:15,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 19:53:15,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 19:53:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:53:19,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:53:19,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:24,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 19:53:26,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:53:28,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 19:53:28,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:53:30,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-09-29 19:53:31,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:53:34,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:53:36,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 19:53:38,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:53:38,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 19:53:46,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:53:51,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:53:51,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:53,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:54,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:53:54,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=472180.0, ans=0.1 2023-09-29 19:53:56,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:53:56,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:53:59,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:53:59,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:53:59,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=472246.6666666667, ans=0.0 2023-09-29 19:54:01,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 19:54:05,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 19:54:06,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 19:54:06,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:08,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:09,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:54:14,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:54:15,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 19:54:15,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:18,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:54:23,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:54:25,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:27,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:54:27,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 19:54:27,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:28,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 19:54:28,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 19:54:29,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:54:29,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=472380.0, ans=0.95 2023-09-29 19:54:31,058 INFO [train.py:1039] (3/4) Epoch 14, batch 1800, loss[loss=0.1904, simple_loss=0.2744, pruned_loss=0.05324, over 24614.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2632, pruned_loss=0.05924, over 4711562.02 frames. ], batch size: 68, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:54:31,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:54:34,554 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.007e+02 2.286e+02 2.732e+02 4.452e+02, threshold=4.572e+02, percent-clipped=3.0 2023-09-29 19:54:34,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 19:54:34,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:54:36,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 19:54:41,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:54:42,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 19:54:44,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:54:46,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=472446.6666666667, ans=0.125 2023-09-29 19:54:47,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:54:50,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:51,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:54:53,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:54:54,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 19:54:55,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 19:54:55,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:00,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:06,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 19:55:06,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 19:55:08,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 19:55:08,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:10,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:55:10,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:10,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 19:55:15,668 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 19:55:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:55:20,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:21,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 19:55:21,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 19:55:21,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 19:55:23,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:55:24,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 19:55:27,304 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.91 vs. limit=15.0 2023-09-29 19:55:29,092 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.57 vs. limit=22.5 2023-09-29 19:55:29,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 19:55:36,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:55:36,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 19:55:37,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:55:37,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:38,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:55:39,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 19:55:45,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:55:45,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:55:47,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=472646.6666666667, ans=0.2 2023-09-29 19:55:48,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 19:55:48,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:55:51,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:51,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:55:51,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:52,896 INFO [train.py:1039] (3/4) Epoch 14, batch 1850, loss[loss=0.1948, simple_loss=0.2646, pruned_loss=0.06245, over 23819.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2631, pruned_loss=0.05902, over 4710911.67 frames. ], batch size: 212, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:55:53,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:55:53,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 19:55:54,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:55:56,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:55:57,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:55:59,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:05,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:56:05,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 19:56:10,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 19:56:12,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 19:56:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:18,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 19:56:18,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 19:56:29,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 19:56:31,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 19:56:34,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:56:34,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:56:39,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 19:56:39,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:41,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:56:43,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 19:56:46,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 19:56:48,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:56:51,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 19:56:53,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:56:53,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 19:56:53,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:56:55,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:56:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 19:57:00,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 19:57:01,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:57:05,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 19:57:06,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 19:57:06,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 19:57:06,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 19:57:08,024 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 19:57:09,558 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 19:57:11,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:57:11,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:57:11,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:12,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:12,724 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 19:57:12,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:57:14,119 INFO [train.py:1039] (3/4) Epoch 14, batch 1900, loss[loss=0.1772, simple_loss=0.254, pruned_loss=0.05015, over 24650.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2643, pruned_loss=0.05934, over 4711582.57 frames. ], batch size: 65, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:57:14,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:14,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 19:57:15,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:57:17,954 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.953e+02 2.438e+02 3.060e+02 4.986e+02, threshold=4.875e+02, percent-clipped=3.0 2023-09-29 19:57:18,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:57:18,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 19:57:19,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:57:19,719 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 19:57:19,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 19:57:21,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:27,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:57:28,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.53 vs. limit=15.0 2023-09-29 19:57:29,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 19:57:30,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.03 vs. limit=15.0 2023-09-29 19:57:32,996 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 19:57:33,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 19:57:34,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 19:57:36,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:57:36,076 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 19:57:36,129 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 19:57:37,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=473113.3333333333, ans=0.125 2023-09-29 19:57:37,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=473113.3333333333, ans=0.0 2023-09-29 19:57:41,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 19:57:43,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 19:57:46,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 19:57:48,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 19:57:53,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=473180.0, ans=0.125 2023-09-29 19:57:55,847 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=473180.0, ans=0.0 2023-09-29 19:57:58,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 19:58:01,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 19:58:01,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:01,461 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 19:58:01,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 19:58:01,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 19:58:03,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 19:58:03,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:08,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 19:58:10,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=473246.6666666667, ans=0.0 2023-09-29 19:58:11,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 19:58:13,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:13,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 19:58:16,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 19:58:19,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 19:58:20,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:27,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 19:58:27,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 19:58:27,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:58:27,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 19:58:29,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 19:58:29,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 19:58:30,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 19:58:33,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:33,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:58:35,793 INFO [train.py:1039] (3/4) Epoch 14, batch 1950, loss[loss=0.2208, simple_loss=0.2755, pruned_loss=0.08303, over 23718.00 frames. ], tot_loss[loss=0.1917, simple_loss=0.2643, pruned_loss=0.05955, over 4715032.18 frames. ], batch size: 179, lr: 7.44e-03, grad_scale: 8.0 2023-09-29 19:58:35,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 19:58:35,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:58:36,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 19:58:37,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 19:58:38,110 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.08 vs. limit=22.5 2023-09-29 19:58:42,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:58:44,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 19:58:45,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:45,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 19:58:47,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 19:58:48,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 19:58:48,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:50,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:58:53,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 19:58:53,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:58:54,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:58:57,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:59:00,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 19:59:00,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 19:59:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 19:59:01,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:06,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:09,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 19:59:09,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:09,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 19:59:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 19:59:11,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 19:59:11,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 19:59:11,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:14,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 19:59:23,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 19:59:26,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 19:59:26,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 19:59:27,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 19:59:27,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 19:59:31,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=473580.0, ans=0.125 2023-09-29 19:59:32,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 19:59:32,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=473580.0, ans=0.035 2023-09-29 19:59:33,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 19:59:33,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 19:59:43,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:45,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:47,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 19:59:49,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:52,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 19:59:54,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 19:59:54,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 19:59:54,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 19:59:55,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 19:59:55,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 19:59:57,169 INFO [train.py:1039] (3/4) Epoch 14, batch 2000, loss[loss=0.1792, simple_loss=0.2643, pruned_loss=0.04702, over 24644.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2644, pruned_loss=0.05938, over 4722739.19 frames. ], batch size: 73, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 19:59:58,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:00,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.917e+02 2.206e+02 2.573e+02 3.762e+02, threshold=4.412e+02, percent-clipped=0.0 2023-09-29 20:00:00,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:00:02,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:00:02,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:03,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:00:06,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:08,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=473713.3333333333, ans=0.035 2023-09-29 20:00:10,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=473713.3333333333, ans=0.125 2023-09-29 20:00:11,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 20:00:11,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:00:16,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:00:17,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 20:00:19,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:00:19,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:00:21,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:00:24,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 20:00:26,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:28,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 20:00:29,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:00:31,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 20:00:31,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:34,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:00:34,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:00:34,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:35,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:36,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=473846.6666666667, ans=0.2 2023-09-29 20:00:37,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:38,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 20:00:40,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 20:00:40,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:00:40,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:00:45,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:46,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:00:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:47,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:00:49,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:00:50,580 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.00 vs. limit=15.0 2023-09-29 20:00:51,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:51,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:00:51,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:00:53,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:00:56,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:00:57,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 20:00:59,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=473913.3333333333, ans=0.0 2023-09-29 20:01:04,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:01:05,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:08,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:01:12,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=473980.0, ans=0.0 2023-09-29 20:01:13,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:14,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.84 vs. limit=15.0 2023-09-29 20:01:14,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:14,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:15,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:01:15,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:01:18,021 INFO [train.py:1039] (3/4) Epoch 14, batch 2050, loss[loss=0.2051, simple_loss=0.2864, pruned_loss=0.06193, over 23965.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2643, pruned_loss=0.0586, over 4742581.93 frames. ], batch size: 80, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:01:18,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:18,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=474046.6666666667, ans=0.09899494936611666 2023-09-29 20:01:19,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:25,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:01:25,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:30,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:01:33,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:01:35,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:01:35,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:01:36,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 20:01:36,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:01:37,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:01:38,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:01:39,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=474113.3333333333, ans=0.125 2023-09-29 20:01:48,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:48,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:49,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 20:01:53,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:01:54,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 20:01:55,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:01:58,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:01,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:03,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:02:03,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:02:05,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:02:06,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:02:06,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:02:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:10,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:02:13,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:02:13,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:02:18,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:23,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:02:23,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 20:02:24,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=474313.3333333333, ans=0.0 2023-09-29 20:02:29,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:30,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:02:32,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:02:33,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 20:02:38,890 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 20:02:38,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:39,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:02:40,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:02:40,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:02:41,991 INFO [train.py:1039] (3/4) Epoch 14, batch 2100, loss[loss=0.1804, simple_loss=0.2562, pruned_loss=0.05228, over 24471.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2621, pruned_loss=0.0582, over 4725932.22 frames. ], batch size: 63, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:02:42,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 20:02:42,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 20:02:43,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:02:45,127 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.952e+02 2.197e+02 2.435e+02 3.188e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 20:02:46,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:02:48,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:02:50,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:02:51,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:02:51,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 20:02:52,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:02:54,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 20:02:54,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 20:02:56,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:02:56,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:02:56,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 20:02:57,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=474446.6666666667, ans=0.2 2023-09-29 20:02:58,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:03:05,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 20:03:05,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:03:08,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:03:10,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:03:13,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:03:15,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 20:03:15,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:15,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 20:03:18,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 20:03:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:18,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 20:03:18,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 20:03:20,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 20:03:21,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:03:23,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:03:23,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=474513.3333333333, ans=0.125 2023-09-29 20:03:26,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:03:30,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:31,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:31,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 20:03:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:31,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:03:32,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:32,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 20:03:36,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 20:03:36,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 20:03:37,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=474580.0, ans=0.09899494936611666 2023-09-29 20:03:41,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:03:44,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:03:44,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 20:03:50,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:53,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:03:53,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:03:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:03:53,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 20:03:53,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:03:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:03:57,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:03:57,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:03:57,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:03:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 20:04:01,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 20:04:01,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:02,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:02,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:04:04,268 INFO [train.py:1039] (3/4) Epoch 14, batch 2150, loss[loss=0.1878, simple_loss=0.2513, pruned_loss=0.06213, over 23684.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2611, pruned_loss=0.05792, over 4729867.71 frames. ], batch size: 232, lr: 7.43e-03, grad_scale: 16.0 2023-09-29 20:04:04,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:04:05,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:04:13,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 20:04:15,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:15,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:15,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=474713.3333333333, ans=0.125 2023-09-29 20:04:17,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:04:18,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:18,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:04:22,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:23,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:04:23,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:04:26,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:28,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 20:04:32,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:34,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:04:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:37,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:37,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:04:38,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:04:38,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:04:39,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:04:40,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 20:04:42,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:04:42,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:42,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:44,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:04:46,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:04:48,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:04:50,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:04:51,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:04:51,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 20:04:51,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:04:53,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:53,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:54,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:04:56,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:04:58,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:04:59,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:04:59,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 20:05:02,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 20:05:02,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:05:02,930 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 20:05:04,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:04,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:05:05,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 20:05:05,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:05:05,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 20:05:05,982 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 20:05:05,983 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 20:05:06,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 20:05:07,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:09,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:05:09,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:05:10,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:12,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:05:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:15,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:24,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:05:25,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 20:05:27,308 INFO [train.py:1039] (3/4) Epoch 14, batch 2200, loss[loss=0.1931, simple_loss=0.2666, pruned_loss=0.05979, over 23284.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2618, pruned_loss=0.05782, over 4732158.64 frames. ], batch size: 119, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:05:29,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:05:29,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=475046.6666666667, ans=0.125 2023-09-29 20:05:30,432 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.895e+02 2.112e+02 2.594e+02 4.631e+02, threshold=4.225e+02, percent-clipped=1.0 2023-09-29 20:05:35,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:35,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:05:36,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=475046.6666666667, ans=0.0 2023-09-29 20:05:37,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:05:37,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:05:40,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:05:42,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:05:42,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 20:05:46,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 20:05:47,501 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.47 vs. limit=15.0 2023-09-29 20:05:47,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.11 vs. limit=6.0 2023-09-29 20:05:48,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:05:50,696 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.22 vs. limit=15.0 2023-09-29 20:05:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 20:05:57,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:05:58,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:00,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:06:03,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:06:03,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 20:06:05,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:06:07,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:07,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:06:12,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:06:13,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:15,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:06:16,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:18,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 20:06:20,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:20,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=475246.6666666667, ans=0.125 2023-09-29 20:06:23,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 20:06:26,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:26,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:06:26,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:06:29,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:06:29,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:06:29,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:29,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:06:30,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:06:32,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:06:32,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:06:35,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 20:06:35,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:06:38,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:06:40,820 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 20:06:41,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:06:42,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 20:06:43,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:06:44,062 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 20:06:45,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:46,201 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.33 vs. limit=15.0 2023-09-29 20:06:47,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:06:48,128 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.74 vs. limit=15.0 2023-09-29 20:06:50,011 INFO [train.py:1039] (3/4) Epoch 14, batch 2250, loss[loss=0.197, simple_loss=0.2645, pruned_loss=0.06479, over 24431.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2628, pruned_loss=0.05856, over 4736494.42 frames. ], batch size: 58, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:06:50,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:06:51,666 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 20:06:53,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:06:56,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:00,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=475380.0, ans=0.07 2023-09-29 20:07:01,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.52 vs. limit=22.5 2023-09-29 20:07:04,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:07:05,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:07:08,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:09,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:10,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:07:12,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 20:07:12,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:12,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:07:15,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 20:07:17,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:07:17,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:19,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:07:23,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:24,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:07:24,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:07:24,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=475513.3333333333, ans=0.0 2023-09-29 20:07:27,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 20:07:28,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:07:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:07:34,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=475513.3333333333, ans=0.0 2023-09-29 20:07:35,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:07:37,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:07:37,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:07:40,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=475580.0, ans=0.0 2023-09-29 20:07:42,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:07:43,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:07:48,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:07:49,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:07:53,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:07:54,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=475580.0, ans=10.0 2023-09-29 20:07:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:07:55,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:08:01,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=475646.6666666667, ans=0.125 2023-09-29 20:08:02,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:08:04,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:08:04,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 20:08:04,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:08:08,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 20:08:12,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:08:12,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:13,502 INFO [train.py:1039] (3/4) Epoch 14, batch 2300, loss[loss=0.2207, simple_loss=0.2836, pruned_loss=0.07885, over 22808.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2634, pruned_loss=0.05905, over 4737233.18 frames. ], batch size: 322, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:08:16,564 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.985e+02 2.281e+02 2.632e+02 4.053e+02, threshold=4.563e+02, percent-clipped=0.0 2023-09-29 20:08:18,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:08:19,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:08:21,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 20:08:23,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:08:30,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:08:30,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:08:30,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:30,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 20:08:33,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:08:34,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:08:36,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:08:41,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:08:43,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=475780.0, ans=0.125 2023-09-29 20:08:45,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:08:49,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:08:50,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.18 vs. limit=22.5 2023-09-29 20:08:52,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:08:54,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:08:54,979 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.64 vs. limit=15.0 2023-09-29 20:08:57,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:08:59,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:04,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:09:05,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:09:05,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:09:05,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 20:09:10,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:09:10,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:10,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:10,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:09:11,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:11,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:09:11,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:09:12,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 20:09:13,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:09:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:13,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 20:09:13,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=475913.3333333333, ans=0.1 2023-09-29 20:09:22,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:09:27,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:09:30,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=475980.0, ans=0.125 2023-09-29 20:09:31,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:09:31,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:09:31,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:09:32,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=475980.0, ans=0.125 2023-09-29 20:09:33,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:09:33,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:09:34,813 INFO [train.py:1039] (3/4) Epoch 14, batch 2350, loss[loss=0.1823, simple_loss=0.2674, pruned_loss=0.04862, over 24605.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.2644, pruned_loss=0.05907, over 4731138.48 frames. ], batch size: 68, lr: 7.42e-03, grad_scale: 16.0 2023-09-29 20:09:34,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:09:36,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 20:09:40,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:09:40,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 20:09:46,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 20:09:49,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:09:55,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:09:55,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:09:55,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:09:55,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 20:10:00,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:10:07,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 20:10:09,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:10:10,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:10:10,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:10:12,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=476180.0, ans=0.2 2023-09-29 20:10:15,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:10:17,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 20:10:17,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:10:17,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=476180.0, ans=0.1 2023-09-29 20:10:19,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:10:20,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:20,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:10:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:10:27,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 20:10:28,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:10:31,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:10:31,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:10:32,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 20:10:33,611 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.66 vs. limit=15.0 2023-09-29 20:10:34,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:10:35,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 20:10:37,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:10:42,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 20:10:43,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 20:10:44,723 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.03 vs. limit=15.0 2023-09-29 20:10:45,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:10:45,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:10:45,412 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 20:10:46,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.16 vs. limit=15.0 2023-09-29 20:10:46,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 20:10:48,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 20:10:51,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:10:55,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:10:56,571 INFO [train.py:1039] (3/4) Epoch 14, batch 2400, loss[loss=0.1917, simple_loss=0.2746, pruned_loss=0.05437, over 23982.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2644, pruned_loss=0.05893, over 4729873.15 frames. ], batch size: 80, lr: 7.41e-03, grad_scale: 32.0 2023-09-29 20:10:57,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=13.00 vs. limit=15.0 2023-09-29 20:10:59,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=476380.0, ans=0.0 2023-09-29 20:10:59,957 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.908e+02 2.123e+02 2.498e+02 3.353e+02, threshold=4.247e+02, percent-clipped=0.0 2023-09-29 20:11:00,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:11:01,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:11:01,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 20:11:03,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 20:11:04,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=476380.0, ans=0.125 2023-09-29 20:11:09,171 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.95 vs. limit=15.0 2023-09-29 20:11:11,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:11:11,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:11:13,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 20:11:14,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:11:16,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:16,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 20:11:22,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:24,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 20:11:29,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:11:35,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 20:11:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:11:40,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:11:45,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:11:47,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 20:11:48,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:11:50,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=476580.0, ans=0.0 2023-09-29 20:11:53,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:11:56,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:00,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=476646.6666666667, ans=0.5 2023-09-29 20:12:01,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:02,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:12:02,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:12:02,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:12:03,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:03,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:03,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:12:08,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:08,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:12:08,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 20:12:08,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=476646.6666666667, ans=0.125 2023-09-29 20:12:10,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 20:12:12,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:12:12,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:12:13,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 20:12:15,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 20:12:15,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 20:12:15,458 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 20:12:16,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 20:12:18,290 INFO [train.py:1039] (3/4) Epoch 14, batch 2450, loss[loss=0.1847, simple_loss=0.2387, pruned_loss=0.0653, over 22773.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2622, pruned_loss=0.05895, over 4709954.01 frames. ], batch size: 322, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:12:18,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:12:19,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:19,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:21,480 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 20:12:23,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:23,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:12:24,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:12:24,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:12:28,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:28,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:12:29,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 20:12:36,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:12:36,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:38,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:12:39,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:12:39,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:12:39,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 20:12:45,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=476780.0, ans=0.125 2023-09-29 20:12:46,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:12:47,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:12:47,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:12:52,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:12:52,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:52,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:12:54,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:12:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 20:12:56,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:13:03,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:05,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:13:05,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:05,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:13:07,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:07,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:13:08,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 20:13:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:13:14,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:13:17,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:13:17,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:13:18,584 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.28 vs. limit=15.0 2023-09-29 20:13:22,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:13:22,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 20:13:23,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:13:25,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:13:25,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 20:13:25,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:13:26,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:13:31,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:13:32,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:13:32,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:13:37,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 20:13:38,890 INFO [train.py:1039] (3/4) Epoch 14, batch 2500, loss[loss=0.1875, simple_loss=0.2619, pruned_loss=0.05654, over 23310.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2612, pruned_loss=0.05869, over 4696440.36 frames. ], batch size: 93, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:13:39,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:13:44,479 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.907e+02 2.163e+02 2.456e+02 3.959e+02, threshold=4.326e+02, percent-clipped=0.0 2023-09-29 20:13:48,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:13:57,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=477113.3333333333, ans=0.125 2023-09-29 20:13:57,754 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.55 vs. limit=15.0 2023-09-29 20:13:58,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:13:58,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:14:00,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:14:00,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 20:14:04,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:14:05,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:06,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:14:06,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:14:08,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 20:14:09,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:10,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:10,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 20:14:10,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:12,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 20:14:12,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:17,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:14:18,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:14:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:14:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 20:14:27,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:14:29,234 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.23 vs. limit=15.0 2023-09-29 20:14:30,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:33,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:36,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:14:39,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:43,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:14:47,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 20:14:47,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:14:47,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:14:50,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:14:50,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:14:52,865 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 20:14:52,866 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 20:14:52,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 20:14:54,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:14:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 20:14:58,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 20:14:59,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:14:59,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 20:15:03,179 INFO [train.py:1039] (3/4) Epoch 14, batch 2550, loss[loss=0.2017, simple_loss=0.2685, pruned_loss=0.06747, over 23301.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2607, pruned_loss=0.05853, over 4700976.80 frames. ], batch size: 119, lr: 7.41e-03, grad_scale: 16.0 2023-09-29 20:15:04,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 20:15:07,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:15:09,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:15:12,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:15:12,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 20:15:12,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:15:13,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=477380.0, ans=0.0 2023-09-29 20:15:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 20:15:17,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:15:19,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:22,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:15:22,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 20:15:23,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:25,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:25,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:26,225 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.27 vs. limit=15.0 2023-09-29 20:15:29,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:15:29,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 20:15:30,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:15:30,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:30,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 20:15:45,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:15:48,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:15:48,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:15:48,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:15:50,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:15:56,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:15:58,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:15:58,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:16:00,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:16:00,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:16:00,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:16:03,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:04,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:08,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:16:09,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 20:16:09,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:16:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:16:11,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:16:12,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:16:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:20,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:16:22,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:24,930 INFO [train.py:1039] (3/4) Epoch 14, batch 2600, loss[loss=0.1957, simple_loss=0.2807, pruned_loss=0.05528, over 24463.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2614, pruned_loss=0.05858, over 4714560.46 frames. ], batch size: 69, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:16:25,193 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 20:16:28,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 20:16:28,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:16:28,353 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 20:16:29,684 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.879e+02 2.138e+02 2.500e+02 3.129e+02, threshold=4.275e+02, percent-clipped=0.0 2023-09-29 20:16:29,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 20:16:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 20:16:32,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:16:32,209 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 20:16:36,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 20:16:37,657 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 20:16:40,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:16:44,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 20:16:45,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 20:16:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:16:47,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 20:16:50,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 20:16:50,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 20:16:57,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:16:57,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:16:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:16:57,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 20:16:58,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:17:00,931 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:17:03,658 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 20:17:11,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:11,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:13,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 20:17:14,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:14,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:17:14,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 20:17:19,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:17:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:17:21,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:21,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=477913.3333333333, ans=0.0 2023-09-29 20:17:24,563 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 20:17:25,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:17:25,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:17:26,884 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.67 vs. limit=15.0 2023-09-29 20:17:31,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:17:32,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:17:32,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 20:17:33,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:17:36,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:17:36,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:17:41,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=477980.0, ans=0.2 2023-09-29 20:17:44,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 20:17:46,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:48,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.05 vs. limit=15.0 2023-09-29 20:17:49,005 INFO [train.py:1039] (3/4) Epoch 14, batch 2650, loss[loss=0.1998, simple_loss=0.263, pruned_loss=0.06829, over 23851.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.2613, pruned_loss=0.05828, over 4720698.71 frames. ], batch size: 195, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:17:49,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_ff2.min_abs, batch_count=478046.6666666667, ans=0.1 2023-09-29 20:17:50,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:17:54,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 20:17:55,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:17:57,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:17:58,741 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 20:17:58,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:01,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:03,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:18:04,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:18:07,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:18:09,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 20:18:09,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:18:09,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478113.3333333333, ans=0.1 2023-09-29 20:18:10,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:18:12,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 20:18:12,753 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 20:18:17,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:18,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 20:18:18,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:20,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 20:18:24,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:18:24,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:24,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:31,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 20:18:31,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 20:18:34,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:18:39,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 20:18:39,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:18:40,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:18:40,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:18:40,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:42,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:18:43,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:18:46,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:47,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:18:47,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:18:48,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:18:50,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:50,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:18:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:18:53,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:18:53,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:18:58,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:00,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:19:00,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:02,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 20:19:05,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:06,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=478313.3333333333, ans=0.125 2023-09-29 20:19:07,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:07,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:08,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:09,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=478380.0, ans=0.2 2023-09-29 20:19:10,263 INFO [train.py:1039] (3/4) Epoch 14, batch 2700, loss[loss=0.1967, simple_loss=0.2517, pruned_loss=0.07087, over 22740.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2629, pruned_loss=0.05877, over 4731900.30 frames. ], batch size: 322, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:19:10,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:19:10,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:13,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:13,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 20:19:14,652 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.969e+02 2.229e+02 2.662e+02 4.082e+02, threshold=4.458e+02, percent-clipped=0.0 2023-09-29 20:19:15,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.85 vs. limit=15.0 2023-09-29 20:19:16,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:19:18,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:19:19,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:19:21,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:21,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:22,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:19:22,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:19:23,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:19:23,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 20:19:24,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 20:19:25,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:19:25,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:19:27,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:19:27,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=478446.6666666667, ans=0.0 2023-09-29 20:19:28,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:19:32,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:19:32,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 20:19:33,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:19:38,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:19:38,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:19:40,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=478446.6666666667, ans=0.05 2023-09-29 20:19:44,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:19:44,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:19:44,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:19:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:19:46,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=478513.3333333333, ans=0.125 2023-09-29 20:19:47,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:19:48,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.97 vs. limit=22.5 2023-09-29 20:19:49,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:19:49,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:19:49,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:19:53,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=478513.3333333333, ans=0.1 2023-09-29 20:19:56,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:19:56,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:20:07,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:20:08,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:12,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:20:12,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:14,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.76 vs. limit=15.0 2023-09-29 20:20:15,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:16,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:17,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:20:18,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:20,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:20:20,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:23,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:20:25,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:25,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:20:26,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 20:20:27,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:30,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:20:30,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 20:20:31,475 INFO [train.py:1039] (3/4) Epoch 14, batch 2750, loss[loss=0.1682, simple_loss=0.2237, pruned_loss=0.05629, over 22664.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.263, pruned_loss=0.05912, over 4733507.90 frames. ], batch size: 322, lr: 7.40e-03, grad_scale: 16.0 2023-09-29 20:20:31,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 20:20:31,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:35,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:35,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:38,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:20:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:42,859 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=478713.3333333333, ans=0.1 2023-09-29 20:20:44,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:20:46,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:20:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:20:46,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:46,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 20:20:46,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:20:46,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:20:52,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 20:20:54,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:20:55,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:20:55,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:20:55,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:20:57,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:20:57,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:20:57,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:20:58,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:03,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:21:03,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:21:05,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:21:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:07,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=478846.6666666667, ans=0.2 2023-09-29 20:21:09,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:21:14,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=478846.6666666667, ans=0.1 2023-09-29 20:21:17,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:21:18,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:21:18,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:19,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=478846.6666666667, ans=0.125 2023-09-29 20:21:20,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=478913.3333333333, ans=0.125 2023-09-29 20:21:22,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:21:22,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:21:24,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:21:25,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=478913.3333333333, ans=0.035 2023-09-29 20:21:31,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:21:31,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:21:31,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 20:21:36,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:37,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 20:21:43,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:21:45,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:21:45,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=478980.0, ans=0.5 2023-09-29 20:21:46,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 20:21:47,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:21:49,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:21:49,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 20:21:49,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:21:53,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:21:54,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:21:54,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:21:56,020 INFO [train.py:1039] (3/4) Epoch 14, batch 2800, loss[loss=0.2013, simple_loss=0.2726, pruned_loss=0.06498, over 23143.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2626, pruned_loss=0.05945, over 4729025.52 frames. ], batch size: 105, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:21:56,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 20:21:56,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:21:57,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:21:59,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:00,668 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 20:22:00,669 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 20:22:03,503 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.993e+02 2.230e+02 2.590e+02 3.913e+02, threshold=4.460e+02, percent-clipped=0.0 2023-09-29 20:22:05,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:06,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:22:06,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:22:10,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:22:11,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 20:22:15,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:22:16,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 20:22:16,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:18,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:22:18,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:24,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:25,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:22:25,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:22:25,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:22:26,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=479113.3333333333, ans=0.125 2023-09-29 20:22:34,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:22:35,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:22:36,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.74 vs. limit=10.0 2023-09-29 20:22:38,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:22:40,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:22:45,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:22:45,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 20:22:45,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=479246.6666666667, ans=0.07 2023-09-29 20:22:47,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:48,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:22:48,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:22:51,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:22:53,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:22:57,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:23:00,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:23:00,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:00,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:23:01,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:23:02,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:23:02,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:23:04,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 20:23:04,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:04,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:23:05,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:07,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 20:23:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:08,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:23:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:23:10,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 20:23:10,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=479313.3333333333, ans=0.0 2023-09-29 20:23:16,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:23:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:23:16,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:23:18,208 INFO [train.py:1039] (3/4) Epoch 14, batch 2850, loss[loss=0.1858, simple_loss=0.2716, pruned_loss=0.05004, over 24625.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2623, pruned_loss=0.05908, over 4728236.77 frames. ], batch size: 68, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:23:19,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:23,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:23:24,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:23:24,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:23:28,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:23:28,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:23:30,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:23:32,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 20:23:38,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 20:23:38,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:23:40,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 20:23:40,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:44,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 20:23:46,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 20:23:47,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:23:53,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=479513.3333333333, ans=0.125 2023-09-29 20:24:00,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:01,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:01,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:24:02,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:24:02,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:24:02,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:24:05,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:24:05,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 20:24:08,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:24:10,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:24:10,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:13,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:14,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:15,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:17,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:24:18,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:24:20,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:21,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:24,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:24:27,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:24:31,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 20:24:31,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 20:24:33,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:24:34,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:34,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 20:24:34,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:24:36,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:24:37,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:37,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:24:37,037 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 20:24:37,100 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 20:24:37,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:24:38,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:24:41,448 INFO [train.py:1039] (3/4) Epoch 14, batch 2900, loss[loss=0.2022, simple_loss=0.2879, pruned_loss=0.05828, over 24421.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2626, pruned_loss=0.05873, over 4717145.24 frames. ], batch size: 69, lr: 7.39e-03, grad_scale: 8.0 2023-09-29 20:24:45,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:24:45,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:24:46,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:24:46,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 20:24:50,317 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.953e+02 2.184e+02 2.656e+02 3.783e+02, threshold=4.367e+02, percent-clipped=0.0 2023-09-29 20:24:52,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:24:52,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 20:24:53,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 20:24:53,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=479713.3333333333, ans=0.125 2023-09-29 20:24:55,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:24:55,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:24:58,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:24:59,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:25:02,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:25:02,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:25:06,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:25:06,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 20:25:08,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:25:09,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:10,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.90 vs. limit=10.0 2023-09-29 20:25:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 20:25:14,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 20:25:17,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:25:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 20:25:17,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:25:21,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:25:21,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 20:25:22,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:25:24,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:27,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:25:28,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=479846.6666666667, ans=0.125 2023-09-29 20:25:30,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:25:32,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 20:25:32,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 20:25:32,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:25:37,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:25:37,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=479913.3333333333, ans=0.1 2023-09-29 20:25:38,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=479913.3333333333, ans=0.1 2023-09-29 20:25:40,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 20:25:40,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=479913.3333333333, ans=0.125 2023-09-29 20:25:42,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:25:42,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=479913.3333333333, ans=0.1 2023-09-29 20:25:47,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:25:59,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:25:59,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:26:01,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 20:26:05,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:05,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 20:26:06,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:06,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:26:07,884 INFO [train.py:1039] (3/4) Epoch 14, batch 2950, loss[loss=0.196, simple_loss=0.2768, pruned_loss=0.05763, over 24355.00 frames. ], tot_loss[loss=0.1902, simple_loss=0.2629, pruned_loss=0.05872, over 4711475.47 frames. ], batch size: 77, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:26:11,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:26:12,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 20:26:14,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:14,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:14,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:26:16,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:26:18,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 20:26:18,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 20:26:21,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:26:21,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:26:29,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:29,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=480113.3333333333, ans=0.125 2023-09-29 20:26:31,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:26:33,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:26:34,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:37,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:26:37,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:26:39,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:26:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:26:42,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 20:26:46,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=480180.0, ans=0.125 2023-09-29 20:26:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 20:26:48,935 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 20:26:50,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:26:52,831 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 20:26:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 20:26:54,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:26:54,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:26:54,608 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 20:26:54,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:26:57,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 20:26:58,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=480246.6666666667, ans=0.05 2023-09-29 20:27:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:27:01,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:27:04,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:06,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:27:06,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:06,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 20:27:08,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:27:08,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 20:27:13,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:13,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 20:27:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:27:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 20:27:18,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:21,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:27:21,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:27:23,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:27:23,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:27:24,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:27:24,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:24,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:27:24,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:27:26,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:27:28,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:27:29,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:29,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 20:27:31,337 INFO [train.py:1039] (3/4) Epoch 14, batch 3000, loss[loss=0.1853, simple_loss=0.2712, pruned_loss=0.04965, over 24545.00 frames. ], tot_loss[loss=0.1913, simple_loss=0.2641, pruned_loss=0.05921, over 4714795.53 frames. ], batch size: 71, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:27:31,338 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 20:27:46,683 INFO [train.py:1071] (3/4) Epoch 14, validation: loss=0.2839, simple_loss=0.2749, pruned_loss=0.1465, over 1125622.00 frames. 2023-09-29 20:27:46,684 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 20:27:46,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:27:48,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:27:49,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:27:51,630 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 20:27:53,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 20:27:54,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.646e+02 1.901e+02 2.128e+02 2.266e+02 3.715e+02, threshold=4.256e+02, percent-clipped=0.0 2023-09-29 20:27:54,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:27:56,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:27:56,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 20:27:56,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:27:58,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=480380.0, ans=0.1 2023-09-29 20:27:58,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=480380.0, ans=0.125 2023-09-29 20:28:04,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.98 vs. limit=15.0 2023-09-29 20:28:04,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:28:09,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=480446.6666666667, ans=0.125 2023-09-29 20:28:11,125 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:28:15,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.40 vs. limit=12.0 2023-09-29 20:28:16,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=480446.6666666667, ans=0.1 2023-09-29 20:28:17,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:28:22,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 20:28:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:28:26,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:28:26,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:28:26,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:28:29,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:29,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 20:28:32,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 20:28:33,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:28:35,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:28:37,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:28:38,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:38,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:38,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:28:41,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:28:42,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:28:42,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:28:43,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.95 vs. limit=15.0 2023-09-29 20:28:44,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:28:47,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 20:28:49,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:28:49,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=480580.0, ans=0.0 2023-09-29 20:28:50,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:28:50,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:28:53,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:53,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:28:56,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 20:28:56,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 20:28:57,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:28:57,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 20:28:57,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:28:59,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 20:29:02,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:03,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:29:03,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 20:29:05,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 20:29:05,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:29:05,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:29:06,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:29:08,220 INFO [train.py:1039] (3/4) Epoch 14, batch 3050, loss[loss=0.1755, simple_loss=0.2424, pruned_loss=0.05427, over 24312.00 frames. ], tot_loss[loss=0.1923, simple_loss=0.2647, pruned_loss=0.05994, over 4706107.96 frames. ], batch size: 56, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:29:08,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:29:08,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:08,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:29:09,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=480713.3333333333, ans=0.05 2023-09-29 20:29:13,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 20:29:15,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:18,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:18,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:29:20,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=480713.3333333333, ans=0.1 2023-09-29 20:29:23,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:27,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 20:29:30,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.80 vs. limit=15.0 2023-09-29 20:29:30,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 20:29:33,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 20:29:33,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:36,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:29:37,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:37,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:38,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=480780.0, ans=0.0 2023-09-29 20:29:39,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:44,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:29:44,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:29:44,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:46,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:29:46,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:29:48,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:49,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:29:52,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:29:52,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 20:29:54,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:29:54,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:29:57,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:29:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:29:59,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:29:59,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:07,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:30:07,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:13,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:13,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:30:13,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:30:14,631 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.86 vs. limit=6.0 2023-09-29 20:30:16,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:18,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:30:18,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:30:20,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 20:30:22,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:30:22,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:22,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 20:30:25,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:30,453 INFO [train.py:1039] (3/4) Epoch 14, batch 3100, loss[loss=0.1888, simple_loss=0.2616, pruned_loss=0.05801, over 24023.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.265, pruned_loss=0.0599, over 4710947.87 frames. ], batch size: 86, lr: 7.38e-03, grad_scale: 8.0 2023-09-29 20:30:32,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:30:34,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:30:34,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=481046.6666666667, ans=0.125 2023-09-29 20:30:35,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 20:30:38,647 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.952e+02 2.226e+02 2.517e+02 3.865e+02, threshold=4.452e+02, percent-clipped=0.0 2023-09-29 20:30:38,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 20:30:41,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 20:30:43,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 20:30:43,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:30:45,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:30:45,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:50,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:30:54,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:30:59,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 20:31:02,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=481180.0, ans=0.0 2023-09-29 20:31:06,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:31:07,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:07,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:07,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:08,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:31:10,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:31:10,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 20:31:10,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:31:12,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:13,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 20:31:15,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:31:18,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:31:19,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=481246.6666666667, ans=0.1 2023-09-29 20:31:20,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 20:31:20,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 20:31:21,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:22,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=481246.6666666667, ans=0.0 2023-09-29 20:31:23,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:31:24,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:25,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:26,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:31:26,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:31:26,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:31:28,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:31:30,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:31:30,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:30,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 20:31:35,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:31:37,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 20:31:39,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:31:39,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 20:31:40,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:31:42,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:42,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 20:31:44,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=481313.3333333333, ans=0.2 2023-09-29 20:31:53,120 INFO [train.py:1039] (3/4) Epoch 14, batch 3150, loss[loss=0.2054, simple_loss=0.29, pruned_loss=0.06042, over 24354.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.263, pruned_loss=0.05907, over 4709533.72 frames. ], batch size: 77, lr: 7.37e-03, grad_scale: 8.0 2023-09-29 20:31:53,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 20:31:56,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:31:57,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:31:58,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:31:58,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:32:00,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 20:32:01,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=481380.0, ans=0.125 2023-09-29 20:32:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:02,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:32:03,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 20:32:05,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:06,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=481380.0, ans=0.2 2023-09-29 20:32:09,090 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 20:32:12,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 20:32:12,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:32:14,126 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 20:32:14,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 20:32:15,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 20:32:17,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 20:32:17,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 20:32:17,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:17,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:18,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:32:20,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 20:32:21,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:21,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:32:23,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:23,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 20:32:28,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 20:32:28,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:32:31,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:32:31,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:32:31,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=481513.3333333333, ans=0.125 2023-09-29 20:32:34,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 20:32:38,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 20:32:38,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:32:39,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:32:39,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:32:40,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:32:40,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:32:40,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:32:40,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:32:41,581 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-09-29 20:32:43,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 20:32:43,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:32:43,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:46,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=481580.0, ans=0.125 2023-09-29 20:32:47,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:32:47,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:32:47,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 20:32:47,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:49,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 20:32:49,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:32:49,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=481580.0, ans=0.125 2023-09-29 20:32:50,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 20:32:50,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 20:32:53,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:32:53,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:32:55,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 20:32:56,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 20:32:57,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:33:01,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:33:01,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:01,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:33:09,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:33:09,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 20:33:16,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:33:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 20:33:18,091 INFO [train.py:1039] (3/4) Epoch 14, batch 3200, loss[loss=0.1849, simple_loss=0.2543, pruned_loss=0.05774, over 23686.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2609, pruned_loss=0.05854, over 4702057.27 frames. ], batch size: 149, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:33:20,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:20,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:33:20,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 20:33:21,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.02 vs. limit=6.0 2023-09-29 20:33:24,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:33:26,522 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.865e+02 2.106e+02 2.358e+02 3.213e+02, threshold=4.213e+02, percent-clipped=0.0 2023-09-29 20:33:28,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:33:31,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:33:35,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=481780.0, ans=0.1 2023-09-29 20:33:41,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:33:52,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 20:33:53,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:33:56,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 20:33:56,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:34:01,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:34:01,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:34:03,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:34:06,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.90 vs. limit=8.0 2023-09-29 20:34:08,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 20:34:09,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 20:34:09,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 20:34:12,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 20:34:15,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:34:21,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:21,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 20:34:23,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:34:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 20:34:23,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:34:30,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:34:31,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 20:34:33,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 20:34:34,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 20:34:36,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 20:34:37,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:34:40,984 INFO [train.py:1039] (3/4) Epoch 14, batch 3250, loss[loss=0.2068, simple_loss=0.2687, pruned_loss=0.07247, over 23708.00 frames. ], tot_loss[loss=0.189, simple_loss=0.261, pruned_loss=0.05852, over 4716906.48 frames. ], batch size: 232, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:34:41,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:34:41,120 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 20:34:41,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:34:41,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:34:42,713 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 20:34:47,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:34:48,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:34:58,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:34:58,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 20:35:00,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:02,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:02,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:03,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:35:05,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:35:06,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:06,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:06,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:08,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:09,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:11,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:35:14,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:14,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:35:16,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:35:16,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:35:16,755 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=22.5 2023-09-29 20:35:17,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:21,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 20:35:23,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:35:23,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:35:25,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:27,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:35:33,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:35:41,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:35:42,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:42,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 20:35:42,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:35:42,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:35:42,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:35:44,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 20:35:44,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 20:35:45,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:35:47,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:35:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 20:35:48,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:35:53,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:35:53,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:35:56,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 20:35:56,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:35:58,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=482313.3333333333, ans=0.2 2023-09-29 20:36:00,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:36:00,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 20:36:03,603 INFO [train.py:1039] (3/4) Epoch 14, batch 3300, loss[loss=0.1841, simple_loss=0.2536, pruned_loss=0.05735, over 24409.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2617, pruned_loss=0.0584, over 4728964.14 frames. ], batch size: 58, lr: 7.37e-03, grad_scale: 16.0 2023-09-29 20:36:03,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:36:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 20:36:05,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 20:36:07,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 20:36:07,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:11,794 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.874e+02 2.156e+02 2.504e+02 3.971e+02, threshold=4.311e+02, percent-clipped=0.0 2023-09-29 20:36:12,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:36:13,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:36:13,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:15,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=482380.0, ans=0.125 2023-09-29 20:36:16,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 20:36:16,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:36:18,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:21,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:36:24,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=482446.6666666667, ans=0.125 2023-09-29 20:36:25,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 20:36:26,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:26,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:27,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:27,742 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 20:36:29,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:36:30,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:36:30,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:36:30,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:36:30,989 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 20:36:34,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:34,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:36:37,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:37,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 20:36:38,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 20:36:40,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:36:40,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:36:43,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 20:36:45,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 20:36:45,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:36:46,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 20:36:49,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:36:52,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 20:36:54,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:36:57,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:36:57,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:36:57,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:36:57,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:37:00,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:37:00,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:01,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:37:03,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 20:37:03,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 20:37:05,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:37:06,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=482580.0, ans=0.125 2023-09-29 20:37:07,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:07,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:09,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:37:09,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:11,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:37:11,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:12,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:37:13,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=482646.6666666667, ans=0.1 2023-09-29 20:37:14,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:37:16,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:37:19,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 20:37:21,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:21,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:24,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:37:24,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:37:24,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:26,200 INFO [train.py:1039] (3/4) Epoch 14, batch 3350, loss[loss=0.1838, simple_loss=0.2503, pruned_loss=0.05864, over 19948.00 frames. ], tot_loss[loss=0.19, simple_loss=0.2628, pruned_loss=0.05861, over 4723050.58 frames. ], batch size: 43, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:37:27,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:37:27,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:28,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=482713.3333333333, ans=0.2 2023-09-29 20:37:30,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:37:33,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:34,938 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.49 vs. limit=15.0 2023-09-29 20:37:35,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:37:38,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:40,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:37:42,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:42,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:37:44,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 20:37:45,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 20:37:47,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:37:49,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 20:37:49,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 20:37:49,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:37:50,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:37:51,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:37:53,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 20:37:53,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:53,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:37:56,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:37:57,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=482846.6666666667, ans=0.125 2023-09-29 20:37:58,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:37:58,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:00,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:38:03,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:06,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:06,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:11,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:38:12,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:38:14,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:16,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:17,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:20,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 20:38:20,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:38:21,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 20:38:21,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:38:23,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 20:38:23,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:24,034 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.26 vs. limit=15.0 2023-09-29 20:38:24,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:38:28,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=482913.3333333333, ans=0.125 2023-09-29 20:38:32,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:33,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 20:38:33,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:38:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:38:35,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:38:41,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:38:43,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 20:38:43,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=482980.0, ans=0.125 2023-09-29 20:38:44,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:38:44,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:38:46,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:38:46,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 20:38:48,226 INFO [train.py:1039] (3/4) Epoch 14, batch 3400, loss[loss=0.271, simple_loss=0.323, pruned_loss=0.1095, over 19553.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2645, pruned_loss=0.05936, over 4714645.95 frames. ], batch size: 388, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:38:48,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:38:48,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 20:38:49,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:50,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:38:51,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 20:38:53,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:38:53,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 20:38:56,129 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.936e+02 2.106e+02 2.472e+02 5.174e+02, threshold=4.212e+02, percent-clipped=2.0 2023-09-29 20:38:58,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 20:38:58,501 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 20:38:58,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:02,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:39:02,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:39:02,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=483046.6666666667, ans=0.125 2023-09-29 20:39:03,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:05,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:39:08,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=483113.3333333333, ans=0.125 2023-09-29 20:39:08,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=483113.3333333333, ans=0.1 2023-09-29 20:39:12,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:14,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 20:39:18,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:39:21,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:23,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:25,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 20:39:30,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:39:35,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 20:39:40,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:39:42,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 20:39:42,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:39:43,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:39:45,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:39:45,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:39:46,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:39:47,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=4.74 vs. limit=12.0 2023-09-29 20:39:49,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:39:49,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:39:54,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:39:58,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 20:40:04,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:40:04,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=483313.3333333333, ans=0.2 2023-09-29 20:40:09,463 INFO [train.py:1039] (3/4) Epoch 14, batch 3450, loss[loss=0.1816, simple_loss=0.2534, pruned_loss=0.05492, over 23580.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2633, pruned_loss=0.05904, over 4717052.21 frames. ], batch size: 134, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:40:11,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 20:40:14,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 20:40:14,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:15,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=483380.0, ans=0.0 2023-09-29 20:40:16,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:40:16,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 20:40:17,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:40:21,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:40:24,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:40:25,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:40:27,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:30,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:40:37,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 20:40:43,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=483513.3333333333, ans=0.0 2023-09-29 20:40:44,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 20:40:44,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 20:40:44,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:40:46,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:40:46,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=483513.3333333333, ans=0.0 2023-09-29 20:40:51,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 20:40:51,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:40:56,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:40:56,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:40:57,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 20:40:59,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:41:01,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 20:41:01,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:03,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:41:07,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=483580.0, ans=0.125 2023-09-29 20:41:08,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:11,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 20:41:14,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:41:18,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:41:21,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:23,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:25,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=483646.6666666667, ans=0.0 2023-09-29 20:41:28,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:28,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:41:29,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:41:29,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:41:32,615 INFO [train.py:1039] (3/4) Epoch 14, batch 3500, loss[loss=0.2036, simple_loss=0.2859, pruned_loss=0.0606, over 24570.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.262, pruned_loss=0.05837, over 4719036.21 frames. ], batch size: 71, lr: 7.36e-03, grad_scale: 16.0 2023-09-29 20:41:34,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:35,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.59 vs. limit=15.0 2023-09-29 20:41:38,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:41:38,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 20:41:40,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=483713.3333333333, ans=0.2 2023-09-29 20:41:41,683 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.652e+02 2.028e+02 2.360e+02 2.884e+02 5.509e+02, threshold=4.720e+02, percent-clipped=5.0 2023-09-29 20:41:41,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 20:41:44,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:41:49,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:41:49,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 20:41:54,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:41:54,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:41:56,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:41:56,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:41:56,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:41:58,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:41:58,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:41:59,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 20:42:02,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:02,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:42:03,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=483780.0, ans=0.125 2023-09-29 20:42:04,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:07,038 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.51 vs. limit=15.0 2023-09-29 20:42:07,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:09,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 20:42:09,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:42:13,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:42:16,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:42:18,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:19,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:42:19,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:21,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 20:42:23,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 20:42:23,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 20:42:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:42:26,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:27,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:28,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:42:28,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=483913.3333333333, ans=0.125 2023-09-29 20:42:30,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:42:30,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:42:35,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:42:38,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 20:42:38,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 20:42:38,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:42:40,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:40,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:41,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:46,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 20:42:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:42:46,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=483980.0, ans=0.125 2023-09-29 20:42:47,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:42:49,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 20:42:52,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 20:42:55,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:42:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:42:55,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:42:55,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:42:56,645 INFO [train.py:1039] (3/4) Epoch 14, batch 3550, loss[loss=0.1592, simple_loss=0.2362, pruned_loss=0.0411, over 24335.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2605, pruned_loss=0.05835, over 4701391.68 frames. ], batch size: 56, lr: 7.35e-03, grad_scale: 16.0 2023-09-29 20:42:58,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:43:02,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=484046.6666666667, ans=0.1 2023-09-29 20:43:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:12,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 20:43:14,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:16,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:43:16,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:19,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:43:19,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:43:22,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:23,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:43:23,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:43:25,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:43:32,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:43:32,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:43:34,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:34,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:43:36,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:43:36,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 20:43:36,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:37,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:43:39,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 20:43:46,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:46,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:43:47,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:43:49,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 20:43:49,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=484246.6666666667, ans=0.0 2023-09-29 20:43:51,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:43:51,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 20:43:52,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:43:53,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=484246.6666666667, ans=0.035 2023-09-29 20:43:54,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:43:54,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:43:57,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 20:43:59,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:04,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 20:44:06,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:06,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=484313.3333333333, ans=0.125 2023-09-29 20:44:08,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.26 vs. limit=10.0 2023-09-29 20:44:10,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:44:12,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 20:44:18,041 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:44:19,146 INFO [train.py:1039] (3/4) Epoch 14, batch 3600, loss[loss=0.1921, simple_loss=0.2662, pruned_loss=0.05904, over 24124.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2607, pruned_loss=0.05792, over 4703319.74 frames. ], batch size: 80, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:44:19,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 20:44:19,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:44:19,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=484380.0, ans=0.125 2023-09-29 20:44:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:44:22,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:23,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:44:24,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:44:27,550 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.842e+02 2.079e+02 2.493e+02 3.972e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-29 20:44:31,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:32,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:34,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:44:35,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:44:37,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:37,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 20:44:40,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:44:40,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:44,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:48,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:44:48,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:44:49,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:44:49,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 20:44:50,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:44:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:44:55,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:44:56,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:44:59,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:45:01,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:01,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 20:45:09,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:10,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:45:10,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 20:45:15,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:45:20,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:23,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:29,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:45:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:45:30,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 20:45:32,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 20:45:32,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 20:45:34,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=484646.6666666667, ans=0.0 2023-09-29 20:45:35,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:45:35,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:45:35,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 20:45:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:45:37,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:45:37,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:45:37,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 20:45:37,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=484646.6666666667, ans=0.1 2023-09-29 20:45:38,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 20:45:42,497 INFO [train.py:1039] (3/4) Epoch 14, batch 3650, loss[loss=0.187, simple_loss=0.2699, pruned_loss=0.05206, over 24336.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2609, pruned_loss=0.05725, over 4717078.18 frames. ], batch size: 74, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:45:42,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:45:44,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 20:45:48,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=484713.3333333333, ans=0.1 2023-09-29 20:45:49,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 20:45:50,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:45:54,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 20:45:55,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 20:46:00,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:00,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 20:46:02,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:46:06,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 20:46:06,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:46:06,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=484780.0, ans=0.05 2023-09-29 20:46:07,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 20:46:07,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 20:46:09,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:09,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 20:46:10,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:46:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:12,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:12,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=484780.0, ans=0.0 2023-09-29 20:46:13,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:46:15,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 20:46:17,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 20:46:17,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:46:19,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 20:46:20,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:20,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:46:28,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:46:30,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:30,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:46:31,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:46:33,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:46:35,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:46:41,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:46:42,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:46:42,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:46:42,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 20:46:44,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:46:44,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=484913.3333333333, ans=0.0 2023-09-29 20:46:45,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:52,675 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 20:46:54,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:46:54,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:46:55,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:46:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:46:57,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 20:46:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:01,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 20:47:01,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:04,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:47:05,637 INFO [train.py:1039] (3/4) Epoch 14, batch 3700, loss[loss=0.1741, simple_loss=0.2644, pruned_loss=0.0419, over 24484.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2618, pruned_loss=0.05737, over 4726585.55 frames. ], batch size: 69, lr: 7.35e-03, grad_scale: 32.0 2023-09-29 20:47:05,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:47:07,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:47:09,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:09,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 20:47:09,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:47:11,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 20:47:12,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 20:47:12,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=485046.6666666667, ans=0.0 2023-09-29 20:47:13,910 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.934e+02 2.131e+02 2.298e+02 2.848e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-29 20:47:14,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 20:47:17,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:47:18,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:18,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 20:47:18,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:47:20,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:47:22,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:24,105 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 20:47:33,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:47:33,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 20:47:35,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:47:35,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 20:47:35,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:40,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:41,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 20:47:41,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:42,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=485180.0, ans=0.125 2023-09-29 20:47:43,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:47:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:47:49,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:47:52,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:47:53,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:47:55,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 20:47:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:47:56,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 20:48:00,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:48:01,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=485246.6666666667, ans=0.125 2023-09-29 20:48:02,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:48:03,109 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=7.39 vs. limit=15.0 2023-09-29 20:48:05,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:07,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 20:48:08,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:48:08,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 20:48:09,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=22.5 2023-09-29 20:48:10,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:10,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:13,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:48:14,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.10 vs. limit=15.0 2023-09-29 20:48:14,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 20:48:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 20:48:18,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:48:18,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:18,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:48:20,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:48:23,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:48:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:48:26,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:48:27,885 INFO [train.py:1039] (3/4) Epoch 14, batch 3750, loss[loss=0.1682, simple_loss=0.2511, pruned_loss=0.0426, over 24510.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2633, pruned_loss=0.05825, over 4729873.79 frames. ], batch size: 63, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:48:29,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 20:48:31,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:48:34,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 20:48:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 20:48:36,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:48:37,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:48:39,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:48:43,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:48:46,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 20:48:47,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:48:50,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:48:54,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:48:56,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 20:48:56,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:48:56,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=485446.6666666667, ans=0.125 2023-09-29 20:48:58,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:48:58,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:49:02,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 20:49:03,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=485513.3333333333, ans=0.125 2023-09-29 20:49:05,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 20:49:07,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:49:09,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:49:09,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:10,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=485513.3333333333, ans=0.125 2023-09-29 20:49:13,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=485513.3333333333, ans=0.0 2023-09-29 20:49:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:16,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 20:49:21,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 20:49:23,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:49:27,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:49:29,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:49:33,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 20:49:33,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=485646.6666666667, ans=0.125 2023-09-29 20:49:37,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 20:49:39,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 20:49:41,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:49:43,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:49:46,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 20:49:50,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=485713.3333333333, ans=0.125 2023-09-29 20:49:51,107 INFO [train.py:1039] (3/4) Epoch 14, batch 3800, loss[loss=0.1744, simple_loss=0.2532, pruned_loss=0.04778, over 24453.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2632, pruned_loss=0.05806, over 4728848.04 frames. ], batch size: 63, lr: 7.34e-03, grad_scale: 32.0 2023-09-29 20:49:54,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:49:55,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=485713.3333333333, ans=0.025 2023-09-29 20:49:56,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=485713.3333333333, ans=0.125 2023-09-29 20:49:57,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:49:59,283 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.914e+02 2.283e+02 2.549e+02 3.824e+02, threshold=4.565e+02, percent-clipped=0.0 2023-09-29 20:49:59,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 20:49:59,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 20:50:01,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:04,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:50:06,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=485780.0, ans=0.1 2023-09-29 20:50:08,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 20:50:08,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:09,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:50:12,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:50:12,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:50:12,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:12,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 20:50:18,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 20:50:18,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=485780.0, ans=0.1 2023-09-29 20:50:19,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:50:22,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:50:25,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:50:25,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:50:29,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 20:50:29,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:50:32,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:50:36,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=485846.6666666667, ans=0.125 2023-09-29 20:50:37,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:50:37,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 20:50:39,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:50:47,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:50:48,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-09-29 20:50:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:50:55,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 20:50:57,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 20:50:57,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:00,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:51:00,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:00,719 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=485980.0, ans=0.0 2023-09-29 20:51:02,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 20:51:05,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 20:51:05,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 20:51:05,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:07,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:51:13,737 INFO [train.py:1039] (3/4) Epoch 14, batch 3850, loss[loss=0.1899, simple_loss=0.2437, pruned_loss=0.06801, over 23571.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2623, pruned_loss=0.05818, over 4728432.19 frames. ], batch size: 256, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:51:15,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:51:16,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:51:20,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:51:20,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 20:51:22,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:51:23,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:26,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 20:51:30,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:31,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 20:51:34,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 20:51:40,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:42,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:51:44,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:46,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:51:50,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:50,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:51:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:51:50,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:51:52,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:54,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:51:55,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:51:55,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:51:57,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 20:51:59,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 20:51:59,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:51:59,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:02,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:02,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 20:52:05,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 20:52:07,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:09,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 20:52:12,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 20:52:18,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:20,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:52:23,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:23,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 20:52:27,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 20:52:29,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:30,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:32,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 20:52:33,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 20:52:34,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:35,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:52:36,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 20:52:37,421 INFO [train.py:1039] (3/4) Epoch 14, batch 3900, loss[loss=0.1961, simple_loss=0.2587, pruned_loss=0.06676, over 23728.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.26, pruned_loss=0.05822, over 4699452.73 frames. ], batch size: 232, lr: 7.34e-03, grad_scale: 16.0 2023-09-29 20:52:37,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:52:37,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 20:52:39,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:39,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:39,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:52:40,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:41,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:52:42,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:52:42,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:52:42,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:52:44,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 20:52:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:44,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=486380.0, ans=0.2 2023-09-29 20:52:47,046 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.944e+02 2.146e+02 2.547e+02 3.892e+02, threshold=4.292e+02, percent-clipped=0.0 2023-09-29 20:52:48,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:48,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:48,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:52:53,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:52:54,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 20:52:56,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:52:58,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:52:59,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 20:52:59,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:03,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 20:53:03,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:53:03,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 20:53:05,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 20:53:08,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:53:10,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:53:10,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:15,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:53:18,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:53:18,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.95 vs. limit=15.0 2023-09-29 20:53:19,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:53:19,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:20,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=486513.3333333333, ans=0.2 2023-09-29 20:53:21,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:53:25,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.48 vs. limit=15.0 2023-09-29 20:53:28,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:53:28,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:53:30,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=486580.0, ans=0.125 2023-09-29 20:53:37,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 20:53:40,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:53:49,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:53:51,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:53,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 20:53:54,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 20:53:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 20:53:56,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 20:53:56,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:53:57,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 20:54:01,169 INFO [train.py:1039] (3/4) Epoch 14, batch 3950, loss[loss=0.2026, simple_loss=0.2737, pruned_loss=0.06575, over 23763.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2609, pruned_loss=0.05801, over 4705710.10 frames. ], batch size: 179, lr: 7.33e-03, grad_scale: 16.0 2023-09-29 20:54:03,772 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=486713.3333333333, ans=0.0 2023-09-29 20:54:05,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:54:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 20:54:08,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:54:09,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:54:11,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:54:18,304 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 20:54:18,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:19,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 20:54:19,940 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 20:54:19,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:54:20,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=486780.0, ans=0.2 2023-09-29 20:54:22,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:22,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:54:22,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:54:25,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 20:54:27,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:54:29,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 20:54:29,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 20:54:29,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 20:54:29,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 20:54:37,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.33 vs. limit=15.0 2023-09-29 20:54:41,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:54:43,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:54:49,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 20:54:55,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 20:54:55,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 20:54:56,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:54:58,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:55:05,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 20:55:06,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=486980.0, ans=10.0 2023-09-29 20:55:07,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 20:55:07,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:09,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:55:09,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 20:55:14,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:55:15,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=486980.0, ans=0.0 2023-09-29 20:55:16,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:55:19,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 20:55:25,243 INFO [train.py:1039] (3/4) Epoch 14, batch 4000, loss[loss=0.1805, simple_loss=0.2666, pruned_loss=0.04721, over 24343.00 frames. ], tot_loss[loss=0.1895, simple_loss=0.2621, pruned_loss=0.05844, over 4719956.38 frames. ], batch size: 74, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:55:29,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:33,270 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 20:55:34,257 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.864e+02 2.068e+02 2.328e+02 3.219e+02, threshold=4.135e+02, percent-clipped=0.0 2023-09-29 20:55:34,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=487046.6666666667, ans=0.125 2023-09-29 20:55:39,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:42,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:44,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:55:44,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:55:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 20:55:45,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 20:55:45,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 20:55:45,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:55:45,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 20:55:49,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:55:52,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 20:55:52,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:55:52,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 20:55:53,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:55:53,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 20:55:55,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=487180.0, ans=0.0 2023-09-29 20:55:57,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:55:57,438 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 20:55:59,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:55:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:02,623 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 20:56:04,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 20:56:04,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:04,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=487180.0, ans=0.1 2023-09-29 20:56:10,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 20:56:10,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:56:13,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:56:13,673 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 20:56:15,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 20:56:16,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 20:56:16,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:56:16,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:18,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 20:56:20,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 20:56:20,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 20:56:20,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:56:22,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 20:56:22,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:56:24,162 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 20:56:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 20:56:32,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 20:56:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 20:56:36,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:37,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:56:39,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:56:43,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:56:46,908 INFO [train.py:1039] (3/4) Epoch 14, batch 4050, loss[loss=0.1991, simple_loss=0.2647, pruned_loss=0.06675, over 23770.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2622, pruned_loss=0.05823, over 4712788.64 frames. ], batch size: 179, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:56:46,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 20:56:47,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 20:56:49,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:56:51,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:56:52,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 20:56:54,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:56:54,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:58,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 20:56:59,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=487380.0, ans=0.0 2023-09-29 20:57:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:02,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 20:57:03,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 20:57:03,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:57:05,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=487446.6666666667, ans=0.125 2023-09-29 20:57:09,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:10,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 20:57:12,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 20:57:13,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 20:57:14,016 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 20:57:15,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=487446.6666666667, ans=0.125 2023-09-29 20:57:17,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 20:57:23,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 20:57:27,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:57:30,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:33,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=487513.3333333333, ans=15.0 2023-09-29 20:57:33,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:57:33,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:57:33,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:57:37,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 20:57:40,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 20:57:40,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 20:57:42,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:46,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 20:57:50,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:57:59,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 20:57:59,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:01,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 20:58:02,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 20:58:02,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 20:58:02,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:04,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:06,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:06,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 20:58:10,373 INFO [train.py:1039] (3/4) Epoch 14, batch 4100, loss[loss=0.1997, simple_loss=0.2802, pruned_loss=0.05959, over 24649.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2629, pruned_loss=0.05846, over 4717902.15 frames. ], batch size: 68, lr: 7.33e-03, grad_scale: 32.0 2023-09-29 20:58:12,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=487713.3333333333, ans=10.0 2023-09-29 20:58:14,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 20:58:14,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=487713.3333333333, ans=0.0 2023-09-29 20:58:15,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 20:58:19,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 20:58:20,527 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.678e+02 1.945e+02 2.209e+02 2.502e+02 4.292e+02, threshold=4.417e+02, percent-clipped=1.0 2023-09-29 20:58:20,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 20:58:20,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:20,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:20,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:22,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 20:58:22,420 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 20:58:24,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:25,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 20:58:25,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:58:27,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 20:58:28,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=487780.0, ans=0.125 2023-09-29 20:58:32,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 20:58:33,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:58:33,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 20:58:33,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 20:58:35,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:35,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 20:58:35,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:37,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 20:58:37,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 20:58:38,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:58:40,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 20:58:41,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.36 vs. limit=10.0 2023-09-29 20:58:42,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 20:58:45,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 20:58:45,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 20:58:45,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 20:58:47,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 20:58:47,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 20:58:50,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 20:58:52,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 20:58:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 20:58:55,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 20:58:55,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:58:57,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:58:58,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.76 vs. limit=10.0 2023-09-29 20:59:00,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:00,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=487913.3333333333, ans=0.125 2023-09-29 20:59:05,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:07,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=487913.3333333333, ans=0.125 2023-09-29 20:59:10,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:10,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 20:59:13,303 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.98 vs. limit=15.0 2023-09-29 20:59:15,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=487980.0, ans=0.125 2023-09-29 20:59:21,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:21,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 20:59:25,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 20:59:28,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 20:59:33,285 INFO [train.py:1039] (3/4) Epoch 14, batch 4150, loss[loss=0.1857, simple_loss=0.2684, pruned_loss=0.05146, over 24440.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2627, pruned_loss=0.05842, over 4728822.75 frames. ], batch size: 69, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 20:59:33,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 20:59:34,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 20:59:35,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 20:59:35,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:38,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 20:59:38,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:40,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 20:59:41,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 20:59:41,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 20:59:43,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 20:59:48,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 20:59:48,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 20:59:53,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 20:59:54,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 20:59:56,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 20:59:57,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 20:59:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 20:59:59,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:00:04,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:05,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=488180.0, ans=0.0 2023-09-29 21:00:09,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:09,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 21:00:12,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 21:00:12,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:00:13,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 21:00:13,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:00:14,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:16,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=488180.0, ans=0.2 2023-09-29 21:00:17,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:17,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:23,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 21:00:26,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:29,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:00:29,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 21:00:30,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:00:32,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 21:00:34,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:00:34,658 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=488246.6666666667, ans=0.0 2023-09-29 21:00:37,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:00:37,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:39,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 21:00:39,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:00:39,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:00:41,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:00:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 21:00:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:45,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:00:45,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:00:45,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 21:00:45,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:00:47,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 21:00:48,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:00:49,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=488313.3333333333, ans=0.0 2023-09-29 21:00:50,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:00:50,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 21:00:50,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:00:55,497 INFO [train.py:1039] (3/4) Epoch 14, batch 4200, loss[loss=0.1793, simple_loss=0.2238, pruned_loss=0.06738, over 19574.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2618, pruned_loss=0.05846, over 4728920.69 frames. ], batch size: 389, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:00:55,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:00:59,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 21:01:00,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:01:03,528 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.41 vs. limit=22.5 2023-09-29 21:01:03,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:05,340 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.706e+02 2.014e+02 2.292e+02 2.781e+02 4.764e+02, threshold=4.585e+02, percent-clipped=1.0 2023-09-29 21:01:05,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:01:05,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:05,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:01:09,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 21:01:11,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 21:01:12,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:12,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=488446.6666666667, ans=0.0 2023-09-29 21:01:14,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:19,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:01:22,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:01:22,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:23,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:23,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 21:01:23,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:01:24,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=488446.6666666667, ans=0.1 2023-09-29 21:01:25,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:25,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:01:25,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:01:27,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:01:28,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 21:01:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:01:33,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:01:34,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:01:37,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:01:39,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:01:41,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=488513.3333333333, ans=0.125 2023-09-29 21:01:42,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:01:42,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 21:01:42,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:01:43,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:01:45,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=488580.0, ans=0.2 2023-09-29 21:01:46,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.39 vs. limit=15.0 2023-09-29 21:01:48,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:01:51,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:01:57,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:01:59,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 21:02:02,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:09,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:02:09,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:09,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=488646.6666666667, ans=0.1 2023-09-29 21:02:12,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 21:02:12,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=488646.6666666667, ans=0.0 2023-09-29 21:02:17,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:02:19,030 INFO [train.py:1039] (3/4) Epoch 14, batch 4250, loss[loss=0.1842, simple_loss=0.2698, pruned_loss=0.0493, over 24584.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2609, pruned_loss=0.05838, over 4715617.09 frames. ], batch size: 71, lr: 7.32e-03, grad_scale: 32.0 2023-09-29 21:02:21,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=488713.3333333333, ans=0.1 2023-09-29 21:02:22,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:02:22,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:02:23,682 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.47 vs. limit=15.0 2023-09-29 21:02:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:30,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:02:32,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 21:02:32,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:02:33,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:02:36,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=488780.0, ans=0.125 2023-09-29 21:02:39,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:02:39,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=488780.0, ans=0.0 2023-09-29 21:02:42,181 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:02:44,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:45,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:47,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:02:47,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:02:50,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:51,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:02:53,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:02:55,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:02:57,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:02:58,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 21:03:03,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 21:03:03,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:03,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:03,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:03:06,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:03:06,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:03:08,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:03:11,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:03:15,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:17,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:18,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 21:03:18,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:03:18,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 21:03:20,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:03:22,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:03:23,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:24,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:03:25,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 21:03:27,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:03:28,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:03:32,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:03:33,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=488980.0, ans=0.0 2023-09-29 21:03:35,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:03:36,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:03:36,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=488980.0, ans=0.05 2023-09-29 21:03:39,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:03:40,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:42,508 INFO [train.py:1039] (3/4) Epoch 14, batch 4300, loss[loss=0.1946, simple_loss=0.2816, pruned_loss=0.05382, over 24649.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2614, pruned_loss=0.05834, over 4714742.16 frames. ], batch size: 73, lr: 7.32e-03, grad_scale: 16.0 2023-09-29 21:03:42,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:03:44,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:03:44,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 21:03:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:03:46,837 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.74 vs. limit=8.0 2023-09-29 21:03:50,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:03:50,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:03:53,189 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.403e+02 1.977e+02 2.365e+02 3.006e+02 5.319e+02, threshold=4.729e+02, percent-clipped=1.0 2023-09-29 21:03:57,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:04:04,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:04:04,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 21:04:06,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:04:09,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:04:09,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:04:09,148 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 21:04:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:04:12,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:15,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 21:04:15,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:04:17,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 21:04:20,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:04:22,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:04:22,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:04:23,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:04:25,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:04:27,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:28,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:04:28,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 21:04:28,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 21:04:32,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:04:34,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:04:34,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:36,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:04:36,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 21:04:36,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 21:04:38,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 21:04:38,924 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.18 vs. limit=22.5 2023-09-29 21:04:40,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:04:40,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 21:04:40,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=489246.6666666667, ans=0.025 2023-09-29 21:04:41,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 21:04:45,178 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.27 vs. limit=6.0 2023-09-29 21:04:46,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:47,697 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 21:04:47,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:04:49,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:49,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:04:51,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 21:04:52,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:04:52,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:04:52,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:04:52,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:04:53,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=489313.3333333333, ans=0.125 2023-09-29 21:04:54,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:04:57,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:04:58,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:04:59,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:01,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:05:01,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=489313.3333333333, ans=0.0 2023-09-29 21:05:06,038 INFO [train.py:1039] (3/4) Epoch 14, batch 4350, loss[loss=0.1684, simple_loss=0.2503, pruned_loss=0.04325, over 24311.00 frames. ], tot_loss[loss=0.1899, simple_loss=0.2625, pruned_loss=0.05872, over 4699653.28 frames. ], batch size: 56, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:05:07,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 21:05:07,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:05:13,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:16,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:17,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=489380.0, ans=0.07 2023-09-29 21:05:19,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:05:19,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:05:19,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.64 vs. limit=15.0 2023-09-29 21:05:23,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=489446.6666666667, ans=0.015 2023-09-29 21:05:24,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=489446.6666666667, ans=0.0 2023-09-29 21:05:25,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:05:27,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:05:30,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:05:30,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:05:35,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:05:36,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:05:38,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:05:43,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=489513.3333333333, ans=0.0 2023-09-29 21:05:45,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 21:05:45,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:05:46,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:50,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:05:53,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 21:05:53,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=489580.0, ans=0.125 2023-09-29 21:05:56,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:05:56,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:05:58,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.51 vs. limit=15.0 2023-09-29 21:06:00,157 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.41 vs. limit=15.0 2023-09-29 21:06:01,026 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 21:06:03,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:04,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:06:04,774 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 21:06:06,277 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 21:06:06,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:06,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:07,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:06:07,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:09,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:06:09,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:13,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 21:06:13,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:13,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:14,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 21:06:16,085 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 21:06:16,092 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 21:06:16,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 21:06:18,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:06:20,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:06:20,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:20,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:06:23,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 21:06:26,147 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 21:06:26,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:27,557 INFO [train.py:1039] (3/4) Epoch 14, batch 4400, loss[loss=0.2271, simple_loss=0.2871, pruned_loss=0.08356, over 22763.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2631, pruned_loss=0.05907, over 4699066.09 frames. ], batch size: 322, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:06:29,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:29,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:32,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:06:35,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 21:06:35,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 21:06:37,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 21:06:37,353 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 21:06:37,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:06:37,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:06:38,946 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.905e+02 2.228e+02 2.642e+02 4.473e+02, threshold=4.456e+02, percent-clipped=0.0 2023-09-29 21:06:40,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 21:06:42,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:06:43,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:43,793 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 21:06:47,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:06:47,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 21:06:47,531 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 21:06:51,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 21:06:52,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 21:06:52,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 21:06:52,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:06:54,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:54,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:06:56,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:06:59,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 21:06:59,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 21:06:59,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:02,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:07:02,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:04,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:05,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:07:05,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 21:07:07,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 21:07:10,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:12,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=489846.6666666667, ans=0.1 2023-09-29 21:07:16,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:07:19,175 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.23 vs. limit=15.0 2023-09-29 21:07:19,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 21:07:23,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:07:27,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:28,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:07:29,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=489913.3333333333, ans=0.2 2023-09-29 21:07:30,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 21:07:30,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:07:30,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:07:30,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:07:31,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:07:37,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 21:07:37,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff2.min_abs, batch_count=489980.0, ans=0.1 2023-09-29 21:07:39,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 21:07:40,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 21:07:40,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:07:40,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 21:07:42,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:07:44,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=489980.0, ans=0.1 2023-09-29 21:07:45,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:07:46,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.54 vs. limit=6.0 2023-09-29 21:07:47,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 21:07:49,001 INFO [train.py:1039] (3/4) Epoch 14, batch 4450, loss[loss=0.2703, simple_loss=0.3177, pruned_loss=0.1114, over 19607.00 frames. ], tot_loss[loss=0.1924, simple_loss=0.2648, pruned_loss=0.05999, over 4691972.84 frames. ], batch size: 388, lr: 7.31e-03, grad_scale: 32.0 2023-09-29 21:07:50,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:07:53,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:07:55,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:08:02,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:02,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:08:05,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:07,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:08:10,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:08:12,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:13,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 21:08:13,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:13,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:13,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:08:13,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:08:16,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:08:23,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:23,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:23,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=490180.0, ans=0.0 2023-09-29 21:08:25,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:08:26,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:08:27,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:08:33,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:08:33,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=490180.0, ans=0.0 2023-09-29 21:08:35,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 21:08:35,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 21:08:35,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:08:38,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:40,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 21:08:45,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:08:49,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:49,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 21:08:49,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:08:49,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:08:49,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:08:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:08:49,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=490246.6666666667, ans=0.0 2023-09-29 21:08:51,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.68 vs. limit=12.0 2023-09-29 21:08:52,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:08:56,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:08:57,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 21:08:59,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:08:59,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:02,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:09:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:02,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:09:06,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:09:10,408 INFO [train.py:1039] (3/4) Epoch 14, batch 4500, loss[loss=0.1674, simple_loss=0.2386, pruned_loss=0.04814, over 24285.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2648, pruned_loss=0.0602, over 4688195.73 frames. ], batch size: 56, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:09:10,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 21:09:10,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:09:17,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:17,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 21:09:17,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 21:09:19,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:23,948 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.876e+02 2.126e+02 2.360e+02 4.104e+02, threshold=4.251e+02, percent-clipped=0.0 2023-09-29 21:09:24,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:09:25,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:09:25,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:09:27,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:09:27,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:27,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:09:41,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:09:42,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:09:45,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:09:45,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:09:47,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:09:53,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:09:59,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:10:00,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:10:06,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:10:06,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 21:10:07,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:07,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:11,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:10:11,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=490580.0, ans=0.125 2023-09-29 21:10:14,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:10:14,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 21:10:14,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:10:14,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:19,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:10:20,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:10:22,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:25,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:10:26,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:10:27,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 21:10:29,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 21:10:29,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 21:10:32,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 21:10:34,064 INFO [train.py:1039] (3/4) Epoch 14, batch 4550, loss[loss=0.1858, simple_loss=0.2448, pruned_loss=0.06343, over 23572.00 frames. ], tot_loss[loss=0.1916, simple_loss=0.2629, pruned_loss=0.06018, over 4679286.92 frames. ], batch size: 256, lr: 7.31e-03, grad_scale: 16.0 2023-09-29 21:10:36,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 21:10:36,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:10:39,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:41,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:10:45,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:45,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=490713.3333333333, ans=0.1 2023-09-29 21:10:49,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:10:52,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:10:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:10:52,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:10:52,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:10:54,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=490780.0, ans=0.025 2023-09-29 21:10:55,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:10:57,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:11:01,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:03,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 21:11:04,033 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.67 vs. limit=10.0 2023-09-29 21:11:04,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 21:11:06,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:11:07,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 21:11:09,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 21:11:11,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:13,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 21:11:15,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:11:18,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:18,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:19,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:11:20,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=490846.6666666667, ans=0.125 2023-09-29 21:11:21,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 21:11:26,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:26,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.12 vs. limit=22.5 2023-09-29 21:11:27,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:27,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:11:29,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:31,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 21:11:32,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 21:11:32,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:11:32,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 21:11:36,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 21:11:36,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:11:38,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:11:38,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:11:39,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:39,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:11:42,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:11:42,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 21:11:44,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:11:44,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:11:46,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 21:11:46,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:11:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 21:11:50,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=490980.0, ans=0.125 2023-09-29 21:11:51,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:11:51,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:11:54,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:11:54,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:11:54,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:11:57,358 INFO [train.py:1039] (3/4) Epoch 14, batch 4600, loss[loss=0.1845, simple_loss=0.2527, pruned_loss=0.05818, over 24437.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2615, pruned_loss=0.05964, over 4691227.08 frames. ], batch size: 58, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:11:57,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:11:57,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:12:02,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:03,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:12:07,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:12:07,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:12:08,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:08,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 21:12:11,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:12:12,507 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.889e+02 2.188e+02 2.520e+02 3.712e+02, threshold=4.377e+02, percent-clipped=0.0 2023-09-29 21:12:15,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:12:15,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:17,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:19,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=491113.3333333333, ans=0.125 2023-09-29 21:12:19,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=491113.3333333333, ans=0.125 2023-09-29 21:12:23,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=491113.3333333333, ans=0.125 2023-09-29 21:12:27,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 21:12:28,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:31,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:33,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=491180.0, ans=0.125 2023-09-29 21:12:34,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:12:34,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:12:38,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 21:12:38,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:12:39,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:12:44,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:12:44,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:12:46,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:12:50,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 21:12:52,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:12:57,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:12:58,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:01,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 21:13:01,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:02,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 21:13:02,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:02,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:05,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:05,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:13:07,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:07,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 21:13:07,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 21:13:09,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 21:13:09,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:09,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:11,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:11,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:13:20,985 INFO [train.py:1039] (3/4) Epoch 14, batch 4650, loss[loss=0.1981, simple_loss=0.2619, pruned_loss=0.06719, over 23773.00 frames. ], tot_loss[loss=0.1903, simple_loss=0.2623, pruned_loss=0.05922, over 4715651.37 frames. ], batch size: 179, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:13:24,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:13:27,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:28,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:28,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:13:28,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:13:30,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:13:30,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:13:34,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 21:13:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:13:40,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 21:13:42,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:13:42,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 21:13:42,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:13:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 21:13:44,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 21:13:44,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:13:44,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:13:44,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=491446.6666666667, ans=0.1 2023-09-29 21:13:48,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:13:49,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:49,598 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 21:13:53,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:13:56,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 21:13:59,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:00,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:14:00,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 21:14:02,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:14:04,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=491513.3333333333, ans=0.0 2023-09-29 21:14:06,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:14:09,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:14,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:17,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:17,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:14:19,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:14:20,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 21:14:22,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 21:14:23,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 21:14:23,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 21:14:24,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:31,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:14:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:14:31,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 21:14:32,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:34,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:34,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:14:36,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:14:37,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:14:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:14:39,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:14:43,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:44,419 INFO [train.py:1039] (3/4) Epoch 14, batch 4700, loss[loss=0.1669, simple_loss=0.2401, pruned_loss=0.04682, over 24293.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2628, pruned_loss=0.05965, over 4711104.24 frames. ], batch size: 56, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:14:45,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:14:45,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:14:45,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:14:46,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:14:48,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 21:14:56,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:14:58,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:14:59,753 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.446e+02 1.978e+02 2.336e+02 2.752e+02 4.215e+02, threshold=4.671e+02, percent-clipped=0.0 2023-09-29 21:14:59,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:00,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:15:03,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=491780.0, ans=0.125 2023-09-29 21:15:05,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=491780.0, ans=0.1 2023-09-29 21:15:05,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=491780.0, ans=0.0 2023-09-29 21:15:08,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 21:15:08,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 21:15:09,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:11,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:15:11,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:15:11,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=491780.0, ans=0.125 2023-09-29 21:15:14,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:21,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:15:23,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 21:15:25,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:15:30,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=491846.6666666667, ans=0.0 2023-09-29 21:15:31,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 21:15:32,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:15:36,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:38,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 21:15:40,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:15:43,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:15:45,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 21:15:46,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:46,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:51,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:15:51,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:15:51,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 21:15:54,080 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 21:15:55,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:15:55,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:15:55,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 21:15:59,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:16:02,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 21:16:05,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:16:08,498 INFO [train.py:1039] (3/4) Epoch 14, batch 4750, loss[loss=0.1594, simple_loss=0.2408, pruned_loss=0.039, over 24644.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.2633, pruned_loss=0.05957, over 4709601.74 frames. ], batch size: 60, lr: 7.30e-03, grad_scale: 8.0 2023-09-29 21:16:08,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:13,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:16:15,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 21:16:15,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:18,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 21:16:20,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:16:21,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:16:22,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:28,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 21:16:32,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:16:35,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 21:16:35,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:16:38,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:38,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:16:40,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:16:42,283 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 21:16:42,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 21:16:42,610 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:16:42,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=492180.0, ans=0.1 2023-09-29 21:16:44,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=492180.0, ans=0.125 2023-09-29 21:16:48,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 21:16:50,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:16:50,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=492180.0, ans=0.125 2023-09-29 21:16:52,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:16:56,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:16:56,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 21:16:56,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:16:58,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:17:01,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:17:04,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 21:17:04,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 21:17:04,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:17:04,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:17:04,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:06,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:17:07,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=492246.6666666667, ans=0.125 2023-09-29 21:17:08,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 21:17:10,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 21:17:12,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:13,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=492313.3333333333, ans=0.0 2023-09-29 21:17:16,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:17:16,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 21:17:18,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:19,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:17:21,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:17:23,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:23,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:17:26,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:26,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 21:17:28,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 21:17:28,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=492313.3333333333, ans=0.0 2023-09-29 21:17:29,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 21:17:30,934 INFO [train.py:1039] (3/4) Epoch 14, batch 4800, loss[loss=0.1578, simple_loss=0.2324, pruned_loss=0.04163, over 24481.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2633, pruned_loss=0.05923, over 4726166.90 frames. ], batch size: 58, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:17:33,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:17:34,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:17:36,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 21:17:40,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:17:45,640 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.621e+02 1.984e+02 2.307e+02 2.840e+02 4.511e+02, threshold=4.614e+02, percent-clipped=0.0 2023-09-29 21:17:47,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:17:48,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:17:50,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:17:50,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 21:17:50,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:17:51,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:17:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:17:58,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:00,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:00,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:18:02,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 21:18:02,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:03,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:04,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=492513.3333333333, ans=0.125 2023-09-29 21:18:06,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:10,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:18:11,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:18:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:18:14,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:16,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 21:18:16,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 21:18:18,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:19,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:18:19,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:18:19,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:19,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:18:21,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:18:22,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:18:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:30,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:31,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:37,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 21:18:37,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=492646.6666666667, ans=0.1 2023-09-29 21:18:38,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:38,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:38,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:18:38,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:18:43,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:18:45,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:18:45,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:46,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:18:46,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:18:47,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:18:51,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:18:51,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:51,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:18:53,059 INFO [train.py:1039] (3/4) Epoch 14, batch 4850, loss[loss=0.1732, simple_loss=0.261, pruned_loss=0.04272, over 24466.00 frames. ], tot_loss[loss=0.1906, simple_loss=0.2633, pruned_loss=0.05897, over 4734223.45 frames. ], batch size: 66, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:18:53,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 21:18:56,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 21:18:56,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:18:56,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:18:56,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:18:59,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_na.min_abs, batch_count=492713.3333333333, ans=0.02 2023-09-29 21:19:00,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:19:07,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 21:19:08,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=492713.3333333333, ans=0.0 2023-09-29 21:19:10,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:11,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.67 vs. limit=12.0 2023-09-29 21:19:13,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:14,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=492780.0, ans=0.125 2023-09-29 21:19:15,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:19:15,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:19,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:19:19,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=492780.0, ans=0.0 2023-09-29 21:19:20,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:19:22,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:19:22,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 21:19:26,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:19:28,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:19:28,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:19:30,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:19:30,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 21:19:31,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.52 vs. limit=22.5 2023-09-29 21:19:33,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:19:33,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:38,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 21:19:40,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 21:19:40,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:19:40,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=492846.6666666667, ans=0.125 2023-09-29 21:19:47,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:19:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 21:19:50,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:19:50,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:19:53,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:19:55,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 21:19:55,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:19:55,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 21:19:55,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:19:57,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:19:58,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 21:20:07,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:10,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=492980.0, ans=0.1 2023-09-29 21:20:10,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=492980.0, ans=0.0 2023-09-29 21:20:13,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:20:13,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:17,050 INFO [train.py:1039] (3/4) Epoch 14, batch 4900, loss[loss=0.1662, simple_loss=0.2357, pruned_loss=0.04842, over 24458.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2625, pruned_loss=0.05889, over 4727216.81 frames. ], batch size: 58, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:20:18,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 21:20:18,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:20:23,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:25,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:20:30,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 21:20:31,620 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.872e+02 2.087e+02 2.309e+02 3.318e+02, threshold=4.174e+02, percent-clipped=0.0 2023-09-29 21:20:33,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 21:20:37,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 21:20:38,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 21:20:40,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:40,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:20:40,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:20:40,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:40,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:20:41,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 21:20:48,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 21:20:48,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:20:48,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:20:50,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:20:52,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:20:53,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:20:55,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:20:55,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 21:20:56,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:20:57,510 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.41 vs. limit=15.0 2023-09-29 21:20:58,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:20:58,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 21:20:58,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 21:21:04,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 21:21:06,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:21:07,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:07,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:21:08,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:08,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:21:09,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:21:09,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 21:21:11,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:12,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:21:15,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:21:20,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 21:21:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:21:21,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:21:23,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 21:21:28,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:31,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:21:31,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=493313.3333333333, ans=0.2 2023-09-29 21:21:32,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 21:21:33,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:33,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:21:36,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:36,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=493313.3333333333, ans=0.0 2023-09-29 21:21:39,477 INFO [train.py:1039] (3/4) Epoch 14, batch 4950, loss[loss=0.1904, simple_loss=0.2798, pruned_loss=0.0505, over 24340.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2618, pruned_loss=0.05812, over 4734970.86 frames. ], batch size: 74, lr: 7.29e-03, grad_scale: 16.0 2023-09-29 21:21:39,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:21:39,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:21:39,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:21:39,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 21:21:40,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=493380.0, ans=0.125 2023-09-29 21:21:42,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:21:44,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:21:45,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 21:21:49,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 21:21:49,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 21:21:49,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:21:49,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 21:21:49,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:21:49,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:21:51,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:21:51,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:21:54,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:21:55,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:21:55,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:21:57,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:22:00,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:00,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:22:02,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=493446.6666666667, ans=0.0 2023-09-29 21:22:05,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:22:10,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:10,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:22:12,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:13,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:15,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:22:16,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 21:22:16,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 21:22:19,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:22,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:22:22,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:22:23,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:22:23,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:22:25,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:22:26,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:27,716 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.40 vs. limit=10.0 2023-09-29 21:22:29,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:22:31,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=493580.0, ans=0.125 2023-09-29 21:22:32,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:22:34,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:22:34,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:34,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=493580.0, ans=0.0 2023-09-29 21:22:36,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 21:22:36,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:22:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:22:41,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:22:43,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:22:43,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:22:45,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:22:45,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:22:45,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=493646.6666666667, ans=0.0 2023-09-29 21:22:46,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:22:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:22:49,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:22:49,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:22:51,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 21:22:54,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:01,126 INFO [train.py:1039] (3/4) Epoch 14, batch 5000, loss[loss=0.2052, simple_loss=0.2746, pruned_loss=0.06796, over 23236.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2609, pruned_loss=0.05786, over 4725050.62 frames. ], batch size: 93, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:23:01,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 21:23:01,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:23:06,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:06,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:09,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 21:23:09,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 21:23:11,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:23:14,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 21:23:14,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:23:14,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:23:16,182 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 1.874e+02 2.097e+02 2.409e+02 3.545e+02, threshold=4.194e+02, percent-clipped=0.0 2023-09-29 21:23:16,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 21:23:16,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:17,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:19,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 21:23:19,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:19,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:22,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 21:23:22,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 21:23:22,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:23:23,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 21:23:23,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:23:24,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:25,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:23:25,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 21:23:25,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 21:23:27,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=493780.0, ans=0.1 2023-09-29 21:23:29,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 21:23:29,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:30,008 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.91 vs. limit=15.0 2023-09-29 21:23:30,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:30,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 21:23:30,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:23:33,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:34,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.07 vs. limit=15.0 2023-09-29 21:23:35,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:23:35,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:23:35,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 21:23:35,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:23:36,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.34 vs. limit=15.0 2023-09-29 21:23:38,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:23:40,847 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 21:23:46,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:23:46,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:23:46,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:23:49,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=493913.3333333333, ans=0.125 2023-09-29 21:23:51,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 21:23:51,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:23:51,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:23:51,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:23:54,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 21:23:54,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:23:57,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:04,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 21:24:05,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=493913.3333333333, ans=0.125 2023-09-29 21:24:09,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:18,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:24:20,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:20,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:24:22,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:22,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:24:22,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:24:22,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:24,190 INFO [train.py:1039] (3/4) Epoch 14, batch 5050, loss[loss=0.189, simple_loss=0.2589, pruned_loss=0.05951, over 23414.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2622, pruned_loss=0.05872, over 4727802.10 frames. ], batch size: 106, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:24:24,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.max_abs, batch_count=494046.6666666667, ans=10.0 2023-09-29 21:24:28,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:24:28,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 21:24:28,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=494046.6666666667, ans=0.125 2023-09-29 21:24:30,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=494046.6666666667, ans=0.0 2023-09-29 21:24:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:24:32,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.31 vs. limit=15.0 2023-09-29 21:24:34,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:24:35,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:24:36,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 21:24:38,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:38,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:24:40,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:24:41,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=494113.3333333333, ans=0.0 2023-09-29 21:24:42,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:24:42,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:24:45,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=494113.3333333333, ans=0.2 2023-09-29 21:24:51,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 21:24:51,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:24:53,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:24:53,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 21:24:55,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:24:56,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:24:58,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:24:58,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:24:58,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 21:25:00,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 21:25:00,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=494180.0, ans=0.125 2023-09-29 21:25:02,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:03,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:06,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:25:08,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 21:25:09,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:13,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 21:25:13,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:25:14,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:25:14,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:16,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:25:18,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:25:20,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:25:21,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:21,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:25:21,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:25:23,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 21:25:24,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:25:26,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:25:28,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=494313.3333333333, ans=0.125 2023-09-29 21:25:28,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.75 vs. limit=15.0 2023-09-29 21:25:31,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:25:31,412 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 21:25:31,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:25:33,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:25:33,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:33,672 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 21:25:36,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:36,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 21:25:36,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:41,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:25:41,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:25:42,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 21:25:43,793 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.43 vs. limit=15.0 2023-09-29 21:25:44,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 21:25:46,493 INFO [train.py:1039] (3/4) Epoch 14, batch 5100, loss[loss=0.2133, simple_loss=0.2751, pruned_loss=0.07572, over 22733.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2629, pruned_loss=0.05864, over 4729081.51 frames. ], batch size: 322, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:25:48,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:25:48,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:25:48,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:25:51,270 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 21:25:54,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:25:56,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=494380.0, ans=0.125 2023-09-29 21:25:57,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 21:25:59,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 21:25:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:00,834 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.801e+02 1.996e+02 2.340e+02 4.098e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-29 21:26:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:26:04,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:26:04,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 21:26:04,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 21:26:11,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:26:11,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:26:11,996 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.87 vs. limit=15.0 2023-09-29 21:26:15,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:26:16,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=494446.6666666667, ans=0.125 2023-09-29 21:26:18,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 21:26:18,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:21,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:26:22,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 21:26:25,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:25,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 21:26:28,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 21:26:28,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:28,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 21:26:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 21:26:32,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:26:40,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:26:40,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=494580.0, ans=0.0 2023-09-29 21:26:42,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 21:26:42,517 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 21:26:42,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 21:26:45,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 21:26:45,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:26:47,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 21:26:51,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 21:26:53,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 21:26:55,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:26:58,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 21:26:58,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:27:00,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 21:27:05,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:27:05,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:27:05,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:27:05,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:27:05,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:27:07,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:27:08,866 INFO [train.py:1039] (3/4) Epoch 14, batch 5150, loss[loss=0.1924, simple_loss=0.2617, pruned_loss=0.0616, over 23398.00 frames. ], tot_loss[loss=0.1907, simple_loss=0.2635, pruned_loss=0.05898, over 4726662.16 frames. ], batch size: 119, lr: 7.28e-03, grad_scale: 16.0 2023-09-29 21:27:08,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 21:27:08,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 21:27:10,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 21:27:10,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:27:10,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 21:27:11,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:11,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:27:12,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=494713.3333333333, ans=0.05 2023-09-29 21:27:15,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:15,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:27:20,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:27:20,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=494713.3333333333, ans=0.0 2023-09-29 21:27:21,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 21:27:22,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:23,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:27:25,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:27:25,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:25,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:27:26,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:27:26,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 21:27:30,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:27:30,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:27:32,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:27:34,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 21:27:34,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=494780.0, ans=0.1 2023-09-29 21:27:35,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:27:42,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:27:43,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 21:27:48,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:27:49,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=494846.6666666667, ans=0.125 2023-09-29 21:27:53,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:27:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:27:57,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=494913.3333333333, ans=0.125 2023-09-29 21:28:00,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:00,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:05,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 21:28:10,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:28:11,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:28:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:28:13,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=494980.0, ans=0.0 2023-09-29 21:28:14,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:16,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:28:18,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 21:28:19,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=494980.0, ans=0.1 2023-09-29 21:28:22,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:28:24,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:28:24,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=494980.0, ans=0.0 2023-09-29 21:28:28,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:28:28,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:28:29,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:28:29,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:28:29,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:28:29,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:28:31,109 INFO [train.py:1039] (3/4) Epoch 14, batch 5200, loss[loss=0.1811, simple_loss=0.237, pruned_loss=0.06261, over 23404.00 frames. ], tot_loss[loss=0.1915, simple_loss=0.2642, pruned_loss=0.05937, over 4724538.91 frames. ], batch size: 285, lr: 7.27e-03, grad_scale: 32.0 2023-09-29 21:28:32,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:28:33,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:28:36,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:40,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 21:28:41,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:28:43,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:45,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:28:45,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:28:45,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:28:48,047 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.902e+02 2.088e+02 2.453e+02 3.691e+02, threshold=4.175e+02, percent-clipped=0.0 2023-09-29 21:28:48,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 21:28:49,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:28:51,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:28:54,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 21:28:57,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:28:59,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:29:01,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 21:29:01,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 21:29:04,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 21:29:04,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:04,519 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 21:29:04,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:29:07,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:07,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:29:09,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 21:29:10,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:29:12,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:16,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 21:29:16,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 21:29:16,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 21:29:20,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=495246.6666666667, ans=0.125 2023-09-29 21:29:21,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 21:29:22,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:29:27,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:29:27,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:29,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 21:29:30,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:29:30,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:29:30,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:29:34,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=495246.6666666667, ans=0.125 2023-09-29 21:29:35,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:36,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495313.3333333333, ans=0.1 2023-09-29 21:29:39,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:29:42,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=495313.3333333333, ans=0.125 2023-09-29 21:29:43,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:29:44,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:29:44,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:45,924 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:29:47,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=495313.3333333333, ans=0.0 2023-09-29 21:29:50,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:29:52,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 21:29:52,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:29:52,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:29:53,790 INFO [train.py:1039] (3/4) Epoch 14, batch 5250, loss[loss=0.1679, simple_loss=0.2454, pruned_loss=0.04521, over 24285.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2632, pruned_loss=0.05879, over 4723741.10 frames. ], batch size: 61, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:29:54,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:29:54,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:29:56,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=495380.0, ans=0.0 2023-09-29 21:29:57,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:29:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:30:00,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:01,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:30:02,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:30:03,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.90 vs. limit=22.5 2023-09-29 21:30:07,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:30:10,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:30:12,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:30:15,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:30:17,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 21:30:17,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:30:17,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:30:26,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=495513.3333333333, ans=0.125 2023-09-29 21:30:26,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=495513.3333333333, ans=0.125 2023-09-29 21:30:31,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=495513.3333333333, ans=0.125 2023-09-29 21:30:38,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=495513.3333333333, ans=0.2 2023-09-29 21:30:50,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.77 vs. limit=12.0 2023-09-29 21:31:08,525 INFO [train.py:1039] (3/4) Epoch 14, batch 5300, loss[loss=0.1964, simple_loss=0.2581, pruned_loss=0.06734, over 23769.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2611, pruned_loss=0.05855, over 4710512.51 frames. ], batch size: 212, lr: 7.27e-03, grad_scale: 16.0 2023-09-29 21:31:22,581 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 1.904e+02 2.089e+02 2.457e+02 4.761e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-29 21:31:25,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:31:25,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 21:31:25,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 21:31:25,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:26,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:26,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:26,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:31:26,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:26,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:31:27,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:31:27,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 21:31:27,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 21:31:27,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 21:31:27,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:31:27,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 21:31:27,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 21:31:27,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:28,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:28,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:29,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:31:29,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:29,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:31:29,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:29,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:31:29,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:31:29,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:31:29,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:29,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:31:30,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 21:31:30,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:31:31,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:31:31,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 21:31:31,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 21:31:31,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:31:31,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:31:31,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 21:31:31,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 21:31:31,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:33,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:31:33,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:31:33,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 21:31:33,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 21:31:33,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:31:33,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:31:33,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 21:31:33,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 21:31:34,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 21:31:34,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:31:43,606 INFO [train.py:1039] (3/4) Epoch 15, batch 0, loss[loss=0.1887, simple_loss=0.2651, pruned_loss=0.0562, over 24476.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2651, pruned_loss=0.0562, over 24476.00 frames. ], batch size: 63, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:31:43,607 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 21:31:58,981 INFO [train.py:1071] (3/4) Epoch 15, validation: loss=0.2846, simple_loss=0.2783, pruned_loss=0.1455, over 1125622.00 frames. 2023-09-29 21:31:58,982 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 21:32:02,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 21:32:06,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:32:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:32:11,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:11,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:32:11,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:11,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495800.0, ans=0.1 2023-09-29 21:32:12,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 21:32:14,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 21:32:17,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:18,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:22,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:32:23,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:23,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:32:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:25,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 21:32:25,329 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:32:26,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:32:32,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=495933.3333333333, ans=0.04949747468305833 2023-09-29 21:32:36,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:32:36,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:32:39,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 21:32:39,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=495933.3333333333, ans=0.1 2023-09-29 21:32:44,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:32:44,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:32:45,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:50,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:32:53,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:32:58,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 21:32:59,533 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=7.36 vs. limit=15.0 2023-09-29 21:33:00,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=496000.0, ans=0.0 2023-09-29 21:33:02,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 21:33:03,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:03,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:04,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:33:05,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:06,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 21:33:11,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:11,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:33:16,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:33:18,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-29 21:33:18,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=496066.6666666667, ans=0.1 2023-09-29 21:33:21,693 INFO [train.py:1039] (3/4) Epoch 15, batch 50, loss[loss=0.1692, simple_loss=0.2409, pruned_loss=0.0488, over 20024.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2639, pruned_loss=0.05624, over 1074396.83 frames. ], batch size: 43, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:33:21,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:33:23,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=496133.3333333333, ans=0.0 2023-09-29 21:33:24,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:26,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:33:26,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-29 21:33:27,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:33:27,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:33:29,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:32,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:33:33,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:33:37,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-29 21:33:37,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:42,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:33:44,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-29 21:33:46,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-29 21:33:48,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:33:49,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:33:49,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:50,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:33:52,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:33:52,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:33:52,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:33:59,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:00,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:00,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:34:02,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-29 21:34:04,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:34:05,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:34:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-29 21:34:07,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:10,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-29 21:34:17,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:34:19,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:34:19,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:21,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:21,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:21,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=496333.3333333333, ans=0.0 2023-09-29 21:34:25,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-29 21:34:25,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-29 21:34:28,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:28,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:34:30,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=496400.0, ans=0.125 2023-09-29 21:34:31,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:34:31,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:34:31,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=496400.0, ans=0.125 2023-09-29 21:34:32,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-29 21:34:32,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-29 21:34:34,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-29 21:34:35,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:35,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:34:37,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-29 21:34:37,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-29 21:34:37,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:34:37,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:39,136 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.715e+02 2.074e+02 2.565e+02 3.305e+02 5.603e+02, threshold=5.131e+02, percent-clipped=8.0 2023-09-29 21:34:40,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:34:40,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:34:42,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:34:44,230 INFO [train.py:1039] (3/4) Epoch 15, batch 100, loss[loss=0.2017, simple_loss=0.2655, pruned_loss=0.06888, over 23697.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2642, pruned_loss=0.05708, over 1901008.21 frames. ], batch size: 232, lr: 7.02e-03, grad_scale: 32.0 2023-09-29 21:34:45,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:34:47,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=496466.6666666667, ans=0.2 2023-09-29 21:34:50,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:34:53,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-29 21:34:54,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:34:57,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:34:57,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:57,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:34:57,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:34:57,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:34:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-29 21:35:03,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:35:03,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:03,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:03,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:35:07,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-29 21:35:10,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:11,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:12,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:35:13,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=496533.3333333333, ans=0.0 2023-09-29 21:35:14,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:35:14,487 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=496533.3333333333, ans=0.0 2023-09-29 21:35:18,014 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-29 21:35:19,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-29 21:35:20,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:35:20,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:35:25,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:35:27,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:35:27,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:33,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:34,026 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-29 21:35:37,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:35:42,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:35:44,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:35:45,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:48,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:52,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:35:52,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:35:54,780 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.24 vs. limit=15.0 2023-09-29 21:35:55,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:35:57,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:35:57,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:35:57,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:35:58,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-29 21:36:00,214 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-29 21:36:00,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:00,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:36:02,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:02,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:02,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 21:36:02,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:36:03,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:36:03,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:05,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:06,712 INFO [train.py:1039] (3/4) Epoch 15, batch 150, loss[loss=0.1801, simple_loss=0.2555, pruned_loss=0.05238, over 24486.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2639, pruned_loss=0.0564, over 2547032.46 frames. ], batch size: 66, lr: 7.01e-03, grad_scale: 32.0 2023-09-29 21:36:06,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:08,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:36:08,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:36:11,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:14,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:36:14,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:15,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:18,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:18,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:21,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:36:23,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:24,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=496866.6666666667, ans=0.125 2023-09-29 21:36:29,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-29 21:36:29,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-29 21:36:29,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-29 21:36:30,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.97 vs. limit=22.5 2023-09-29 21:36:32,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:36:32,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:36:32,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:36:34,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:36:34,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:34,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:34,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:36:36,124 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-29 21:36:39,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:36:43,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:46,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:36:48,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-29 21:36:52,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:36:52,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:36:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:36:54,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:36:54,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:36:56,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:36:57,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:36:57,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-29 21:37:01,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:02,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=497000.0, ans=0.0 2023-09-29 21:37:04,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:04,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:37:04,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:37:08,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:08,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=497000.0, ans=0.2 2023-09-29 21:37:09,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 21:37:12,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:37:14,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:37:15,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:16,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=497066.6666666667, ans=0.1 2023-09-29 21:37:19,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:37:19,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-29 21:37:19,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:37:19,389 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-29 21:37:23,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:26,178 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.806e+02 2.055e+02 2.590e+02 4.271e+02, threshold=4.110e+02, percent-clipped=0.0 2023-09-29 21:37:26,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:37:26,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:37:29,048 INFO [train.py:1039] (3/4) Epoch 15, batch 200, loss[loss=0.2605, simple_loss=0.3167, pruned_loss=0.1022, over 19558.00 frames. ], tot_loss[loss=0.1926, simple_loss=0.2669, pruned_loss=0.05908, over 3010604.70 frames. ], batch size: 389, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:37:29,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-29 21:37:30,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:37:30,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:34,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-29 21:37:36,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-29 21:37:39,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:37:39,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:37:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:37:44,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:37:45,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:38:10,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:38:10,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:38:10,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:38:12,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 21:38:12,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:38:15,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:16,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:38:17,126 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=497333.3333333333, ans=0.125 2023-09-29 21:38:18,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:19,451 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.84 vs. limit=6.0 2023-09-29 21:38:20,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:21,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-29 21:38:21,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 21:38:21,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:25,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:38:30,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:38:37,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=497400.0, ans=0.125 2023-09-29 21:38:39,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:39,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:38:47,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:48,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-29 21:38:50,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:38:50,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:38:50,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:38:51,892 INFO [train.py:1039] (3/4) Epoch 15, batch 250, loss[loss=0.1644, simple_loss=0.2363, pruned_loss=0.04624, over 24340.00 frames. ], tot_loss[loss=0.1912, simple_loss=0.265, pruned_loss=0.05873, over 3391261.36 frames. ], batch size: 56, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:38:51,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:38:53,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-29 21:38:54,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:38:54,950 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-29 21:38:56,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:38:58,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:39:02,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:02,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:39:03,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:39:03,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:39:05,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:39:09,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:39:10,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=497533.3333333333, ans=15.0 2023-09-29 21:39:20,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:24,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:39:25,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:39:31,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:39:31,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:39:33,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:39:33,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=497600.0, ans=0.125 2023-09-29 21:39:35,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:35,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:39:35,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:39:35,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:39:38,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:39:42,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-29 21:39:43,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:39:44,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:39:45,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:39:45,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:39:45,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:39:47,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:39:47,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:39:48,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:39:50,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:39:50,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:39:56,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:40:00,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.71 vs. limit=15.0 2023-09-29 21:40:00,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:02,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:40:06,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:08,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:40:12,380 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.473e+02 1.826e+02 2.078e+02 2.374e+02 4.039e+02, threshold=4.156e+02, percent-clipped=0.0 2023-09-29 21:40:13,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-29 21:40:13,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=497733.3333333333, ans=0.125 2023-09-29 21:40:14,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:40:14,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 21:40:16,024 INFO [train.py:1039] (3/4) Epoch 15, batch 300, loss[loss=0.189, simple_loss=0.2653, pruned_loss=0.05637, over 23545.00 frames. ], tot_loss[loss=0.1898, simple_loss=0.2623, pruned_loss=0.05868, over 3665106.57 frames. ], batch size: 106, lr: 7.01e-03, grad_scale: 16.0 2023-09-29 21:40:17,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-29 21:40:17,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:40:19,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:40:19,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-29 21:40:19,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=497800.0, ans=0.125 2023-09-29 21:40:23,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:40:25,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:40:31,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:40:31,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-29 21:40:31,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:40:32,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 21:40:34,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-29 21:40:34,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:34,680 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:40:37,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:40:42,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:40:44,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-29 21:40:45,773 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=497866.6666666667, ans=0.0 2023-09-29 21:40:48,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-29 21:40:49,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:40:54,370 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.55 vs. limit=15.0 2023-09-29 21:40:55,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:40:55,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-29 21:40:55,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:40:55,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:40:57,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=497933.3333333333, ans=0.125 2023-09-29 21:40:58,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:40:58,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:05,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-29 21:41:05,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-29 21:41:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:41:08,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:10,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-29 21:41:11,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:41:17,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:41:17,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-29 21:41:22,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:22,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:41:23,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:25,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:41:26,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-29 21:41:26,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:41:28,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:41:29,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-29 21:41:32,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:41:32,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:34,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:34,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:35,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:37,683 INFO [train.py:1039] (3/4) Epoch 15, batch 350, loss[loss=0.1573, simple_loss=0.2404, pruned_loss=0.03704, over 22733.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2604, pruned_loss=0.05752, over 3908399.69 frames. ], batch size: 49, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:41:40,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:40,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 21:41:44,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:49,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:41:52,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:41:54,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:41:54,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=498200.0, ans=0.0 2023-09-29 21:41:57,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-29 21:41:59,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:41:59,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-29 21:42:01,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:02,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-29 21:42:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:06,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-29 21:42:08,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:42:10,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:42:10,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:42:12,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:13,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:13,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:13,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:42:15,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:42:17,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:24,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:42:24,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:42:25,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:42:27,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:31,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-29 21:42:31,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:42:37,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:42:37,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:37,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:42:37,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=498333.3333333333, ans=0.2 2023-09-29 21:42:38,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498333.3333333333, ans=0.1 2023-09-29 21:42:39,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.02 vs. limit=15.0 2023-09-29 21:42:40,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-29 21:42:40,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:42,054 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-29 21:42:43,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-29 21:42:43,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:44,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=498400.0, ans=0.1 2023-09-29 21:42:45,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:42:45,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-29 21:42:49,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:49,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=498400.0, ans=0.0 2023-09-29 21:42:50,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 21:42:51,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=498400.0, ans=0.1 2023-09-29 21:42:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:42:54,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:42:54,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:54,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=498400.0, ans=0.04949747468305833 2023-09-29 21:42:56,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:42:57,506 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.856e+02 2.198e+02 2.696e+02 4.798e+02, threshold=4.395e+02, percent-clipped=2.0 2023-09-29 21:42:59,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:43:00,682 INFO [train.py:1039] (3/4) Epoch 15, batch 400, loss[loss=0.1866, simple_loss=0.2552, pruned_loss=0.05897, over 23390.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2597, pruned_loss=0.05704, over 4086734.66 frames. ], batch size: 120, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:43:00,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:43:02,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-29 21:43:02,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:04,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:05,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:43:07,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:10,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:11,180 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.86 vs. limit=22.5 2023-09-29 21:43:12,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:13,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-29 21:43:13,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-29 21:43:13,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:15,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-29 21:43:15,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:19,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=498533.3333333333, ans=0.125 2023-09-29 21:43:20,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:43:20,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:20,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-29 21:43:20,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:43:22,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:43:22,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:43:22,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:43:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-29 21:43:27,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-29 21:43:27,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=498533.3333333333, ans=0.0 2023-09-29 21:43:32,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:43:33,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:43:33,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=498600.0, ans=0.125 2023-09-29 21:43:35,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-29 21:43:36,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-29 21:43:39,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:43:42,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:43:48,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-29 21:43:52,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:43:54,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-29 21:43:56,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.84 vs. limit=6.0 2023-09-29 21:43:58,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:44:01,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:44:02,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-29 21:44:04,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:44:07,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 21:44:08,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:44:13,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:13,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-29 21:44:13,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=498733.3333333333, ans=0.2 2023-09-29 21:44:15,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-29 21:44:16,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-29 21:44:18,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:44:18,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:44:20,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-29 21:44:22,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:44:23,364 INFO [train.py:1039] (3/4) Epoch 15, batch 450, loss[loss=0.2104, simple_loss=0.2731, pruned_loss=0.07386, over 23876.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2606, pruned_loss=0.05678, over 4238485.40 frames. ], batch size: 164, lr: 7.00e-03, grad_scale: 32.0 2023-09-29 21:44:23,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:44:23,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:44:25,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-29 21:44:25,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:44:26,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:44:28,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:44:28,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-29 21:44:30,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:44:32,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 21:44:33,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:44:43,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:45,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:44:45,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-29 21:44:46,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-29 21:44:48,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=498866.6666666667, ans=0.2 2023-09-29 21:44:53,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:44:54,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:44:57,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:00,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:00,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:45:01,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=498933.3333333333, ans=0.125 2023-09-29 21:45:03,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-29 21:45:05,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-29 21:45:07,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-29 21:45:07,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:09,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:09,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:45:11,112 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-29 21:45:11,126 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-29 21:45:11,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:45:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:45:13,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:45:15,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.71 vs. limit=15.0 2023-09-29 21:45:18,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:45:18,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-29 21:45:19,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:45:19,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-29 21:45:20,424 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.56 vs. limit=15.0 2023-09-29 21:45:22,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:24,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:45:24,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:45:26,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-29 21:45:29,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:45:31,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-29 21:45:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-29 21:45:32,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 21:45:36,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=499066.6666666667, ans=0.04949747468305833 2023-09-29 21:45:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:45:40,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:45:42,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:45:42,983 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-29 21:45:44,924 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.892e+02 2.180e+02 2.454e+02 3.588e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-29 21:45:46,457 INFO [train.py:1039] (3/4) Epoch 15, batch 500, loss[loss=0.1987, simple_loss=0.2723, pruned_loss=0.06251, over 23275.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2625, pruned_loss=0.05837, over 4336481.13 frames. ], batch size: 105, lr: 7.00e-03, grad_scale: 16.0 2023-09-29 21:45:48,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:45:49,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:45:51,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:51,140 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-29 21:45:52,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-29 21:45:52,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:45:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:45:59,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 21:46:01,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-29 21:46:02,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:46:02,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:46:04,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:16,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:18,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:46:18,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-29 21:46:20,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:20,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-29 21:46:20,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 21:46:24,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:46:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:46:25,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:46:25,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:46:27,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-29 21:46:30,129 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-29 21:46:31,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:46:33,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:34,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:37,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:46:37,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=499333.3333333333, ans=0.125 2023-09-29 21:46:38,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-29 21:46:41,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:46:41,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:46:46,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:46:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:46:56,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:01,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-29 21:47:01,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:01,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:47:04,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-29 21:47:04,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:47:06,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:07,621 INFO [train.py:1039] (3/4) Epoch 15, batch 550, loss[loss=0.1679, simple_loss=0.2483, pruned_loss=0.04374, over 24563.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2622, pruned_loss=0.05742, over 4422586.39 frames. ], batch size: 60, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:47:09,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-29 21:47:11,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-29 21:47:12,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:13,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-29 21:47:14,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:47:14,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:47:14,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=499466.6666666667, ans=0.125 2023-09-29 21:47:16,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:16,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:47:17,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:47:18,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=499466.6666666667, ans=0.05 2023-09-29 21:47:19,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:47:22,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-29 21:47:22,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:47:24,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=499533.3333333333, ans=0.0 2023-09-29 21:47:28,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:30,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:33,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:47:33,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:37,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-29 21:47:38,846 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.43 vs. limit=10.0 2023-09-29 21:47:39,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-29 21:47:41,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:47:44,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:47:44,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:46,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:47:48,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=499600.0, ans=0.125 2023-09-29 21:47:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:51,021 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-29 21:47:51,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=499600.0, ans=0.1 2023-09-29 21:47:52,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:47:52,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:47:55,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 21:47:57,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 21:47:57,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:47:59,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:47:59,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-29 21:48:02,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-29 21:48:04,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:04,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:48:06,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:06,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:48:08,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:48:09,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-29 21:48:12,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:48:13,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:14,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 21:48:16,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:48:18,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:19,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:48:19,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:21,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-29 21:48:21,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-29 21:48:27,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-29 21:48:28,883 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.896e+02 2.048e+02 2.393e+02 3.212e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-29 21:48:30,433 INFO [train.py:1039] (3/4) Epoch 15, batch 600, loss[loss=0.249, simple_loss=0.2972, pruned_loss=0.1004, over 19813.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2629, pruned_loss=0.05813, over 4486096.38 frames. ], batch size: 388, lr: 6.99e-03, grad_scale: 16.0 2023-09-29 21:48:31,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-29 21:48:33,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:48:33,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:48:33,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:48:34,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=499800.0, ans=0.125 2023-09-29 21:48:41,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:48:41,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 21:48:42,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-29 21:48:45,087 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-09-29 21:48:45,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-29 21:48:47,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:48:49,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:48:52,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-29 21:48:52,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:48:58,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-29 21:49:00,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=499866.6666666667, ans=0.0 2023-09-29 21:49:01,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:49:01,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:03,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:49:09,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:49:09,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:49:09,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:17,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:49:17,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=499933.3333333333, ans=0.0 2023-09-29 21:49:22,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:49:22,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:49:22,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:49:30,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-29 21:49:38,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:49:38,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:49:43,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-29 21:49:45,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:49:45,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=500066.6666666667, ans=0.1 2023-09-29 21:49:47,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=500066.6666666667, ans=0.125 2023-09-29 21:49:47,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.53 vs. limit=22.5 2023-09-29 21:49:49,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-29 21:49:49,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:49:50,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:49:53,640 INFO [train.py:1039] (3/4) Epoch 15, batch 650, loss[loss=0.1815, simple_loss=0.2572, pruned_loss=0.05293, over 23905.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2614, pruned_loss=0.05775, over 4532521.12 frames. ], batch size: 86, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:49:53,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 21:49:55,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-29 21:49:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:49:59,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:50:00,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:04,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-29 21:50:05,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:50:10,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:50:10,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:15,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:19,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-29 21:50:20,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:21,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=500200.0, ans=0.2 2023-09-29 21:50:25,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:50:25,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 21:50:28,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:30,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:32,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:50:32,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:33,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 21:50:35,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 21:50:35,507 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-29 21:50:35,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:50:35,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:37,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=500266.6666666667, ans=0.09899494936611666 2023-09-29 21:50:40,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:40,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:41,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:50:43,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-29 21:50:44,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-29 21:50:44,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:50:44,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-29 21:50:46,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:50:46,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:50:47,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 21:50:50,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-29 21:50:52,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-29 21:50:52,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:50:52,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:50:52,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:50:53,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:50:55,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:50:55,694 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 21:51:00,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:00,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:01,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:51:04,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=500400.0, ans=0.0 2023-09-29 21:51:05,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:05,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 21:51:05,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:51:11,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=500400.0, ans=0.0 2023-09-29 21:51:12,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:51:12,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:14,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:14,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:51:15,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.967e+02 2.230e+02 2.701e+02 4.378e+02, threshold=4.460e+02, percent-clipped=5.0 2023-09-29 21:51:15,882 INFO [train.py:1039] (3/4) Epoch 15, batch 700, loss[loss=0.1404, simple_loss=0.2197, pruned_loss=0.03061, over 24303.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2591, pruned_loss=0.05693, over 4565190.21 frames. ], batch size: 56, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:51:19,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer_na.min_abs, batch_count=500466.6666666667, ans=0.02 2023-09-29 21:51:20,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-29 21:51:20,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-29 21:51:21,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=500466.6666666667, ans=0.0 2023-09-29 21:51:22,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-29 21:51:24,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:28,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:51:29,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-29 21:51:33,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:51:36,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:51:36,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:38,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=500533.3333333333, ans=0.0 2023-09-29 21:51:39,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-29 21:51:39,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:51:42,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:51:45,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 21:51:45,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:51:47,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-29 21:51:49,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-29 21:51:52,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=500600.0, ans=0.1 2023-09-29 21:51:53,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-29 21:51:53,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:51:55,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-29 21:52:03,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:52:03,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-29 21:52:07,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:09,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:52:09,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-29 21:52:11,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=500666.6666666667, ans=0.0 2023-09-29 21:52:14,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:52:15,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:17,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=500666.6666666667, ans=0.125 2023-09-29 21:52:18,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:52:25,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-29 21:52:25,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-29 21:52:25,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=500733.3333333333, ans=0.125 2023-09-29 21:52:27,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-29 21:52:28,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-29 21:52:30,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:32,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:33,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:52:34,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:34,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-29 21:52:37,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=500733.3333333333, ans=10.0 2023-09-29 21:52:39,401 INFO [train.py:1039] (3/4) Epoch 15, batch 750, loss[loss=0.1556, simple_loss=0.231, pruned_loss=0.04006, over 24604.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2588, pruned_loss=0.05654, over 4609467.97 frames. ], batch size: 60, lr: 6.99e-03, grad_scale: 8.0 2023-09-29 21:52:41,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-29 21:52:41,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-29 21:52:41,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-29 21:52:42,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-29 21:52:43,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-29 21:52:43,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=500800.0, ans=0.5 2023-09-29 21:52:44,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:52:45,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=11.24 vs. limit=15.0 2023-09-29 21:52:46,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-29 21:52:46,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:52:46,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=500800.0, ans=0.125 2023-09-29 21:52:47,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:52:48,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:49,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:52:50,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-29 21:52:51,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:52:51,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=500800.0, ans=0.125 2023-09-29 21:52:54,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:52:54,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:52:57,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:52:58,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:52:59,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-29 21:53:01,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-29 21:53:03,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:53:04,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-29 21:53:06,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-29 21:53:07,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:10,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-29 21:53:10,091 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-29 21:53:11,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-29 21:53:11,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:53:11,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 21:53:14,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 21:53:21,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:53:22,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:22,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:53:24,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:53:26,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:53:26,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-29 21:53:26,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=500933.3333333333, ans=0.2 2023-09-29 21:53:27,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:53:28,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.63 vs. limit=15.0 2023-09-29 21:53:29,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-29 21:53:29,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:53:32,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:53:32,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-29 21:53:34,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:39,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=501000.0, ans=0.125 2023-09-29 21:53:41,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:53:41,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:53:42,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:53:44,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:53:49,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-29 21:53:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:53:49,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:49,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=501066.6666666667, ans=0.125 2023-09-29 21:53:51,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=501066.6666666667, ans=0.125 2023-09-29 21:53:54,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:53:54,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:53:56,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=501066.6666666667, ans=0.0 2023-09-29 21:53:57,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:53:57,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-29 21:54:02,041 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.875e+02 2.035e+02 2.283e+02 3.726e+02, threshold=4.071e+02, percent-clipped=0.0 2023-09-29 21:54:02,084 INFO [train.py:1039] (3/4) Epoch 15, batch 800, loss[loss=0.2051, simple_loss=0.2827, pruned_loss=0.06379, over 24479.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.26, pruned_loss=0.05731, over 4631664.35 frames. ], batch size: 66, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:54:03,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:03,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:07,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:54:07,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:09,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:09,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:11,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:14,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=501133.3333333333, ans=0.0 2023-09-29 21:54:15,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:15,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:54:17,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=501200.0, ans=0.125 2023-09-29 21:54:18,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-29 21:54:20,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:21,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:54:21,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:54:21,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:23,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-29 21:54:23,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:25,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-29 21:54:29,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:32,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:54:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:54:35,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:54:36,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=15.0 2023-09-29 21:54:38,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:38,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:54:42,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:54:43,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:54:43,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-29 21:54:47,192 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-29 21:54:47,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-29 21:54:47,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 21:54:47,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:54:48,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:54:48,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:54:55,120 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-29 21:54:55,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-29 21:54:58,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-29 21:54:59,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 21:55:04,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 21:55:09,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:11,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-29 21:55:11,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-29 21:55:14,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-29 21:55:21,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:24,934 INFO [train.py:1039] (3/4) Epoch 15, batch 850, loss[loss=0.1741, simple_loss=0.245, pruned_loss=0.05159, over 21666.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2607, pruned_loss=0.0572, over 4657034.27 frames. ], batch size: 47, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:55:25,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:55:25,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-29 21:55:25,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:55:25,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=501466.6666666667, ans=0.125 2023-09-29 21:55:26,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:28,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-29 21:55:29,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:31,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:55:32,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:34,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:55:35,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:55:37,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-29 21:55:37,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-29 21:55:39,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-29 21:55:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 21:55:41,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:55:41,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.58 vs. limit=10.0 2023-09-29 21:55:43,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:55:43,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:55:43,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=501533.3333333333, ans=0.0 2023-09-29 21:55:44,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 21:55:48,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.11 vs. limit=15.0 2023-09-29 21:55:50,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:55:50,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:55:52,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-29 21:55:54,489 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.82 vs. limit=12.0 2023-09-29 21:55:55,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-29 21:55:58,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:56:01,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-29 21:56:03,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=501600.0, ans=0.125 2023-09-29 21:56:05,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-29 21:56:07,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-29 21:56:08,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-29 21:56:08,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:08,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:56:08,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 21:56:11,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:12,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:13,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-29 21:56:16,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 21:56:17,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:17,829 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.77 vs. limit=15.0 2023-09-29 21:56:18,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:56:18,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-29 21:56:20,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 21:56:22,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-29 21:56:23,091 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.72 vs. limit=15.0 2023-09-29 21:56:24,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-29 21:56:27,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-29 21:56:28,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:28,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 21:56:28,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:28,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=501733.3333333333, ans=0.1 2023-09-29 21:56:30,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:56:34,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:56:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-29 21:56:37,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-29 21:56:39,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:56:39,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-29 21:56:46,687 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.806e+02 2.003e+02 2.266e+02 2.717e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-29 21:56:46,731 INFO [train.py:1039] (3/4) Epoch 15, batch 900, loss[loss=0.168, simple_loss=0.2419, pruned_loss=0.04703, over 17124.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2615, pruned_loss=0.05736, over 4670925.08 frames. ], batch size: 37, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:56:48,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-29 21:56:50,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:56:50,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-29 21:56:51,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:56:51,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:56:53,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-29 21:56:55,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=501800.0, ans=0.125 2023-09-29 21:57:00,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:57:00,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=501800.0, ans=0.0 2023-09-29 21:57:03,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:03,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-29 21:57:05,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=501866.6666666667, ans=0.2 2023-09-29 21:57:07,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:57:07,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-29 21:57:09,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-29 21:57:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-29 21:57:09,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:10,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 21:57:10,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-29 21:57:11,804 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=13.38 vs. limit=15.0 2023-09-29 21:57:22,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:22,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 21:57:22,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 21:57:25,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:28,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-29 21:57:31,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:57:36,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-29 21:57:36,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-29 21:57:38,461 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-29 21:57:38,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-29 21:57:45,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-29 21:57:46,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-29 21:57:46,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 21:57:52,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:57:53,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:57:54,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-29 21:57:54,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 21:57:56,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-29 21:57:58,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-29 21:57:58,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:01,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:01,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:06,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-29 21:58:06,870 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-29 21:58:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-29 21:58:08,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-29 21:58:09,934 INFO [train.py:1039] (3/4) Epoch 15, batch 950, loss[loss=0.1921, simple_loss=0.2562, pruned_loss=0.06401, over 23777.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2613, pruned_loss=0.05744, over 4690849.46 frames. ], batch size: 195, lr: 6.98e-03, grad_scale: 16.0 2023-09-29 21:58:12,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:15,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-29 21:58:21,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:23,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:23,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:25,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 21:58:28,157 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-29 21:58:30,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:58:30,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:30,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:58:30,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 21:58:31,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-29 21:58:33,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-29 21:58:35,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:36,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-29 21:58:37,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:43,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:58:43,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-29 21:58:43,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:58:45,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-29 21:58:49,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 21:58:50,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 21:58:52,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 21:58:56,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.39 vs. limit=22.5 2023-09-29 21:58:58,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:58:58,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 21:59:01,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-29 21:59:02,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 21:59:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 21:59:04,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:04,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:04,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 21:59:05,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.00 vs. limit=15.0 2023-09-29 21:59:07,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=502333.3333333333, ans=0.05 2023-09-29 21:59:09,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-29 21:59:12,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-29 21:59:16,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:16,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:16,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-29 21:59:16,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:16,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 21:59:18,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-29 21:59:23,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 21:59:25,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 21:59:28,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:29,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-29 21:59:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-29 21:59:32,768 INFO [train.py:1039] (3/4) Epoch 15, batch 1000, loss[loss=0.1894, simple_loss=0.2402, pruned_loss=0.06935, over 23402.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2609, pruned_loss=0.05757, over 4699167.82 frames. ], batch size: 285, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 21:59:34,250 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.874e+02 2.213e+02 2.619e+02 3.676e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-29 21:59:34,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-29 21:59:34,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=502466.6666666667, ans=0.0 2023-09-29 21:59:37,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-29 21:59:38,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-29 21:59:41,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=502466.6666666667, ans=0.0 2023-09-29 21:59:45,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 21:59:47,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-29 21:59:47,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-29 21:59:53,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 21:59:53,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 21:59:54,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 21:59:58,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-29 22:00:00,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-29 22:00:01,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-29 22:00:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:04,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-29 22:00:06,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:00:06,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-29 22:00:08,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:09,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:17,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:17,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:00:17,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:20,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:20,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-29 22:00:20,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:20,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:00:21,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:00:21,950 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-29 22:00:25,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-29 22:00:27,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-29 22:00:30,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-29 22:00:32,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:00:37,283 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=502666.6666666667, ans=0.125 2023-09-29 22:00:39,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:39,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:00:39,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:00:42,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:00:42,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-29 22:00:44,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:00:45,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-29 22:00:45,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-29 22:00:47,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:00:47,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:00:51,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:00:51,769 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.09 vs. limit=22.5 2023-09-29 22:00:54,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:00:56,326 INFO [train.py:1039] (3/4) Epoch 15, batch 1050, loss[loss=0.1893, simple_loss=0.2506, pruned_loss=0.06403, over 22744.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2593, pruned_loss=0.05679, over 4703856.02 frames. ], batch size: 322, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:00:56,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:00:59,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-09-29 22:01:00,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:01:01,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:01:03,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:01:04,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:07,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:08,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=502800.0, ans=0.5 2023-09-29 22:01:10,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:01:12,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:01:14,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:01:15,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:01:15,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:01:17,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:01:17,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-29 22:01:18,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:18,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-29 22:01:20,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:01:20,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-29 22:01:20,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:01:26,320 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=502866.6666666667, ans=0.035 2023-09-29 22:01:27,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:01:29,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:01:29,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:01:29,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=502933.3333333333, ans=0.125 2023-09-29 22:01:32,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-29 22:01:34,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-29 22:01:34,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:01:36,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-29 22:01:38,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-29 22:01:39,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:01:44,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:01:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:01:47,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:01:48,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:01:51,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:01:55,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-29 22:01:56,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-29 22:01:56,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-29 22:01:56,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:01:56,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:02:00,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-29 22:02:05,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:02:07,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:02:07,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:08,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:08,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:12,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.20 vs. limit=15.0 2023-09-29 22:02:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:02:13,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-29 22:02:14,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:02:14,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-29 22:02:15,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-29 22:02:16,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:02:17,908 INFO [train.py:1039] (3/4) Epoch 15, batch 1100, loss[loss=0.1738, simple_loss=0.2475, pruned_loss=0.05005, over 23585.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2588, pruned_loss=0.05666, over 4686581.21 frames. ], batch size: 120, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:02:19,327 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.797e+02 2.092e+02 2.502e+02 4.130e+02, threshold=4.184e+02, percent-clipped=0.0 2023-09-29 22:02:19,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:02:24,995 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.02 vs. limit=15.0 2023-09-29 22:02:25,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:02:30,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:02:31,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:02:31,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:33,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-29 22:02:35,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:02:37,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:02:39,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:02:39,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=503200.0, ans=0.0 2023-09-29 22:02:43,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=503200.0, ans=15.0 2023-09-29 22:02:44,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:02:44,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-29 22:02:46,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=503200.0, ans=0.0 2023-09-29 22:02:47,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:02:47,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:02:47,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:02:50,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:02:52,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:02:57,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:00,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-29 22:03:01,762 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-29 22:03:01,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:02,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.54 vs. limit=22.5 2023-09-29 22:03:04,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:04,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:03:05,714 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.15 vs. limit=22.5 2023-09-29 22:03:06,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:03:08,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-29 22:03:08,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:03:10,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:03:10,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:03:10,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:10,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-29 22:03:16,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:03:16,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-29 22:03:20,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:03:23,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=503400.0, ans=0.125 2023-09-29 22:03:24,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:03:26,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-29 22:03:26,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:03:28,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:03:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:32,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:34,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-29 22:03:34,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:03:34,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=503400.0, ans=0.125 2023-09-29 22:03:35,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:03:37,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-29 22:03:37,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:03:37,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-29 22:03:38,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:03:38,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:03:40,068 INFO [train.py:1039] (3/4) Epoch 15, batch 1150, loss[loss=0.1914, simple_loss=0.2627, pruned_loss=0.06011, over 23588.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2597, pruned_loss=0.05729, over 4685927.00 frames. ], batch size: 149, lr: 6.97e-03, grad_scale: 8.0 2023-09-29 22:03:40,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:03:43,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:48,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:03:48,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=503466.6666666667, ans=0.0 2023-09-29 22:03:50,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:03:50,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:03:50,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=503466.6666666667, ans=0.1 2023-09-29 22:03:50,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=22.5 2023-09-29 22:03:51,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-29 22:03:52,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:03:56,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-29 22:03:57,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:03:57,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:03:59,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=503533.3333333333, ans=0.07 2023-09-29 22:04:05,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-29 22:04:06,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:09,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:04:10,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:10,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-29 22:04:10,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:04:10,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:04:12,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=503600.0, ans=0.1 2023-09-29 22:04:13,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-29 22:04:14,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:04:16,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:04:29,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:04:36,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-29 22:04:36,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:36,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:43,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-29 22:04:44,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:04:48,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=503733.3333333333, ans=0.0 2023-09-29 22:04:52,424 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-29 22:04:59,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:04:59,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:04:59,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=503733.3333333333, ans=0.2 2023-09-29 22:05:00,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:05:01,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:05:03,921 INFO [train.py:1039] (3/4) Epoch 15, batch 1200, loss[loss=0.189, simple_loss=0.2787, pruned_loss=0.0497, over 24304.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2606, pruned_loss=0.0572, over 4706629.75 frames. ], batch size: 74, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:05:04,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=503800.0, ans=0.1 2023-09-29 22:05:05,384 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.402e+02 1.796e+02 2.044e+02 2.374e+02 3.909e+02, threshold=4.087e+02, percent-clipped=0.0 2023-09-29 22:05:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:11,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:05:11,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:05:14,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:14,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:14,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:05:16,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:05:18,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:05:19,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:05:21,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:22,765 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-29 22:05:24,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-29 22:05:27,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:05:30,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:05:33,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:05:36,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:05:36,389 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-29 22:05:38,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:44,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:05:44,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:05:46,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-29 22:05:46,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:05:49,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-29 22:05:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-29 22:05:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:05:57,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:05:58,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:00,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:06:01,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:06:01,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:06:03,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:06:03,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-29 22:06:05,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:06:05,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:06:07,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:07,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:06:10,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=504066.6666666667, ans=0.125 2023-09-29 22:06:13,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:06:14,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:06:17,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-29 22:06:22,017 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-29 22:06:23,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:24,984 INFO [train.py:1039] (3/4) Epoch 15, batch 1250, loss[loss=0.1923, simple_loss=0.2744, pruned_loss=0.05507, over 24376.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2623, pruned_loss=0.05813, over 4700257.03 frames. ], batch size: 74, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:06:26,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:06:29,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:06:29,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:06:30,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=504133.3333333333, ans=0.125 2023-09-29 22:06:34,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-29 22:06:35,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=504133.3333333333, ans=0.125 2023-09-29 22:06:38,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:06:39,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:40,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-29 22:06:42,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:06:42,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:06:48,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:06:49,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:06:51,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:06:51,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:52,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:06:57,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:06:57,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:06:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:06:59,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:06:59,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:02,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:03,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:07:08,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-29 22:07:08,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:07:11,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:11,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-29 22:07:13,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:07:13,956 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-29 22:07:13,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:13,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:14,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=504333.3333333333, ans=0.125 2023-09-29 22:07:19,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:19,874 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=504333.3333333333, ans=0.0 2023-09-29 22:07:22,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:07:23,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:07:25,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-29 22:07:25,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-29 22:07:25,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-29 22:07:28,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:30,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-29 22:07:30,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:07:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:07:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:07:34,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-29 22:07:34,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:07:36,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:07:36,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:07:37,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:07:37,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-29 22:07:40,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:42,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:07:42,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:07:46,008 INFO [train.py:1039] (3/4) Epoch 15, batch 1300, loss[loss=0.1954, simple_loss=0.2555, pruned_loss=0.06761, over 23746.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2629, pruned_loss=0.05829, over 4709231.98 frames. ], batch size: 179, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:07:46,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:07:48,103 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.572e+02 1.942e+02 2.297e+02 2.853e+02 4.160e+02, threshold=4.593e+02, percent-clipped=1.0 2023-09-29 22:07:50,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:07:50,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-29 22:07:52,707 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:07:55,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:07:58,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:07:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:00,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:08:00,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:08:01,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-29 22:08:02,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.03 vs. limit=15.0 2023-09-29 22:08:07,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:08:08,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.67 vs. limit=15.0 2023-09-29 22:08:09,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:08:10,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-29 22:08:12,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:08:17,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:19,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:20,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:08:21,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=504600.0, ans=0.125 2023-09-29 22:08:22,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:08:22,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:08:22,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=504600.0, ans=0.0 2023-09-29 22:08:23,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:08:23,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-29 22:08:30,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:08:30,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:08:32,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-29 22:08:32,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:08:32,447 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=504600.0, ans=0.125 2023-09-29 22:08:33,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:08:38,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:08:38,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-29 22:08:38,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:38,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-29 22:08:39,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:08:43,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:08:43,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:08:47,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-29 22:08:49,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-29 22:08:51,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-29 22:08:56,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:08:58,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-29 22:09:01,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:07,795 INFO [train.py:1039] (3/4) Epoch 15, batch 1350, loss[loss=0.203, simple_loss=0.2836, pruned_loss=0.06115, over 24444.00 frames. ], tot_loss[loss=0.1894, simple_loss=0.2624, pruned_loss=0.05824, over 4704927.88 frames. ], batch size: 69, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:09:07,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-29 22:09:08,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=504800.0, ans=0.0 2023-09-29 22:09:09,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:12,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:15,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:09:17,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:09:18,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:09:18,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:09:26,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-29 22:09:26,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_na.min_abs, batch_count=504866.6666666667, ans=0.02 2023-09-29 22:09:27,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:09:29,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:09:33,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-29 22:09:33,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:09:35,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:09:35,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-29 22:09:36,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-29 22:09:37,420 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.71 vs. limit=15.0 2023-09-29 22:09:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-29 22:09:41,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:09:41,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-29 22:09:53,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:10:03,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:03,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-29 22:10:08,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:10,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-29 22:10:10,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:10:11,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:10:14,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:10:16,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-29 22:10:17,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:10:22,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-29 22:10:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-29 22:10:29,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-29 22:10:29,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:10:31,235 INFO [train.py:1039] (3/4) Epoch 15, batch 1400, loss[loss=0.2081, simple_loss=0.2872, pruned_loss=0.06443, over 24384.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2609, pruned_loss=0.0576, over 4685362.10 frames. ], batch size: 77, lr: 6.96e-03, grad_scale: 16.0 2023-09-29 22:10:33,170 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.906e+02 2.114e+02 2.329e+02 4.269e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-29 22:10:34,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:10:34,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:10:43,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-29 22:10:45,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-29 22:10:50,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=505200.0, ans=0.1 2023-09-29 22:10:53,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:10:56,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:10:56,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=505200.0, ans=0.125 2023-09-29 22:10:59,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:10:59,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:11:02,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:11:04,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-29 22:11:14,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:15,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:20,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-29 22:11:22,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:11:23,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:11:25,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:11:25,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:11:27,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:11:27,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:11:28,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:11:28,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-29 22:11:29,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:11:31,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=505333.3333333333, ans=0.0 2023-09-29 22:11:34,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:11:37,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:11:43,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-29 22:11:44,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:11:45,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:11:50,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 22:11:51,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:11:53,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:11:55,003 INFO [train.py:1039] (3/4) Epoch 15, batch 1450, loss[loss=0.1642, simple_loss=0.2395, pruned_loss=0.04447, over 24616.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2602, pruned_loss=0.05679, over 4706485.86 frames. ], batch size: 60, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:11:56,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:11:57,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.63 vs. limit=15.0 2023-09-29 22:12:01,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:12:01,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:01,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-29 22:12:01,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=505466.6666666667, ans=0.1 2023-09-29 22:12:05,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:06,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=505466.6666666667, ans=0.125 2023-09-29 22:12:07,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:12:08,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:12:08,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-29 22:12:10,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:12:11,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-29 22:12:12,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:13,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:13,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-29 22:12:13,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:15,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:12:16,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 22:12:16,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:18,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:12:20,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:24,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:26,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:12:27,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:12:29,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:12:29,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:29,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=505600.0, ans=0.125 2023-09-29 22:12:30,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:12:31,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:12:31,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:12:32,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:36,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-29 22:12:39,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:12:44,371 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-29 22:12:45,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:12:46,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:12:47,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:12:49,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-29 22:12:52,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:12:53,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=505666.6666666667, ans=0.0 2023-09-29 22:12:55,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=505666.6666666667, ans=0.125 2023-09-29 22:12:56,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-29 22:12:59,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-29 22:13:00,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:02,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.79 vs. limit=15.0 2023-09-29 22:13:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:05,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:06,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-29 22:13:07,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=505733.3333333333, ans=0.0 2023-09-29 22:13:08,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-29 22:13:08,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-29 22:13:09,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:11,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:13:13,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=505733.3333333333, ans=0.2 2023-09-29 22:13:16,233 INFO [train.py:1039] (3/4) Epoch 15, batch 1500, loss[loss=0.2128, simple_loss=0.2779, pruned_loss=0.07384, over 23743.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2604, pruned_loss=0.05685, over 4715531.54 frames. ], batch size: 256, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:13:17,607 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.853e+02 2.114e+02 2.421e+02 4.526e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-29 22:13:21,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-29 22:13:22,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:13:22,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:13:22,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:24,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:25,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:13:27,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-29 22:13:29,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:13:29,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:13:29,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:13:31,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:13:32,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:13:34,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:13:38,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-29 22:13:39,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:13:40,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:13:40,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:43,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-29 22:13:48,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-29 22:13:49,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:13:51,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-29 22:13:52,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=505933.3333333333, ans=0.2 2023-09-29 22:13:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:13:54,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=10.09 vs. limit=15.0 2023-09-29 22:13:57,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:13:57,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:13:57,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:13:58,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-29 22:13:58,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:13:58,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:00,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-29 22:14:02,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:14:06,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:14:06,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-29 22:14:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:14:14,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:14:20,422 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-29 22:14:20,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:20,497 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-29 22:14:20,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:22,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:14:23,613 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-29 22:14:25,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:14:29,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-29 22:14:31,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:14:34,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:14:34,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:14:36,469 INFO [train.py:1039] (3/4) Epoch 15, batch 1550, loss[loss=0.1834, simple_loss=0.2486, pruned_loss=0.05907, over 23793.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2611, pruned_loss=0.05736, over 4718098.56 frames. ], batch size: 164, lr: 6.95e-03, grad_scale: 16.0 2023-09-29 22:14:36,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-29 22:14:38,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-29 22:14:38,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:14:39,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-29 22:14:39,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-29 22:14:43,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:43,843 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=506133.3333333333, ans=0.125 2023-09-29 22:14:45,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:46,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:14:46,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:14:48,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:48,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:14:52,873 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-29 22:14:54,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:14:54,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:14:54,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:14:57,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:14:57,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-29 22:14:57,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=506200.0, ans=0.1 2023-09-29 22:14:58,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:14:59,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-29 22:15:01,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-29 22:15:01,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-29 22:15:02,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:03,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:08,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:15:11,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-29 22:15:11,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-29 22:15:16,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.94 vs. limit=15.0 2023-09-29 22:15:19,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:22,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:15:24,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:15:24,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:15:25,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-29 22:15:29,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=506333.3333333333, ans=0.125 2023-09-29 22:15:30,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:15:30,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=506333.3333333333, ans=0.95 2023-09-29 22:15:31,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:34,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:15:37,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:15:39,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:15:39,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-29 22:15:39,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:15:40,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:15:40,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:15:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:15:42,446 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-29 22:15:47,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:15:52,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-29 22:15:57,916 INFO [train.py:1039] (3/4) Epoch 15, batch 1600, loss[loss=0.2156, simple_loss=0.2802, pruned_loss=0.07555, over 23388.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2615, pruned_loss=0.05768, over 4713186.71 frames. ], batch size: 285, lr: 6.95e-03, grad_scale: 32.0 2023-09-29 22:15:59,412 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.908e+02 2.211e+02 2.589e+02 3.896e+02, threshold=4.422e+02, percent-clipped=0.0 2023-09-29 22:15:59,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:01,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:01,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-29 22:16:02,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:16:02,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:16:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:16:02,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:16:04,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:16:07,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:09,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-29 22:16:10,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-29 22:16:12,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-29 22:16:15,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:15,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-29 22:16:16,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:16:19,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:16:23,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:16:23,468 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.017e-02 2023-09-29 22:16:28,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-29 22:16:31,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:16:33,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-29 22:16:34,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:16:34,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-29 22:16:39,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-29 22:16:45,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:45,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-29 22:16:50,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:16:50,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:16:50,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:16:52,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-29 22:16:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:17:00,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:17:00,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:00,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:17:05,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:17:06,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:17:07,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:17:14,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:14,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:17:17,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-29 22:17:17,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:17:17,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-29 22:17:19,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=506733.3333333333, ans=0.0 2023-09-29 22:17:19,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten.whitening_limit, batch_count=506733.3333333333, ans=15.0 2023-09-29 22:17:20,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=506733.3333333333, ans=0.0 2023-09-29 22:17:22,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=506800.0, ans=0.0 2023-09-29 22:17:23,492 INFO [train.py:1039] (3/4) Epoch 15, batch 1650, loss[loss=0.1914, simple_loss=0.2761, pruned_loss=0.05336, over 24373.00 frames. ], tot_loss[loss=0.1897, simple_loss=0.2625, pruned_loss=0.05842, over 4703309.05 frames. ], batch size: 77, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:17:25,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:25,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:17:27,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:17:27,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-29 22:17:27,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-29 22:17:27,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-29 22:17:28,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-29 22:17:32,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:17:32,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:34,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:17:34,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:17:37,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:17:39,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-29 22:17:44,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:17:44,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:17:44,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:17:44,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:17:45,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-29 22:17:45,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-29 22:17:49,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=506866.6666666667, ans=0.125 2023-09-29 22:17:53,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:17:54,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:18:03,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-29 22:18:03,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:07,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-29 22:18:10,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:13,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:18:13,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:18:14,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:14,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:18:14,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:17,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:19,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:19,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:19,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:20,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:21,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:18:24,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:18:24,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=507000.0, ans=0.125 2023-09-29 22:18:26,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-29 22:18:27,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:18:27,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-29 22:18:28,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-29 22:18:28,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-29 22:18:28,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:18:30,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:18:30,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:31,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:18:31,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-29 22:18:32,730 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=5.30 vs. limit=15.0 2023-09-29 22:18:35,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:18:38,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:18:38,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:41,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-29 22:18:47,120 INFO [train.py:1039] (3/4) Epoch 15, batch 1700, loss[loss=0.1696, simple_loss=0.2509, pruned_loss=0.04416, over 24305.00 frames. ], tot_loss[loss=0.189, simple_loss=0.2619, pruned_loss=0.05804, over 4705500.81 frames. ], batch size: 61, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:18:48,615 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.631e+02 1.972e+02 2.169e+02 2.497e+02 4.927e+02, threshold=4.339e+02, percent-clipped=2.0 2023-09-29 22:18:48,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:18:48,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:18:48,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-29 22:18:50,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:18:50,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:18:51,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:18:54,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:18:54,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:18:54,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-29 22:18:57,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:18:58,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=507133.3333333333, ans=0.0 2023-09-29 22:19:05,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:07,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:19:14,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:19:14,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:14,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:19:14,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:19,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-29 22:19:21,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:19:21,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:22,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:19:24,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:19:26,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-29 22:19:26,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-29 22:19:27,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:29,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-29 22:19:30,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:19:38,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:38,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:40,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:19:41,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:19:41,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-29 22:19:41,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:19:45,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:45,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-29 22:19:45,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:19:45,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:46,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.98 vs. limit=15.0 2023-09-29 22:19:47,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:19:47,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:19:50,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:19:50,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:19:51,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:19:51,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:19:51,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:19:56,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:19:57,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-29 22:20:00,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:00,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:04,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-29 22:20:06,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=507400.0, ans=0.0 2023-09-29 22:20:08,729 INFO [train.py:1039] (3/4) Epoch 15, batch 1750, loss[loss=0.1775, simple_loss=0.2365, pruned_loss=0.0593, over 23335.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.2614, pruned_loss=0.05761, over 4714246.42 frames. ], batch size: 285, lr: 6.94e-03, grad_scale: 32.0 2023-09-29 22:20:08,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:11,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:11,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:20:14,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-29 22:20:14,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:20:17,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:20:19,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:21,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=507466.6666666667, ans=0.125 2023-09-29 22:20:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-29 22:20:26,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:20:28,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-29 22:20:28,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:20:28,866 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.55 vs. limit=15.0 2023-09-29 22:20:29,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:20:34,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:20:35,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-29 22:20:37,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:20:37,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-29 22:20:46,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:20:50,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:20:50,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:53,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:20:53,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:20:56,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:20:57,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:00,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:02,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:02,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-29 22:21:03,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:21:06,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-29 22:21:07,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:10,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:11,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:21:11,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=507666.6666666667, ans=0.2 2023-09-29 22:21:13,774 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.08 vs. limit=15.0 2023-09-29 22:21:16,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:21:16,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-29 22:21:16,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:19,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:21:22,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:21:25,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:27,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:21:29,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-29 22:21:29,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:30,898 INFO [train.py:1039] (3/4) Epoch 15, batch 1800, loss[loss=0.1927, simple_loss=0.278, pruned_loss=0.05375, over 24543.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2607, pruned_loss=0.05709, over 4712092.99 frames. ], batch size: 71, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:21:30,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:21:30,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:30,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:21:31,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:21:31,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:21:34,574 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.864e+02 2.038e+02 2.350e+02 3.855e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-29 22:21:34,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:21:36,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:21:37,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:21:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:21:42,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:21:43,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=507800.0, ans=0.0 2023-09-29 22:21:45,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:21:48,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:21:51,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:51,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:21:53,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:21:54,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:21:54,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-29 22:21:56,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:21:59,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:01,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=507933.3333333333, ans=0.2 2023-09-29 22:22:03,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-29 22:22:07,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-29 22:22:07,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-29 22:22:07,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:09,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:22:09,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:11,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:22:17,338 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-29 22:22:17,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:22:20,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:22,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-29 22:22:22,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=508000.0, ans=0.2 2023-09-29 22:22:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-29 22:22:23,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:22:25,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:22:27,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:22:32,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=12.0 2023-09-29 22:22:33,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-29 22:22:38,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:22:40,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-29 22:22:42,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:22:42,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:42,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:22:43,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-29 22:22:45,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:22:45,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:22:49,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-29 22:22:49,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:22:51,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:51,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:22:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:51,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:22:52,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:22:54,107 INFO [train.py:1039] (3/4) Epoch 15, batch 1850, loss[loss=0.2153, simple_loss=0.2744, pruned_loss=0.07812, over 19187.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.261, pruned_loss=0.05735, over 4697013.02 frames. ], batch size: 389, lr: 6.94e-03, grad_scale: 16.0 2023-09-29 22:22:55,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:22:55,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:22:56,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=508133.3333333333, ans=0.125 2023-09-29 22:22:58,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:23:00,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:08,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:23:08,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-29 22:23:11,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=508200.0, ans=0.07 2023-09-29 22:23:11,566 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.61 vs. limit=12.0 2023-09-29 22:23:15,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-29 22:23:18,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-29 22:23:23,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:23,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-29 22:23:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 22:23:33,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:23:33,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-29 22:23:36,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:23:36,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=508266.6666666667, ans=0.125 2023-09-29 22:23:37,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:23:41,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-29 22:23:41,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:41,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:23:43,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:23:45,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:23:47,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:23:51,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:23:52,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:23:52,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:23:52,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:23:53,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:23:55,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:23:55,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=508333.3333333333, ans=0.125 2023-09-29 22:23:58,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-29 22:24:00,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:24:03,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:24:05,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:24:05,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-29 22:24:05,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-29 22:24:06,730 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-29 22:24:08,179 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-29 22:24:09,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:24:09,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:24:09,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:09,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:09,970 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-29 22:24:09,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:24:11,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:11,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:24:15,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:24:17,058 INFO [train.py:1039] (3/4) Epoch 15, batch 1900, loss[loss=0.2558, simple_loss=0.3059, pruned_loss=0.1029, over 19496.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2625, pruned_loss=0.05805, over 4698077.62 frames. ], batch size: 389, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:24:17,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:24:17,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-29 22:24:18,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:24:18,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-29 22:24:18,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:24:20,831 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.037e+02 2.291e+02 2.918e+02 4.608e+02, threshold=4.583e+02, percent-clipped=3.0 2023-09-29 22:24:20,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:25,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:24:26,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=508466.6666666667, ans=0.125 2023-09-29 22:24:28,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:24:29,938 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-29 22:24:30,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-29 22:24:30,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=508466.6666666667, ans=0.125 2023-09-29 22:24:33,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:24:33,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:24:34,993 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-29 22:24:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-29 22:24:39,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-29 22:24:41,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:24:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-29 22:24:45,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-29 22:24:47,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=508600.0, ans=0.07 2023-09-29 22:24:49,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=508600.0, ans=0.2 2023-09-29 22:24:59,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-29 22:25:02,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-29 22:25:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:02,823 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-29 22:25:02,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-29 22:25:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-29 22:25:04,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-29 22:25:04,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:09,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-29 22:25:11,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=508666.6666666667, ans=0.125 2023-09-29 22:25:12,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:25:16,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:16,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-29 22:25:18,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:25:18,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=508666.6666666667, ans=0.0 2023-09-29 22:25:21,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-29 22:25:21,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:28,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:25:28,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:25:28,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:25:30,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:25:32,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:25:32,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:25:33,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:25:36,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:36,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:25:38,240 INFO [train.py:1039] (3/4) Epoch 15, batch 1950, loss[loss=0.1953, simple_loss=0.2714, pruned_loss=0.05964, over 23210.00 frames. ], tot_loss[loss=0.1908, simple_loss=0.2637, pruned_loss=0.05893, over 4693535.78 frames. ], batch size: 105, lr: 6.93e-03, grad_scale: 16.0 2023-09-29 22:25:38,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=508800.0, ans=0.0 2023-09-29 22:25:39,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:25:39,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:25:41,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:25:41,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:25:43,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=508800.0, ans=0.0 2023-09-29 22:25:44,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:25:46,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=508800.0, ans=0.2 2023-09-29 22:25:47,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:25:47,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:47,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:25:49,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-29 22:25:51,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:25:51,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:53,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:25:54,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:25:56,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:25:56,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:25:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:01,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:26:01,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:26:01,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:26:01,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:06,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:08,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:26:08,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:08,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:26:08,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-29 22:26:08,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=508866.6666666667, ans=0.2 2023-09-29 22:26:09,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:26:09,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:26:10,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:13,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:16,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:26:22,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:26:25,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:26:25,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:26:25,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-29 22:26:25,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:26:32,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:26:33,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:26:35,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:26:42,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:44,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:46,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:26:49,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:51,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:26:52,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:26:53,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-29 22:26:53,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:26:53,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:26:56,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-29 22:26:58,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:26:59,705 INFO [train.py:1039] (3/4) Epoch 15, batch 2000, loss[loss=0.2777, simple_loss=0.3256, pruned_loss=0.1149, over 19623.00 frames. ], tot_loss[loss=0.1911, simple_loss=0.2638, pruned_loss=0.05913, over 4689764.02 frames. ], batch size: 389, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:27:02,741 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.867e+02 2.115e+02 2.554e+02 3.825e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-29 22:27:02,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:27:03,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_positive, batch_count=509133.3333333333, ans=0.05 2023-09-29 22:27:05,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:27:05,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:06,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:27:09,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:13,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-29 22:27:13,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:27:18,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:27:21,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-29 22:27:21,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:27:21,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:27:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:27:25,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-29 22:27:27,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:27,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:28,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:29,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-29 22:27:29,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:27:31,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-29 22:27:31,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:31,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=509266.6666666667, ans=0.0 2023-09-29 22:27:32,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.18 vs. limit=22.5 2023-09-29 22:27:35,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:27:37,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:27:37,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:27:39,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:39,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:27:40,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-29 22:27:44,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-29 22:27:44,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:27:44,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:27:50,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:51,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:27:51,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:53,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:27:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:27:54,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:56,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:27:56,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:27:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:00,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:28:02,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-29 22:28:07,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:28:09,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:12,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:28:16,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:17,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:17,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:19,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:28:19,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:28:23,137 INFO [train.py:1039] (3/4) Epoch 15, batch 2050, loss[loss=0.1906, simple_loss=0.2564, pruned_loss=0.06246, over 23783.00 frames. ], tot_loss[loss=0.1901, simple_loss=0.2628, pruned_loss=0.05869, over 4694245.24 frames. ], batch size: 179, lr: 6.93e-03, grad_scale: 32.0 2023-09-29 22:28:23,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:24,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:27,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:28:28,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:31,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:28:31,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=509466.6666666667, ans=0.0 2023-09-29 22:28:34,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:28:34,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:28:34,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:28:35,206 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.97 vs. limit=15.0 2023-09-29 22:28:37,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-29 22:28:37,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:28:38,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:28:38,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:28:39,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=509533.3333333333, ans=0.1 2023-09-29 22:28:50,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:28:50,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:54,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-29 22:28:55,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:28:57,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-29 22:28:57,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:29:02,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:04,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:06,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:29:06,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:29:08,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:29:09,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:29:11,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:29:14,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:16,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:29:18,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:29:21,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:29:23,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:30,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=509733.3333333333, ans=0.0 2023-09-29 22:29:31,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:29:32,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-29 22:29:36,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:36,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=509733.3333333333, ans=0.125 2023-09-29 22:29:37,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:29:40,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:29:42,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-29 22:29:43,638 INFO [train.py:1039] (3/4) Epoch 15, batch 2100, loss[loss=0.1803, simple_loss=0.2643, pruned_loss=0.04814, over 24455.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2612, pruned_loss=0.05805, over 4707364.97 frames. ], batch size: 69, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:29:45,595 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-29 22:29:45,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:45,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:29:46,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.29 vs. limit=12.0 2023-09-29 22:29:46,884 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.817e+02 2.090e+02 2.571e+02 3.864e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-29 22:29:47,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:29:48,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:29:48,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-29 22:29:48,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-29 22:29:50,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:29:56,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:29:56,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:29:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:29:59,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:29:59,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-29 22:30:01,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:30:01,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-29 22:30:01,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-29 22:30:04,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:04,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:04,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-29 22:30:04,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 22:30:09,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-29 22:30:09,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:30:14,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:30:14,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:30:18,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:30:18,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-29 22:30:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:19,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-29 22:30:21,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-29 22:30:21,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:21,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-29 22:30:21,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-29 22:30:23,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-29 22:30:27,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:30:30,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:30:32,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:33,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:30:35,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-29 22:30:37,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:37,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:30:37,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:30:37,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-29 22:30:40,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-29 22:30:40,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-29 22:30:43,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:30:44,563 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.98 vs. limit=15.0 2023-09-29 22:30:46,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:30:46,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-29 22:30:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:30:56,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:30:56,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:30:56,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-29 22:30:56,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:30:57,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:30:57,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:30:59,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:30:59,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:03,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-29 22:31:03,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=510066.6666666667, ans=0.2 2023-09-29 22:31:06,100 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.51 vs. limit=15.0 2023-09-29 22:31:07,264 INFO [train.py:1039] (3/4) Epoch 15, batch 2150, loss[loss=0.1728, simple_loss=0.2588, pruned_loss=0.04339, over 24703.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.26, pruned_loss=0.05744, over 4708985.13 frames. ], batch size: 65, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:31:07,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-29 22:31:07,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:09,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:31:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:31:09,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:31:09,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:31:15,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-29 22:31:17,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=510133.3333333333, ans=0.0 2023-09-29 22:31:18,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:18,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:20,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:31:20,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:20,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:31:23,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:24,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:31:24,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:31:27,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:27,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-29 22:31:32,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:34,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:31:36,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:36,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:36,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:38,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:31:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:31:38,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:31:40,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:31:41,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-29 22:31:43,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:31:43,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:45,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:45,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:31:46,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:31:48,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:31:50,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:31:51,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:31:51,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-29 22:31:51,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-29 22:31:54,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:56,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:31:57,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:31:59,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:31:59,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:00,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:00,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-29 22:32:02,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-29 22:32:02,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:32:03,796 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-29 22:32:03,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:03,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:32:07,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-29 22:32:07,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:32:07,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-29 22:32:07,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-29 22:32:07,345 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-29 22:32:07,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-29 22:32:11,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:11,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:32:12,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:32:12,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:14,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:32:14,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:14,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:22,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:32:24,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-29 22:32:25,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:32:28,674 INFO [train.py:1039] (3/4) Epoch 15, batch 2200, loss[loss=0.1768, simple_loss=0.2467, pruned_loss=0.05343, over 23160.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2603, pruned_loss=0.05697, over 4721353.63 frames. ], batch size: 105, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:32:31,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.917e+02 2.133e+02 2.422e+02 4.121e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-29 22:32:31,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:31,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:32:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:32:33,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:32:35,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:32:35,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:32:35,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-29 22:32:42,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-29 22:32:44,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:32:44,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=510533.3333333333, ans=0.125 2023-09-29 22:32:49,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-29 22:32:50,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=510533.3333333333, ans=0.0 2023-09-29 22:32:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:32:53,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:32:54,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:33:00,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:33:00,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-29 22:33:04,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:33:06,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:06,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-29 22:33:10,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:33:11,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:14,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:33:14,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:16,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-29 22:33:18,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:20,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-29 22:33:22,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:23,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:33:23,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:33:26,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:33:28,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:33:28,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:28,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:33:29,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:33:29,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:33:32,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:33:36,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 22:33:37,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:33:37,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=510733.3333333333, ans=0.0 2023-09-29 22:33:39,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:33:40,698 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-29 22:33:42,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:33:42,421 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-29 22:33:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:33:45,332 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-29 22:33:47,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:47,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:33:50,560 INFO [train.py:1039] (3/4) Epoch 15, batch 2250, loss[loss=0.1581, simple_loss=0.2365, pruned_loss=0.03989, over 24422.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2597, pruned_loss=0.05698, over 4714239.53 frames. ], batch size: 58, lr: 6.92e-03, grad_scale: 32.0 2023-09-29 22:33:50,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:33:52,584 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-29 22:33:54,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:33:56,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:01,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:34:03,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:34:07,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:07,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:08,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=510866.6666666667, ans=0.0 2023-09-29 22:34:09,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:34:10,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-29 22:34:12,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:12,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:34:12,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:13,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-29 22:34:15,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:34:15,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:15,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=510866.6666666667, ans=0.125 2023-09-29 22:34:16,048 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=18.16 vs. limit=15.0 2023-09-29 22:34:17,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:34:18,867 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=510866.6666666667, ans=0.0 2023-09-29 22:34:23,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:25,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:34:25,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:34:26,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-29 22:34:26,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:34:31,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:34:34,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:35,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:34:38,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:34:38,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:34:41,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:34:43,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:34:46,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=511000.0, ans=0.0 2023-09-29 22:34:47,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:34:50,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-29 22:34:51,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=511000.0, ans=0.0 2023-09-29 22:34:56,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:34:56,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:34:58,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:35:04,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:35:08,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:35:08,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-29 22:35:08,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:08,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:35:11,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-29 22:35:12,831 INFO [train.py:1039] (3/4) Epoch 15, batch 2300, loss[loss=0.1959, simple_loss=0.2603, pruned_loss=0.06574, over 23250.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2607, pruned_loss=0.05721, over 4727890.80 frames. ], batch size: 119, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:35:14,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:35:16,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:19,136 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.881e+02 2.170e+02 2.531e+02 3.802e+02, threshold=4.341e+02, percent-clipped=0.0 2023-09-29 22:35:20,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:35:22,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:35:23,863 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-29 22:35:25,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:25,993 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.78 vs. limit=15.0 2023-09-29 22:35:32,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:35:32,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:35:32,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:35:33,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:33,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-29 22:35:35,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:35:40,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:35:40,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:35:45,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:35:48,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:35:51,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:35:53,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511266.6666666667, ans=0.1 2023-09-29 22:35:57,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:35:58,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:35:59,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.71 vs. limit=15.0 2023-09-29 22:36:00,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:36:00,664 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:36:03,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:08,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:36:08,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:36:08,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:36:08,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-29 22:36:11,398 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=13.01 vs. limit=15.0 2023-09-29 22:36:12,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:36:12,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:13,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:13,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:36:13,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:15,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:36:15,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:36:15,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-29 22:36:15,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:36:15,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:15,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-29 22:36:21,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:36:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:36:30,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:36:30,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:36:30,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:36:32,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:36:32,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:36:32,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:36:33,776 INFO [train.py:1039] (3/4) Epoch 15, batch 2350, loss[loss=0.1902, simple_loss=0.2749, pruned_loss=0.05274, over 24306.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2619, pruned_loss=0.05717, over 4729502.47 frames. ], batch size: 74, lr: 6.91e-03, grad_scale: 8.0 2023-09-29 22:36:33,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-29 22:36:40,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=511466.6666666667, ans=0.0 2023-09-29 22:36:42,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:36:42,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-29 22:36:48,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-29 22:36:50,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:36:52,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=511533.3333333333, ans=0.125 2023-09-29 22:36:55,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:36:55,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:36:55,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:36:56,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-29 22:37:01,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:37:07,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-29 22:37:09,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:37:12,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:37:12,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:37:15,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:37:17,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-29 22:37:17,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:37:19,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:37:19,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:19,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:37:21,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=511666.6666666667, ans=0.2 2023-09-29 22:37:24,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:37:28,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-29 22:37:28,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:37:30,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:37:31,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:37:33,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-29 22:37:33,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:37:36,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-29 22:37:36,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:37:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-29 22:37:44,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-29 22:37:44,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:37:44,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-29 22:37:44,366 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-29 22:37:45,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-29 22:37:47,716 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=511733.3333333333, ans=0.1 2023-09-29 22:37:48,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-29 22:37:52,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:37:54,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=511800.0, ans=0.125 2023-09-29 22:37:56,454 INFO [train.py:1039] (3/4) Epoch 15, batch 2400, loss[loss=0.1895, simple_loss=0.276, pruned_loss=0.05153, over 24299.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2607, pruned_loss=0.05681, over 4730029.53 frames. ], batch size: 74, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:37:56,911 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=511800.0, ans=0.2 2023-09-29 22:37:58,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:38:01,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:38:03,208 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.881e+02 2.096e+02 2.397e+02 4.111e+02, threshold=4.192e+02, percent-clipped=0.0 2023-09-29 22:38:03,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:38:03,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-29 22:38:03,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-29 22:38:12,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:38:12,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:15,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-29 22:38:15,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:38:16,329 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.07 vs. limit=15.0 2023-09-29 22:38:17,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:17,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-29 22:38:23,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:25,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-29 22:38:30,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:38:37,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-29 22:38:38,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:38:40,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:38:43,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:38:43,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-29 22:38:43,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:38:51,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:53,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:38:56,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:38:57,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=512000.0, ans=0.125 2023-09-29 22:38:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:38:58,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-29 22:38:58,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:38:58,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:38:58,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:00,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 22:39:05,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:07,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:39:07,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-29 22:39:08,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-29 22:39:10,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:39:10,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:39:10,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-29 22:39:11,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-29 22:39:11,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-29 22:39:11,956 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-29 22:39:13,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-29 22:39:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:39:16,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:16,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:17,953 INFO [train.py:1039] (3/4) Epoch 15, batch 2450, loss[loss=0.1755, simple_loss=0.2467, pruned_loss=0.05218, over 17195.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2592, pruned_loss=0.05666, over 4708096.39 frames. ], batch size: 37, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:39:18,120 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-29 22:39:18,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:19,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:39:22,129 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-29 22:39:22,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:39:22,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:39:26,703 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.47 vs. limit=15.0 2023-09-29 22:39:27,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:27,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:39:29,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-29 22:39:34,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:39:34,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:37,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:39:37,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:39:37,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:39:39,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-29 22:39:43,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:39:43,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=512200.0, ans=0.125 2023-09-29 22:39:46,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:39:47,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:39:51,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=512266.6666666667, ans=0.0 2023-09-29 22:39:51,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.66 vs. limit=15.0 2023-09-29 22:39:52,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:39:52,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:54,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:39:55,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:39:57,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-29 22:39:57,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:40:07,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:09,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:40:09,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:09,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:40:11,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:12,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:40:12,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-29 22:40:14,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:40:16,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:40:19,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:40:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:24,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:40:24,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-29 22:40:24,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=512400.0, ans=0.125 2023-09-29 22:40:26,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:40:26,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:40:26,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=512400.0, ans=0.1 2023-09-29 22:40:27,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-29 22:40:29,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:40:29,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:40:32,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:40:34,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:40:35,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:40:39,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-29 22:40:41,385 INFO [train.py:1039] (3/4) Epoch 15, batch 2500, loss[loss=0.1871, simple_loss=0.2492, pruned_loss=0.06253, over 22710.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2583, pruned_loss=0.05638, over 4709157.17 frames. ], batch size: 322, lr: 6.91e-03, grad_scale: 16.0 2023-09-29 22:40:41,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-29 22:40:48,499 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.863e+02 2.026e+02 2.249e+02 3.310e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-29 22:40:48,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:40:58,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:40:58,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:40:59,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=512533.3333333333, ans=0.125 2023-09-29 22:41:00,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:41:00,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-29 22:41:07,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:41:08,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:08,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:41:08,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 22:41:10,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-29 22:41:11,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:11,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:11,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-29 22:41:13,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:13,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-29 22:41:13,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=512600.0, ans=0.1 2023-09-29 22:41:14,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:15,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=512600.0, ans=0.125 2023-09-29 22:41:20,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:41:20,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:41:22,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:41:24,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-29 22:41:25,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:41:27,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:29,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=512600.0, ans=0.0 2023-09-29 22:41:32,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:32,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=512666.6666666667, ans=0.125 2023-09-29 22:41:36,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:41:38,359 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=512666.6666666667, ans=0.125 2023-09-29 22:41:39,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:44,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:41:46,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-29 22:41:48,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:41:48,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:41:50,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:41:50,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:41:50,316 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-29 22:41:50,317 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-29 22:41:50,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-29 22:41:54,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:41:56,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-29 22:41:56,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-29 22:41:57,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:41:59,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-29 22:42:03,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-29 22:42:03,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=512800.0, ans=0.125 2023-09-29 22:42:04,937 INFO [train.py:1039] (3/4) Epoch 15, batch 2550, loss[loss=0.1718, simple_loss=0.254, pruned_loss=0.0448, over 24528.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2596, pruned_loss=0.05717, over 4695544.73 frames. ], batch size: 63, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:42:07,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:10,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:42:10,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:42:13,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:42:15,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-29 22:42:15,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:42:19,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-29 22:42:21,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:42:24,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:26,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:42:26,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 22:42:28,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:42:28,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:29,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:42:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:42:32,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-29 22:42:32,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-29 22:42:32,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:32,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-29 22:42:38,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.52 vs. limit=15.0 2023-09-29 22:42:45,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:42:51,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:42:51,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:42:51,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:42:52,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 22:42:58,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:43:01,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 22:43:01,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:43:03,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:43:03,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-29 22:43:04,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:43:04,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=513000.0, ans=0.1 2023-09-29 22:43:07,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:07,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:11,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:43:11,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-29 22:43:11,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:43:11,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:43:13,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-29 22:43:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:43:18,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:20,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=513066.6666666667, ans=0.0 2023-09-29 22:43:23,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:43:23,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=513066.6666666667, ans=0.125 2023-09-29 22:43:24,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:43:27,862 INFO [train.py:1039] (3/4) Epoch 15, batch 2600, loss[loss=0.185, simple_loss=0.27, pruned_loss=0.05004, over 24637.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2608, pruned_loss=0.05743, over 4703847.22 frames. ], batch size: 68, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:43:28,066 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-29 22:43:28,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=513133.3333333333, ans=0.125 2023-09-29 22:43:32,911 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-29 22:43:32,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:43:34,981 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.927e+02 2.129e+02 2.377e+02 3.619e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-29 22:43:35,098 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-29 22:43:35,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-29 22:43:35,277 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-29 22:43:36,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=513133.3333333333, ans=0.0 2023-09-29 22:43:40,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:43:40,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-29 22:43:40,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-29 22:43:42,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-29 22:43:45,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:43:45,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-29 22:43:48,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-29 22:43:49,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:43:49,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-29 22:43:50,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=513200.0, ans=0.125 2023-09-29 22:43:51,358 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-29 22:43:51,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-29 22:43:55,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=513200.0, ans=0.125 2023-09-29 22:44:01,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:01,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:01,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-29 22:44:04,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:44:10,617 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-29 22:44:18,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:18,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:19,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-29 22:44:19,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:19,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:44:21,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-29 22:44:24,102 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.29 vs. limit=5.0 2023-09-29 22:44:24,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:44:24,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:44:26,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-29 22:44:29,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:44:29,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:44:31,740 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 22:44:33,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=513400.0, ans=0.07 2023-09-29 22:44:36,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:44:36,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:44:38,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-29 22:44:38,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:44:39,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:44:41,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:44:41,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=513400.0, ans=0.125 2023-09-29 22:44:45,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=513400.0, ans=0.2 2023-09-29 22:44:46,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-29 22:44:48,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:50,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:44:51,962 INFO [train.py:1039] (3/4) Epoch 15, batch 2650, loss[loss=0.1781, simple_loss=0.2517, pruned_loss=0.05222, over 19257.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2612, pruned_loss=0.05701, over 4706126.26 frames. ], batch size: 42, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:44:54,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=513466.6666666667, ans=0.125 2023-09-29 22:44:55,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-29 22:44:55,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:44:56,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:44:58,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-29 22:44:58,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:01,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:03,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:45:03,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=513466.6666666667, ans=0.1 2023-09-29 22:45:06,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:45:06,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:45:08,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-29 22:45:08,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:45:08,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:45:11,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-29 22:45:11,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.11 vs. limit=22.5 2023-09-29 22:45:13,364 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-29 22:45:16,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:16,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=513533.3333333333, ans=0.2 2023-09-29 22:45:17,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-29 22:45:18,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=513533.3333333333, ans=0.0 2023-09-29 22:45:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:20,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-29 22:45:23,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:23,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:45:25,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:25,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:30,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-29 22:45:31,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-29 22:45:31,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=513600.0, ans=0.125 2023-09-29 22:45:33,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:45:38,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-29 22:45:38,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:45:39,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:39,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:45:40,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:41,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:45:43,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:45:44,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:44,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:45:46,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:45:48,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:45:49,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:49,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:45:51,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:52,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:45:53,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:45:57,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:45:58,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:45:58,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:45:58,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-29 22:46:03,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:06,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:09,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-29 22:46:11,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:13,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:14,399 INFO [train.py:1039] (3/4) Epoch 15, batch 2700, loss[loss=0.2, simple_loss=0.2581, pruned_loss=0.07091, over 22678.00 frames. ], tot_loss[loss=0.1892, simple_loss=0.2627, pruned_loss=0.05791, over 4705367.84 frames. ], batch size: 322, lr: 6.90e-03, grad_scale: 16.0 2023-09-29 22:46:14,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-29 22:46:16,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:46:17,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 22:46:19,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:46:19,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:19,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:21,305 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.958e+02 2.156e+02 2.389e+02 4.797e+02, threshold=4.312e+02, percent-clipped=1.0 2023-09-29 22:46:21,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:46:21,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:46:22,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:46:23,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-29 22:46:23,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-29 22:46:24,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:46:25,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:46:26,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=513800.0, ans=0.125 2023-09-29 22:46:27,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:46:29,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:46:34,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:46:34,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-29 22:46:34,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:46:40,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:46:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:46:47,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:46:47,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:46:49,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:46:49,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:46:50,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:46:53,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:46:53,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:46:53,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:46:59,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:46:59,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:47:08,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:47:08,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:47:08,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=514000.0, ans=0.1 2023-09-29 22:47:12,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:47:12,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:14,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=514000.0, ans=0.0 2023-09-29 22:47:17,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:17,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:19,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:47:21,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:22,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:47:22,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:47:25,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:47:28,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:28,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:47:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-29 22:47:33,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:36,515 INFO [train.py:1039] (3/4) Epoch 15, batch 2750, loss[loss=0.1588, simple_loss=0.235, pruned_loss=0.04131, over 24362.00 frames. ], tot_loss[loss=0.1889, simple_loss=0.262, pruned_loss=0.05791, over 4704193.52 frames. ], batch size: 56, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:47:36,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:47:36,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-29 22:47:38,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-29 22:47:40,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:42,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:47:42,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:47:45,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:45,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-29 22:47:47,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:50,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:47:50,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 22:47:50,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=514133.3333333333, ans=0.0 2023-09-29 22:47:51,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:47:51,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:47:51,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-29 22:47:51,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:47:53,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:47:58,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-29 22:48:00,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:48:01,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:01,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:01,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:48:03,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:48:03,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:48:03,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:04,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:09,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 22:48:11,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:48:11,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 22:48:12,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:13,091 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=514266.6666666667, ans=10.0 2023-09-29 22:48:14,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:48:20,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:48:23,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 22:48:23,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:25,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=514333.3333333333, ans=0.125 2023-09-29 22:48:26,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:48:26,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:48:28,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:48:28,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=514333.3333333333, ans=0.0 2023-09-29 22:48:35,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:48:35,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:48:35,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-29 22:48:39,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:48:42,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-29 22:48:44,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=514400.0, ans=0.1 2023-09-29 22:48:50,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-29 22:48:50,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514400.0, ans=0.0 2023-09-29 22:48:51,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:48:51,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-29 22:48:52,635 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.75 vs. limit=12.0 2023-09-29 22:48:53,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:48:56,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:48:56,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-29 22:48:56,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:48:59,409 INFO [train.py:1039] (3/4) Epoch 15, batch 2800, loss[loss=0.2081, simple_loss=0.2657, pruned_loss=0.0753, over 23880.00 frames. ], tot_loss[loss=0.1881, simple_loss=0.2619, pruned_loss=0.05718, over 4722129.21 frames. ], batch size: 212, lr: 6.89e-03, grad_scale: 32.0 2023-09-29 22:48:59,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:48:59,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:00,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:02,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-29 22:49:02,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:02,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:05,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.12 vs. limit=15.0 2023-09-29 22:49:05,776 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.774e+02 1.954e+02 2.291e+02 3.351e+02, threshold=3.907e+02, percent-clipped=0.0 2023-09-29 22:49:05,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:07,380 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-29 22:49:07,381 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-29 22:49:09,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:10,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:49:10,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:49:15,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:49:17,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-29 22:49:19,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:49:20,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-29 22:49:22,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:22,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:49:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:27,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:28,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:49:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:49:28,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:49:39,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:49:39,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:49:42,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:49:42,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:49:44,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:49:48,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:49:48,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-29 22:49:50,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:51,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:49:51,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:49:54,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:49:56,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:00,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:50:01,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:50:01,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:01,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 22:50:01,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 22:50:03,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 22:50:04,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:50:04,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-29 22:50:04,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:05,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:50:05,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:07,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.84 vs. limit=12.0 2023-09-29 22:50:08,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-29 22:50:10,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:10,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:50:10,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:50:13,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-29 22:50:16,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=514733.3333333333, ans=0.1 2023-09-29 22:50:19,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:50:19,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:50:21,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:50:22,714 INFO [train.py:1039] (3/4) Epoch 15, batch 2850, loss[loss=0.1963, simple_loss=0.2857, pruned_loss=0.05344, over 24553.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2604, pruned_loss=0.05665, over 4714340.10 frames. ], batch size: 71, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:50:24,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:27,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:50:27,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:50:29,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:50:31,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:50:33,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:50:35,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:50:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-29 22:50:43,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-29 22:50:43,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:50:45,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-29 22:50:45,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:49,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-29 22:50:49,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-29 22:50:50,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:50:50,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=514866.6666666667, ans=0.125 2023-09-29 22:50:51,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=514866.6666666667, ans=0.0 2023-09-29 22:51:03,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:06,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:06,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-29 22:51:07,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 22:51:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 22:51:09,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-29 22:51:10,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:51:10,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-29 22:51:14,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-29 22:51:14,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:14,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:51:15,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:15,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:16,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=515000.0, ans=0.1 2023-09-29 22:51:17,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:51:18,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:20,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:51:23,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:51:24,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:24,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:27,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:51:33,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:51:35,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-29 22:51:35,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-29 22:51:38,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 22:51:38,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:38,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-29 22:51:38,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:51:39,307 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.92 vs. limit=15.0 2023-09-29 22:51:41,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:51:41,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:41,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-29 22:51:41,568 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-29 22:51:42,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-29 22:51:42,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:51:43,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:51:46,103 INFO [train.py:1039] (3/4) Epoch 15, batch 2900, loss[loss=0.1964, simple_loss=0.2637, pruned_loss=0.06453, over 22726.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2599, pruned_loss=0.05662, over 4715940.60 frames. ], batch size: 322, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:51:49,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:51:49,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:51:49,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:51:49,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=515133.3333333333, ans=0.0 2023-09-29 22:51:50,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-29 22:51:53,871 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.822e+02 2.046e+02 2.406e+02 3.211e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-29 22:51:54,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:51:54,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-29 22:51:55,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-29 22:51:56,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-29 22:51:57,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:51:59,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=515133.3333333333, ans=0.07 2023-09-29 22:52:00,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:02,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:52:03,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 22:52:05,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:52:09,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-29 22:52:10,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-29 22:52:10,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:52:12,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:15,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-29 22:52:15,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-29 22:52:20,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:52:20,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-29 22:52:20,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:52:23,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:52:23,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-29 22:52:26,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:52:26,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:28,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=515266.6666666667, ans=0.2 2023-09-29 22:52:31,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:52:32,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:52:33,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-29 22:52:33,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-29 22:52:33,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:52:38,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:52:40,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-29 22:52:41,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 22:52:43,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=515333.3333333333, ans=0.125 2023-09-29 22:52:47,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:52:56,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-29 22:52:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-29 22:52:58,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-29 22:53:01,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:01,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-29 22:53:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:02,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:53:07,968 INFO [train.py:1039] (3/4) Epoch 15, batch 2950, loss[loss=0.1913, simple_loss=0.2638, pruned_loss=0.05938, over 23674.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2615, pruned_loss=0.05684, over 4720420.39 frames. ], batch size: 232, lr: 6.89e-03, grad_scale: 16.0 2023-09-29 22:53:09,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:53:11,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-29 22:53:11,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=515466.6666666667, ans=0.0 2023-09-29 22:53:13,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:13,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:14,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:53:17,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:53:17,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-29 22:53:17,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-29 22:53:19,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 22:53:19,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:53:20,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=515466.6666666667, ans=0.0 2023-09-29 22:53:26,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:27,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:28,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.79 vs. limit=15.0 2023-09-29 22:53:29,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:53:30,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:34,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:53:34,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:53:35,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:53:37,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:53:40,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-29 22:53:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-29 22:53:45,696 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-29 22:53:45,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:53:47,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-29 22:53:49,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-29 22:53:49,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:53:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-29 22:53:50,899 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-29 22:53:50,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-29 22:53:54,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-29 22:53:56,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:53:58,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-29 22:53:59,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:01,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 22:54:02,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:02,965 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-29 22:54:04,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:54:04,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-29 22:54:10,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:10,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:12,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-29 22:54:12,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:54:14,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-29 22:54:17,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:20,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:54:20,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:54:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:54:22,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 22:54:23,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:54:25,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:25,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:54:25,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-29 22:54:26,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:54:27,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:54:29,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:29,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-29 22:54:31,022 INFO [train.py:1039] (3/4) Epoch 15, batch 3000, loss[loss=0.1983, simple_loss=0.2634, pruned_loss=0.06663, over 23785.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2618, pruned_loss=0.05689, over 4731405.15 frames. ], batch size: 232, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:54:31,023 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 22:54:45,821 INFO [train.py:1071] (3/4) Epoch 15, validation: loss=0.2711, simple_loss=0.2767, pruned_loss=0.1327, over 1125622.00 frames. 2023-09-29 22:54:45,822 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 22:54:46,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:54:51,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:54:51,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-29 22:54:53,998 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.966e+02 2.278e+02 2.682e+02 4.156e+02, threshold=4.556e+02, percent-clipped=1.0 2023-09-29 22:54:54,237 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-29 22:54:54,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-29 22:54:57,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:54:57,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:54:57,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-29 22:54:57,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:06,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 22:55:07,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=515866.6666666667, ans=0.125 2023-09-29 22:55:08,073 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=515866.6666666667, ans=0.125 2023-09-29 22:55:11,072 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=515866.6666666667, ans=0.125 2023-09-29 22:55:16,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:55:17,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.61 vs. limit=10.0 2023-09-29 22:55:21,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-29 22:55:23,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-29 22:55:25,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:55:27,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:55:27,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:55:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:28,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-29 22:55:29,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=515933.3333333333, ans=0.125 2023-09-29 22:55:33,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-29 22:55:33,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:55:35,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 22:55:37,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 22:55:37,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:39,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:39,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:55:42,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 22:55:42,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:55:42,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-29 22:55:45,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:55:46,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-29 22:55:48,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-29 22:55:48,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:55:48,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 22:55:51,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:53,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:55:54,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-29 22:55:54,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-29 22:55:54,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:55:54,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-29 22:55:55,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=516066.6666666667, ans=0.09899494936611666 2023-09-29 22:55:56,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 22:55:58,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-29 22:56:02,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:02,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 22:56:02,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-29 22:56:05,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-29 22:56:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 22:56:07,789 INFO [train.py:1039] (3/4) Epoch 15, batch 3050, loss[loss=0.1902, simple_loss=0.2591, pruned_loss=0.06066, over 23473.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2621, pruned_loss=0.05772, over 4715891.39 frames. ], batch size: 134, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:56:07,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:56:10,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:56:10,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-29 22:56:10,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:11,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:56:13,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-29 22:56:15,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:18,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:18,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 22:56:21,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:24,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-29 22:56:29,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-29 22:56:29,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-29 22:56:31,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:31,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=516200.0, ans=0.1 2023-09-29 22:56:36,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-29 22:56:37,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:37,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:38,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-09-29 22:56:39,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:44,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:56:45,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-29 22:56:47,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:47,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:56:47,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:56:47,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:50,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:56:52,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:56:54,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-29 22:56:55,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:56:55,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 22:56:57,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:56:58,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.77 vs. limit=15.0 2023-09-29 22:56:58,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 22:56:58,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:00,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:06,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:57:06,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:13,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:14,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:57:14,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:57:16,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:16,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 22:57:16,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-29 22:57:18,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-29 22:57:19,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 22:57:19,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:20,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-29 22:57:23,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:29,730 INFO [train.py:1039] (3/4) Epoch 15, batch 3100, loss[loss=0.204, simple_loss=0.2478, pruned_loss=0.0801, over 19701.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2617, pruned_loss=0.05773, over 4695832.74 frames. ], batch size: 388, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:57:29,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:57:31,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 22:57:33,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 22:57:33,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=516466.6666666667, ans=0.125 2023-09-29 22:57:34,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-29 22:57:37,779 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.874e+02 2.072e+02 2.284e+02 2.890e+02, threshold=4.143e+02, percent-clipped=0.0 2023-09-29 22:57:37,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-29 22:57:40,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-29 22:57:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 22:57:44,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:57:46,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:47,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-29 22:57:54,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:57:58,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-29 22:58:05,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:58:05,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:05,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:07,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:08,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-29 22:58:10,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:58:10,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-29 22:58:10,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 22:58:10,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:11,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-29 22:58:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:58:16,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-29 22:58:17,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-29 22:58:20,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-29 22:58:21,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:21,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:58:23,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:23,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:24,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:58:26,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-29 22:58:26,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 22:58:29,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:58:29,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:58:29,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:29,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 22:58:34,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:58:35,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-29 22:58:37,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-29 22:58:38,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-29 22:58:40,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:58:40,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:41,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-29 22:58:45,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=516733.3333333333, ans=0.125 2023-09-29 22:58:46,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=516733.3333333333, ans=0.125 2023-09-29 22:58:49,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=516800.0, ans=0.125 2023-09-29 22:58:50,880 INFO [train.py:1039] (3/4) Epoch 15, batch 3150, loss[loss=0.1796, simple_loss=0.2536, pruned_loss=0.05279, over 23244.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2601, pruned_loss=0.057, over 4707016.01 frames. ], batch size: 105, lr: 6.88e-03, grad_scale: 16.0 2023-09-29 22:58:51,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-29 22:58:52,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:58:54,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:58:56,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 22:58:56,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-29 22:58:59,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-29 22:59:00,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:00,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-29 22:59:00,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-29 22:59:03,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:05,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-29 22:59:05,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=516800.0, ans=0.125 2023-09-29 22:59:10,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-29 22:59:10,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:11,724 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-29 22:59:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-29 22:59:13,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-29 22:59:14,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-29 22:59:14,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-29 22:59:14,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:14,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:16,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-29 22:59:19,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-29 22:59:20,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-29 22:59:22,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:23,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-29 22:59:27,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-29 22:59:27,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-29 22:59:29,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-29 22:59:31,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 22:59:31,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-29 22:59:34,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-29 22:59:34,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 22:59:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 22:59:36,379 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 22:59:36,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:36,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 22:59:38,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-29 22:59:38,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-29 22:59:40,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-29 22:59:40,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 22:59:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:43,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-29 22:59:43,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 22:59:44,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-29 22:59:44,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:46,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-29 22:59:46,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:47,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-29 22:59:49,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-29 22:59:49,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 22:59:51,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 22:59:51,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-29 22:59:52,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 22:59:52,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 22:59:57,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 22:59:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-29 22:59:58,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 22:59:58,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=517066.6666666667, ans=0.015 2023-09-29 23:00:03,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.25 vs. limit=15.0 2023-09-29 23:00:05,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:00:06,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:09,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-29 23:00:11,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=517066.6666666667, ans=0.125 2023-09-29 23:00:12,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:00:12,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-29 23:00:14,366 INFO [train.py:1039] (3/4) Epoch 15, batch 3200, loss[loss=0.1646, simple_loss=0.2409, pruned_loss=0.04413, over 24443.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2589, pruned_loss=0.05665, over 4706378.91 frames. ], batch size: 58, lr: 6.87e-03, grad_scale: 32.0 2023-09-29 23:00:16,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:17,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:00:17,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-29 23:00:20,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:00:22,256 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.845e+02 1.990e+02 2.356e+02 4.554e+02, threshold=3.981e+02, percent-clipped=2.0 2023-09-29 23:00:22,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=517133.3333333333, ans=0.125 2023-09-29 23:00:24,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:00:28,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:00:30,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=517200.0, ans=0.0 2023-09-29 23:00:38,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:00:49,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-29 23:00:52,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:00:55,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-29 23:00:56,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:01:01,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:01:01,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:01:03,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:01:04,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-29 23:01:07,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-29 23:01:07,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-29 23:01:12,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-29 23:01:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:01:22,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:22,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:01:22,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:23,815 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-29 23:01:23,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:01:27,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:27,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-29 23:01:27,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=517400.0, ans=0.125 2023-09-29 23:01:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-29 23:01:28,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-29 23:01:30,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-29 23:01:32,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:01:32,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=517400.0, ans=0.035 2023-09-29 23:01:33,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:01:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-29 23:01:35,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:01:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:01:36,529 INFO [train.py:1039] (3/4) Epoch 15, batch 3250, loss[loss=0.1927, simple_loss=0.2694, pruned_loss=0.05802, over 23225.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2589, pruned_loss=0.05663, over 4709186.13 frames. ], batch size: 105, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:01:36,683 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-29 23:01:41,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=517466.6666666667, ans=0.2 2023-09-29 23:01:43,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:01:45,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:01:54,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:01:54,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-29 23:01:54,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:01:55,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:01:55,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:01:57,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:01:57,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:02:00,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:00,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:02:01,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:01,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:01,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:03,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:02:09,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:09,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:02:11,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:02:11,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:02:11,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:16,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-29 23:02:18,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:02:18,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:02:20,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:20,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:02:27,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=517666.6666666667, ans=0.0 2023-09-29 23:02:28,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:02:38,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:38,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:38,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-29 23:02:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:02:38,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:02:39,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:02:39,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=517666.6666666667, ans=0.125 2023-09-29 23:02:41,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-29 23:02:42,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-29 23:02:42,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:02:44,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:02:44,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:02:45,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:02:49,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:02:49,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:02:51,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-29 23:02:51,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:02:53,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:02:53,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-29 23:02:53,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=517733.3333333333, ans=0.125 2023-09-29 23:02:58,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:02:58,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-29 23:02:59,925 INFO [train.py:1039] (3/4) Epoch 15, batch 3300, loss[loss=0.2043, simple_loss=0.2717, pruned_loss=0.06842, over 23618.00 frames. ], tot_loss[loss=0.187, simple_loss=0.26, pruned_loss=0.05698, over 4723098.89 frames. ], batch size: 256, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:03:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-29 23:03:02,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=517800.0, ans=0.125 2023-09-29 23:03:03,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-29 23:03:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:03,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=517800.0, ans=0.125 2023-09-29 23:03:08,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:03:09,625 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.939e+02 2.168e+02 2.538e+02 3.579e+02, threshold=4.337e+02, percent-clipped=0.0 2023-09-29 23:03:09,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:03:09,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:11,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:03:11,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:03:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:18,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:03:22,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-29 23:03:22,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:22,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:23,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:23,858 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-29 23:03:25,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:03:26,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:03:28,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:03:28,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:03:28,270 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-29 23:03:32,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:32,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:03:34,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:34,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-29 23:03:36,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:03:36,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:38,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:03:41,191 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-29 23:03:42,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-29 23:03:44,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:03:47,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-29 23:03:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:03:50,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:03:50,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:03:52,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:03:53,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:03:53,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:03:53,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:03:55,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:03:55,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:03:56,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:03:58,500 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:03:59,886 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-29 23:04:01,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-29 23:04:03,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:04:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:05,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:07,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:04:07,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:11,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:04:11,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:11,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:04:12,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:04:13,055 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:04:13,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=518066.6666666667, ans=0.0 2023-09-29 23:04:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:04:15,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-29 23:04:17,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:18,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:20,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:04:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:04:21,785 INFO [train.py:1039] (3/4) Epoch 15, batch 3350, loss[loss=0.1854, simple_loss=0.2574, pruned_loss=0.0567, over 23627.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2605, pruned_loss=0.05684, over 4724135.71 frames. ], batch size: 149, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:04:21,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:24,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:04:24,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:26,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:04:29,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:30,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:04:32,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:35,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:04:37,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:39,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:04:40,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-29 23:04:42,726 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-29 23:04:42,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:04:43,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=518200.0, ans=0.125 2023-09-29 23:04:47,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-29 23:04:47,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-29 23:04:48,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:04:48,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:04:50,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:04:50,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-29 23:04:50,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:51,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:04:52,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:54,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:04:55,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:04:56,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:04:59,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:04:59,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:01,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:05,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:05:07,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:10,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:11,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:15,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-29 23:05:15,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:05:16,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-29 23:05:16,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:05:17,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-29 23:05:19,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:21,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:05:28,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:29,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-29 23:05:31,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:05:31,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:05:32,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:05:39,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-29 23:05:42,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:05:42,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:05:42,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:05:43,762 INFO [train.py:1039] (3/4) Epoch 15, batch 3400, loss[loss=0.2041, simple_loss=0.2705, pruned_loss=0.06884, over 22689.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2614, pruned_loss=0.05747, over 4729042.25 frames. ], batch size: 322, lr: 6.87e-03, grad_scale: 16.0 2023-09-29 23:05:43,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-29 23:05:43,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:05:43,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-29 23:05:46,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:46,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:05:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:05:48,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:05:49,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-29 23:05:54,288 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.906e+02 2.208e+02 2.568e+02 3.814e+02, threshold=4.417e+02, percent-clipped=0.0 2023-09-29 23:05:54,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-29 23:05:54,442 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-29 23:05:54,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:05:56,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=518466.6666666667, ans=0.0 2023-09-29 23:05:59,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:05:59,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:06:00,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:02,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:06:06,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:09,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-29 23:06:14,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:06:16,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:16,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:16,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:06:26,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:06:31,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-29 23:06:37,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:38,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:06:39,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-29 23:06:39,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:06:39,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.max_abs, batch_count=518666.6666666667, ans=10.0 2023-09-29 23:06:40,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.70 vs. limit=6.0 2023-09-29 23:06:40,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:06:41,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:06:42,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:06:46,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:06:48,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:06:48,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:06:48,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=518733.3333333333, ans=0.0 2023-09-29 23:06:54,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:06:56,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-29 23:07:02,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:07:05,198 INFO [train.py:1039] (3/4) Epoch 15, batch 3450, loss[loss=0.1795, simple_loss=0.2361, pruned_loss=0.0614, over 23590.00 frames. ], tot_loss[loss=0.1878, simple_loss=0.2613, pruned_loss=0.05718, over 4726656.09 frames. ], batch size: 256, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:07:05,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-29 23:07:09,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-29 23:07:10,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:11,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=518800.0, ans=0.0 2023-09-29 23:07:13,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:07:13,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-29 23:07:13,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:07:16,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:07:19,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=518866.6666666667, ans=0.1 2023-09-29 23:07:21,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:07:22,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:23,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:07:23,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:07:33,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-29 23:07:39,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-29 23:07:39,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:07:39,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:07:40,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:07:45,699 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=518933.3333333333, ans=0.09899494936611666 2023-09-29 23:07:46,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-29 23:07:48,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:07:53,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:07:53,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:07:53,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:07:55,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:07:57,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-29 23:07:57,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:07:59,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:08:02,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:04,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-29 23:08:10,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:08:16,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:08:17,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:18,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=519066.6666666667, ans=0.1 2023-09-29 23:08:19,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:21,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:22,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:08:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:08:23,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:08:25,820 INFO [train.py:1039] (3/4) Epoch 15, batch 3500, loss[loss=0.171, simple_loss=0.2281, pruned_loss=0.05697, over 23646.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2605, pruned_loss=0.05681, over 4731877.38 frames. ], batch size: 256, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:08:28,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:32,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:08:33,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-29 23:08:35,733 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.501e+02 1.923e+02 2.112e+02 2.557e+02 4.010e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:08:35,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:08:39,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:08:43,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:08:43,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-29 23:08:47,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:08:49,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:08:49,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=519200.0, ans=0.125 2023-09-29 23:08:50,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:08:50,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:08:50,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:08:52,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:52,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:08:52,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-29 23:08:54,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:08:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:08:55,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:00,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:00,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=519266.6666666667, ans=0.0 2023-09-29 23:09:01,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-29 23:09:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:09:02,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=519266.6666666667, ans=0.125 2023-09-29 23:09:04,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:09:05,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=519266.6666666667, ans=0.125 2023-09-29 23:09:06,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:09:07,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:10,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=519266.6666666667, ans=0.1 2023-09-29 23:09:11,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:09:11,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:12,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-29 23:09:15,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-29 23:09:15,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-29 23:09:16,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:09:18,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:18,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:19,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:09:21,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:09:21,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:09:26,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:09:27,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-29 23:09:27,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-29 23:09:27,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:09:30,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:32,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:33,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:37,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-29 23:09:37,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=519400.0, ans=0.1 2023-09-29 23:09:38,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:09:40,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:09:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-29 23:09:43,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-29 23:09:44,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=519400.0, ans=0.2 2023-09-29 23:09:47,178 INFO [train.py:1039] (3/4) Epoch 15, batch 3550, loss[loss=0.1948, simple_loss=0.2696, pruned_loss=0.05995, over 23326.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.26, pruned_loss=0.05672, over 4738868.41 frames. ], batch size: 119, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:09:47,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:09:47,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:09:48,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:09:48,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:09:54,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:10:03,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:05,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-29 23:10:08,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:09,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:10:11,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:12,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:10:12,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:10:15,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:15,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:10:18,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:18,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:10:18,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:10:18,959 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.51 vs. limit=15.0 2023-09-29 23:10:20,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=519600.0, ans=0.0 2023-09-29 23:10:25,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:10:25,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:10:27,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:28,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:10:29,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:10:29,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-29 23:10:29,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:31,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:10:32,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-29 23:10:32,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=519600.0, ans=0.0 2023-09-29 23:10:34,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=519600.0, ans=0.125 2023-09-29 23:10:38,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:38,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:10:40,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:10:41,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-29 23:10:41,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:10:43,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-29 23:10:44,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:10:46,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:10:47,208 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.98 vs. limit=22.5 2023-09-29 23:10:47,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:10:49,411 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-29 23:10:51,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:00,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-29 23:11:01,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:03,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:11:05,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-29 23:11:09,414 INFO [train.py:1039] (3/4) Epoch 15, batch 3600, loss[loss=0.207, simple_loss=0.2714, pruned_loss=0.07129, over 22793.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.26, pruned_loss=0.05637, over 4746319.61 frames. ], batch size: 322, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:11:12,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-29 23:11:12,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:11:14,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:11:15,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:11:17,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:11:19,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=519800.0, ans=0.125 2023-09-29 23:11:20,428 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.599e+02 1.981e+02 2.241e+02 2.559e+02 3.675e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-29 23:11:20,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:22,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:23,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:11:25,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:11:25,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:25,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-29 23:11:30,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:11:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:35,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:38,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:11:39,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=519866.6666666667, ans=0.125 2023-09-29 23:11:40,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:11:40,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-29 23:11:40,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:11:43,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:11:43,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:11:45,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:11:46,079 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.06 vs. limit=22.5 2023-09-29 23:11:47,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=519933.3333333333, ans=0.2 2023-09-29 23:11:48,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:11:49,310 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.94 vs. limit=15.0 2023-09-29 23:11:49,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:11:50,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=519933.3333333333, ans=0.0 2023-09-29 23:11:51,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-29 23:11:57,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:00,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:12:00,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-29 23:12:07,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:12:13,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:16,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:22,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:12:22,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:12:22,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-29 23:12:24,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-29 23:12:26,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-29 23:12:27,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:12:29,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:12:30,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-29 23:12:30,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:12:30,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:12:30,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:12:32,140 INFO [train.py:1039] (3/4) Epoch 15, batch 3650, loss[loss=0.1953, simple_loss=0.2602, pruned_loss=0.06515, over 23455.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2601, pruned_loss=0.05683, over 4717349.08 frames. ], batch size: 285, lr: 6.86e-03, grad_scale: 16.0 2023-09-29 23:12:32,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-29 23:12:33,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-29 23:12:35,634 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=520133.3333333333, ans=0.2 2023-09-29 23:12:37,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:12:40,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-29 23:12:43,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-29 23:12:44,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:12:47,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.15 vs. limit=15.0 2023-09-29 23:12:48,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-29 23:12:50,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-29 23:12:54,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:12:54,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:12:55,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:12:59,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-29 23:12:59,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:13:01,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-29 23:13:01,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:13:02,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:02,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-29 23:13:04,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:13:04,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=520266.6666666667, ans=0.125 2023-09-29 23:13:05,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:05,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:07,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:13:11,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-29 23:13:13,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-29 23:13:14,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:13:16,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-29 23:13:18,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:18,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:13:23,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:13:25,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:26,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:13:28,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:13:28,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:13:31,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:13:31,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=520333.3333333333, ans=10.0 2023-09-29 23:13:34,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:13:35,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:35,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:13:37,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:13:37,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:13:39,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:45,915 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=520400.0, ans=0.125 2023-09-29 23:13:47,158 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-29 23:13:50,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:13:51,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:13:53,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:13:53,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:13:54,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:13:55,840 INFO [train.py:1039] (3/4) Epoch 15, batch 3700, loss[loss=0.2115, simple_loss=0.2799, pruned_loss=0.07151, over 23456.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.261, pruned_loss=0.05686, over 4728296.51 frames. ], batch size: 285, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:13:57,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:13:57,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-29 23:13:58,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:01,851 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.32 vs. limit=15.0 2023-09-29 23:14:01,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=520466.6666666667, ans=22.5 2023-09-29 23:14:02,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:14:03,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=520466.6666666667, ans=0.125 2023-09-29 23:14:04,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:14:04,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:14:07,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.795e+02 1.978e+02 2.320e+02 3.492e+02, threshold=3.956e+02, percent-clipped=0.0 2023-09-29 23:14:07,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:07,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-29 23:14:07,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:14:08,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:14:09,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:14:10,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:14:13,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:14:15,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:16,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:14:16,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:14:18,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:14:20,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:23,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-29 23:14:30,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:14:30,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:14:31,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:14:32,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-29 23:14:32,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:37,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:39,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-29 23:14:39,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:40,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:14:43,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:14:43,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:14:46,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:14:53,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:14:53,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-29 23:14:53,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:14:53,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-29 23:14:55,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=520666.6666666667, ans=0.0 2023-09-29 23:14:58,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:14:59,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:15:02,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:04,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-29 23:15:05,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:15:05,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:15:05,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:05,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:09,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:15:11,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-29 23:15:12,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-29 23:15:13,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:15:13,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:15,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:15:17,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:15:18,596 INFO [train.py:1039] (3/4) Epoch 15, batch 3750, loss[loss=0.2017, simple_loss=0.2662, pruned_loss=0.06856, over 23782.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2621, pruned_loss=0.05697, over 4737433.45 frames. ], batch size: 179, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:15:20,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:15:21,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:15:23,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:15:25,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-29 23:15:25,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:15:28,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:15:28,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-29 23:15:28,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:15:30,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:31,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:15:34,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:15:39,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:43,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:15:43,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:15:43,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.06 vs. limit=12.0 2023-09-29 23:15:44,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=520866.6666666667, ans=0.0 2023-09-29 23:15:45,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=520866.6666666667, ans=15.0 2023-09-29 23:15:46,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:15:49,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:15:50,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-29 23:15:50,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:15:52,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:15:53,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:15:58,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-29 23:16:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-29 23:16:02,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:16:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:16:05,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:05,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=520933.3333333333, ans=0.125 2023-09-29 23:16:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:12,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:16:17,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-29 23:16:17,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=521000.0, ans=0.125 2023-09-29 23:16:19,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:22,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:16:23,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:16:25,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:16:29,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-29 23:16:30,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:16:33,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:16:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:16:36,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:16:37,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=521066.6666666667, ans=0.125 2023-09-29 23:16:42,048 INFO [train.py:1039] (3/4) Epoch 15, batch 3800, loss[loss=0.1765, simple_loss=0.2345, pruned_loss=0.05921, over 22798.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.2625, pruned_loss=0.0574, over 4746318.14 frames. ], batch size: 322, lr: 6.85e-03, grad_scale: 16.0 2023-09-29 23:16:48,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:16:52,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:16:53,959 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.649e+02 1.926e+02 2.197e+02 2.572e+02 3.793e+02, threshold=4.394e+02, percent-clipped=0.0 2023-09-29 23:16:54,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-29 23:16:55,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-29 23:16:57,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:16:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:16:58,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:17:01,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:17:01,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:03,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:17:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:17:05,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:17:05,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:06,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-29 23:17:09,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-29 23:17:11,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:17:14,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:16,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:17:16,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:17:20,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:17:20,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:22,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:23,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:17:26,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=521266.6666666667, ans=0.125 2023-09-29 23:17:28,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:17:28,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-29 23:17:32,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:37,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:17:42,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:17:43,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-29 23:17:45,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-29 23:17:45,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:17:48,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:17:48,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=521400.0, ans=0.1 2023-09-29 23:17:50,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:52,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-29 23:17:56,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-29 23:17:56,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-29 23:17:56,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:17:58,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:18:02,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:18:04,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:18:05,915 INFO [train.py:1039] (3/4) Epoch 15, batch 3850, loss[loss=0.1604, simple_loss=0.2414, pruned_loss=0.03972, over 24477.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2609, pruned_loss=0.05698, over 4726873.12 frames. ], batch size: 63, lr: 6.85e-03, grad_scale: 8.0 2023-09-29 23:18:08,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:18:09,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-29 23:18:11,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:18:11,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:14,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:18:16,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=521466.6666666667, ans=0.125 2023-09-29 23:18:17,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:20,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:18:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-29 23:18:29,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:32,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:18:34,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:34,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:18:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:38,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:18:40,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:18:42,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:18:42,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:43,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:18:44,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=521600.0, ans=0.125 2023-09-29 23:18:45,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:45,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:18:46,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-29 23:18:46,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-29 23:18:46,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:18:47,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:50,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:50,816 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.26 vs. limit=22.5 2023-09-29 23:18:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:18:51,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-29 23:18:53,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-29 23:18:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:18:57,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-29 23:18:59,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-29 23:19:04,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:06,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:19:09,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=521666.6666666667, ans=0.0 2023-09-29 23:19:09,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=521666.6666666667, ans=0.0 2023-09-29 23:19:10,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:10,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-29 23:19:11,976 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.57 vs. limit=15.0 2023-09-29 23:19:14,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-29 23:19:17,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:17,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:19,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=521733.3333333333, ans=0.1 2023-09-29 23:19:22,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:19:22,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:19:22,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:23,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:19:23,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-29 23:19:25,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:19:25,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-29 23:19:27,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:27,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:28,424 INFO [train.py:1039] (3/4) Epoch 15, batch 3900, loss[loss=0.1899, simple_loss=0.2745, pruned_loss=0.05258, over 24660.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2599, pruned_loss=0.05644, over 4731490.20 frames. ], batch size: 68, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:19:30,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:19:30,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:30,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=521800.0, ans=0.0 2023-09-29 23:19:32,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:19:32,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:19:32,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:19:33,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:33,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-29 23:19:33,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:38,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:40,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:40,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:19:41,905 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.819e+02 2.036e+02 2.423e+02 3.835e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-29 23:19:42,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:19:45,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:19:45,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:47,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:19:49,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-29 23:19:49,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:19:49,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.40 vs. limit=22.5 2023-09-29 23:19:50,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-29 23:19:50,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:19:50,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-29 23:19:52,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-29 23:19:55,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=521866.6666666667, ans=0.125 2023-09-29 23:19:57,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:19:58,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:19:58,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:20:00,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:05,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:20:06,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:20:09,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:20:09,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:10,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:20:12,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=521933.3333333333, ans=0.2 2023-09-29 23:20:15,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=521933.3333333333, ans=0.04949747468305833 2023-09-29 23:20:18,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:18,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:20:20,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=522000.0, ans=0.125 2023-09-29 23:20:25,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:20:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:20:39,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:20:43,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-29 23:20:44,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-29 23:20:44,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:20:46,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-29 23:20:47,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:20:49,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-29 23:20:51,225 INFO [train.py:1039] (3/4) Epoch 15, batch 3950, loss[loss=0.1741, simple_loss=0.2619, pruned_loss=0.04309, over 24639.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2604, pruned_loss=0.0564, over 4737478.67 frames. ], batch size: 73, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:20:51,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=522133.3333333333, ans=0.1 2023-09-29 23:20:54,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=522133.3333333333, ans=0.0 2023-09-29 23:20:56,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:20:58,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-29 23:20:58,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:21:00,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=15.0 2023-09-29 23:21:01,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:21:03,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:21:07,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.87 vs. limit=22.5 2023-09-29 23:21:08,305 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-29 23:21:08,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:10,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-29 23:21:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-29 23:21:10,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:13,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:14,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:21:14,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:21:16,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-29 23:21:17,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.63 vs. limit=22.5 2023-09-29 23:21:19,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:21:19,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:21:19,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:21:21,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:21:21,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:21:23,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=522266.6666666667, ans=0.125 2023-09-29 23:21:33,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:21:33,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:21:41,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-29 23:21:41,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=522333.3333333333, ans=0.125 2023-09-29 23:21:45,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=9.13 vs. limit=15.0 2023-09-29 23:21:47,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-29 23:21:47,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-29 23:21:47,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:21:48,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=522333.3333333333, ans=0.5 2023-09-29 23:21:49,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:21:54,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:21:54,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:21:56,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:21:56,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:21:56,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-29 23:22:03,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:22:04,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:22:04,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=522400.0, ans=0.2 2023-09-29 23:22:07,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-29 23:22:15,385 INFO [train.py:1039] (3/4) Epoch 15, batch 4000, loss[loss=0.1888, simple_loss=0.2657, pruned_loss=0.05591, over 24027.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2614, pruned_loss=0.05691, over 4734429.16 frames. ], batch size: 80, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:22:16,365 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.15 vs. limit=15.0 2023-09-29 23:22:17,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:19,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=522466.6666666667, ans=0.5 2023-09-29 23:22:24,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:28,374 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.844e+02 2.082e+02 2.375e+02 3.458e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-29 23:22:29,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:30,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:22:32,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:22:32,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-29 23:22:33,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:22:35,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-29 23:22:35,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:22:35,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-29 23:22:36,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:22:40,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:22:40,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:22:40,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:22:42,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:22:42,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:22:44,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:22:47,003 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-29 23:22:48,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:22:48,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:22:50,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=522600.0, ans=0.1 2023-09-29 23:22:51,643 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-29 23:22:53,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:22:53,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:22:57,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=522600.0, ans=0.125 2023-09-29 23:22:59,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-29 23:23:01,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:23:03,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:23:04,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-29 23:23:04,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:23:06,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-29 23:23:06,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:23:06,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:08,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:23:11,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:23:11,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:23:11,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:23:12,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=522666.6666666667, ans=0.0 2023-09-29 23:23:13,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-29 23:23:13,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:23:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-29 23:23:21,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:23:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-29 23:23:25,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:23:27,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:28,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:23:29,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:23:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:23:37,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:23:37,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-29 23:23:38,536 INFO [train.py:1039] (3/4) Epoch 15, batch 4050, loss[loss=0.192, simple_loss=0.2774, pruned_loss=0.0533, over 24067.00 frames. ], tot_loss[loss=0.1883, simple_loss=0.262, pruned_loss=0.05729, over 4720322.81 frames. ], batch size: 80, lr: 6.84e-03, grad_scale: 16.0 2023-09-29 23:23:38,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:23:38,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:23:40,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:23:42,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:23:42,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:47,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:23:50,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:23:50,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-29 23:23:54,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:23:55,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:24:00,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:02,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:24:07,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-29 23:24:07,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.73 vs. limit=15.0 2023-09-29 23:24:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-29 23:24:10,088 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-29 23:24:10,733 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.82 vs. limit=15.0 2023-09-29 23:24:11,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:24:19,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-29 23:24:19,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:24,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:27,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:24:27,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:24:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:24:32,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:24:36,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-29 23:24:36,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:24:38,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:40,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-29 23:24:44,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:24:46,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.46 vs. limit=22.5 2023-09-29 23:24:50,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-29 23:24:53,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:24:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:24:54,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=523066.6666666667, ans=0.125 2023-09-29 23:24:55,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-29 23:24:55,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-29 23:24:55,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:24:57,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:24:59,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:24:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:25:00,922 INFO [train.py:1039] (3/4) Epoch 15, batch 4100, loss[loss=0.1578, simple_loss=0.2334, pruned_loss=0.04113, over 24295.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2613, pruned_loss=0.05626, over 4722093.86 frames. ], batch size: 56, lr: 6.84e-03, grad_scale: 8.0 2023-09-29 23:25:06,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-29 23:25:08,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-29 23:25:10,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-29 23:25:10,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=523133.3333333333, ans=0.2 2023-09-29 23:25:12,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-29 23:25:12,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:13,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:13,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:25:13,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=523133.3333333333, ans=0.0 2023-09-29 23:25:15,163 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-29 23:25:16,444 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.960e+02 2.243e+02 2.866e+02 4.978e+02, threshold=4.486e+02, percent-clipped=4.0 2023-09-29 23:25:18,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:18,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:25:18,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:25:19,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:25:22,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:25:23,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=523200.0, ans=0.125 2023-09-29 23:25:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:25:24,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:25:24,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-29 23:25:24,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:25,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:25:25,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:25,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:25:26,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-29 23:25:29,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:32,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-29 23:25:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:25:36,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:25:36,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-29 23:25:37,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=523266.6666666667, ans=0.2 2023-09-29 23:25:38,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:25:40,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:25:40,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:25:41,099 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.18 vs. limit=15.0 2023-09-29 23:25:41,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-29 23:25:43,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:25:43,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:25:47,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-29 23:25:48,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:25:48,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:25:51,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:25:58,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:01,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:02,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:26:06,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.60 vs. limit=22.5 2023-09-29 23:26:08,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten.whitening_limit, batch_count=523400.0, ans=22.5 2023-09-29 23:26:13,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:13,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:26:18,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:26:19,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:26:20,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=523400.0, ans=0.125 2023-09-29 23:26:23,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:26:25,105 INFO [train.py:1039] (3/4) Epoch 15, batch 4150, loss[loss=0.1778, simple_loss=0.2579, pruned_loss=0.04886, over 24676.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2616, pruned_loss=0.05694, over 4717890.54 frames. ], batch size: 65, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:26:26,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:26:26,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:26:26,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:27,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=523466.6666666667, ans=0.2 2023-09-29 23:26:29,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-29 23:26:29,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:31,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-29 23:26:31,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-29 23:26:32,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-29 23:26:34,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:26:40,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:26:40,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:26:45,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:26:47,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:26:47,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:26:50,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:26:51,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:26:53,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:26:57,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:01,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:03,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-29 23:27:04,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=523600.0, ans=0.125 2023-09-29 23:27:05,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-29 23:27:05,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:27:07,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-29 23:27:07,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:27:07,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:08,959 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=523600.0, ans=0.125 2023-09-29 23:27:10,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:13,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-29 23:27:15,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:19,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:27:19,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-29 23:27:19,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:27:20,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-29 23:27:21,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=523666.6666666667, ans=0.125 2023-09-29 23:27:23,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:27:26,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:27:26,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=523666.6666666667, ans=0.0 2023-09-29 23:27:27,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:28,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=523666.6666666667, ans=0.2 2023-09-29 23:27:29,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-29 23:27:29,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:27:29,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-29 23:27:30,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:27:32,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-29 23:27:34,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:34,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:27:34,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:27:34,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-29 23:27:35,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:27:36,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-29 23:27:36,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:27:38,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:27:39,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-29 23:27:40,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-29 23:27:46,931 INFO [train.py:1039] (3/4) Epoch 15, batch 4200, loss[loss=0.1642, simple_loss=0.2396, pruned_loss=0.04447, over 24430.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2599, pruned_loss=0.05638, over 4705816.73 frames. ], batch size: 58, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:27:47,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:27:49,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-29 23:27:52,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:27:54,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:27:55,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:27:56,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:56,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:27:58,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-29 23:28:01,700 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.915e+02 2.061e+02 2.276e+02 4.406e+02, threshold=4.122e+02, percent-clipped=0.0 2023-09-29 23:28:02,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-29 23:28:02,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:03,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=523866.6666666667, ans=0.2 2023-09-29 23:28:05,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:05,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=523866.6666666667, ans=0.0 2023-09-29 23:28:08,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:28:09,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:28:11,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:11,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:13,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-29 23:28:13,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:28:14,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:15,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:28:15,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:28:16,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:28:17,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-09-29 23:28:21,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-29 23:28:21,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:28:21,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=523933.3333333333, ans=0.04949747468305833 2023-09-29 23:28:23,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=523933.3333333333, ans=0.0 2023-09-29 23:28:25,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=523933.3333333333, ans=10.0 2023-09-29 23:28:26,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:28:28,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:28:29,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:28:31,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:28:33,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:28:33,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-29 23:28:33,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:35,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:28:35,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.max_positive, batch_count=524000.0, ans=0.95 2023-09-29 23:28:40,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:28:43,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:28:49,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:28:52,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-29 23:28:55,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:28:56,768 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:28:59,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:29:01,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:01,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=524066.6666666667, ans=0.125 2023-09-29 23:29:02,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-29 23:29:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:29:09,503 INFO [train.py:1039] (3/4) Epoch 15, batch 4250, loss[loss=0.192, simple_loss=0.2797, pruned_loss=0.05217, over 24671.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2585, pruned_loss=0.05611, over 4702085.31 frames. ], batch size: 73, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:29:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:29:12,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-29 23:29:15,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:20,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:29:20,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-29 23:29:22,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:29:24,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:27,348 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.44 vs. limit=10.0 2023-09-29 23:29:27,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:28,235 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=524200.0, ans=0.0 2023-09-29 23:29:34,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:34,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:34,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:29:34,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:29:36,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:37,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:39,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:42,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:29:44,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:29:45,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-29 23:29:48,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-29 23:29:48,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:49,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:29:49,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:29:50,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:29:50,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:29:52,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:29:54,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=524266.6666666667, ans=0.125 2023-09-29 23:29:55,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:29:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:30:02,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:04,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:06,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-29 23:30:06,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:30:06,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-29 23:30:07,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:30:09,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:30:10,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:10,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:30:12,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-29 23:30:14,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:30:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:30:20,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:30:23,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:25,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:30:27,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:30:28,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:30,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:30:30,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:30:30,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-29 23:30:31,756 INFO [train.py:1039] (3/4) Epoch 15, batch 4300, loss[loss=0.1745, simple_loss=0.2511, pruned_loss=0.049, over 24611.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2583, pruned_loss=0.0561, over 4705036.55 frames. ], batch size: 60, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:30:32,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:36,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:30:38,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:30:41,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:30:47,073 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.553e+02 1.892e+02 2.099e+02 2.369e+02 3.970e+02, threshold=4.198e+02, percent-clipped=0.0 2023-09-29 23:30:50,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:30:50,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-29 23:30:51,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:30:53,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:30:55,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:30:55,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-29 23:30:58,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:31:00,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:05,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-29 23:31:05,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:31:05,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-29 23:31:08,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:31:10,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:31:14,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:31:14,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:31:14,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:31:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:16,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:31:16,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-29 23:31:18,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-29 23:31:20,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:31:22,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:22,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:31:22,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:23,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:31:23,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-29 23:31:23,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-29 23:31:23,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-29 23:31:26,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:31:26,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-29 23:31:26,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-29 23:31:32,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:33,675 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-29 23:31:35,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:31:37,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:37,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:31:39,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-29 23:31:39,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:31:39,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:41,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:31:42,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:42,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:31:44,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=524733.3333333334, ans=0.125 2023-09-29 23:31:45,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:31:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:31:48,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:31:49,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:31:55,394 INFO [train.py:1039] (3/4) Epoch 15, batch 4350, loss[loss=0.1902, simple_loss=0.259, pruned_loss=0.06075, over 23811.00 frames. ], tot_loss[loss=0.1865, simple_loss=0.2596, pruned_loss=0.05673, over 4711367.04 frames. ], batch size: 212, lr: 6.83e-03, grad_scale: 8.0 2023-09-29 23:31:55,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-29 23:31:55,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-29 23:31:55,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:31:55,864 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:31:58,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:32:03,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:04,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=524800.0, ans=0.125 2023-09-29 23:32:06,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:09,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:32:09,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:32:13,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:32:13,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=524866.6666666666, ans=0.025 2023-09-29 23:32:17,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=524866.6666666666, ans=0.125 2023-09-29 23:32:18,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:32:21,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:32:21,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:32:25,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:32:27,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:32:28,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:32:35,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-29 23:32:36,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:32:37,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:40,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:32:44,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-29 23:32:49,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:32:52,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:32:57,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-29 23:32:59,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:32:59,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:32:59,887 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-29 23:33:01,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-29 23:33:01,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:01,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:02,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:33:02,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:04,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:33:05,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:07,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-29 23:33:08,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:08,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:08,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:10,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-29 23:33:11,971 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-29 23:33:11,978 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-29 23:33:11,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-29 23:33:13,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=525066.6666666666, ans=0.0 2023-09-29 23:33:15,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:33:16,459 INFO [train.py:1039] (3/4) Epoch 15, batch 4400, loss[loss=0.1959, simple_loss=0.2615, pruned_loss=0.06517, over 23831.00 frames. ], tot_loss[loss=0.1876, simple_loss=0.2612, pruned_loss=0.05693, over 4727376.00 frames. ], batch size: 195, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:33:16,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:33:16,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:18,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:33:19,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-29 23:33:21,231 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-29 23:33:21,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:27,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:27,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:29,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:33:30,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-29 23:33:30,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-29 23:33:32,656 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.997e+02 2.183e+02 2.511e+02 3.955e+02, threshold=4.366e+02, percent-clipped=0.0 2023-09-29 23:33:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-29 23:33:32,830 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-29 23:33:34,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:33:34,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:33:35,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-29 23:33:39,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:39,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=525200.0, ans=10.0 2023-09-29 23:33:40,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:40,589 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-29 23:33:43,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:43,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-29 23:33:43,810 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-29 23:33:46,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-29 23:33:47,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-29 23:33:47,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-29 23:33:48,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:48,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:33:50,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:33:52,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.12 vs. limit=15.0 2023-09-29 23:33:53,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-29 23:33:53,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-29 23:33:54,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:56,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:33:56,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:33:58,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:33:58,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:33:58,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-29 23:34:00,502 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-29 23:34:02,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:06,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=525333.3333333334, ans=0.0 2023-09-29 23:34:11,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:34:12,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-29 23:34:16,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:34:17,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:20,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:34:20,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-29 23:34:20,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:34:20,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:34:20,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:34:22,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:34:26,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-29 23:34:29,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=525400.0, ans=0.125 2023-09-29 23:34:30,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-29 23:34:31,545 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.96 vs. limit=22.5 2023-09-29 23:34:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-29 23:34:32,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:34:32,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-29 23:34:32,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:34:36,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:34:40,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-29 23:34:41,775 INFO [train.py:1039] (3/4) Epoch 15, batch 4450, loss[loss=0.1624, simple_loss=0.2387, pruned_loss=0.04305, over 21104.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.262, pruned_loss=0.05738, over 4724376.62 frames. ], batch size: 46, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:34:43,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:34:45,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:46,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:34:47,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.49 vs. limit=5.0 2023-09-29 23:34:52,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:34:52,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:34:55,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:34:57,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=525533.3333333334, ans=0.0 2023-09-29 23:34:58,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:34:59,002 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=525533.3333333334, ans=0.0 2023-09-29 23:35:01,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:35:01,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:02,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-29 23:35:02,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:02,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:04,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:04,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:35:08,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:35:13,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:14,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:16,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:35:18,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:35:18,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:35:19,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=525600.0, ans=0.1 2023-09-29 23:35:22,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-29 23:35:25,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-29 23:35:25,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-29 23:35:25,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:35:28,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:30,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-29 23:35:34,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:35:38,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:39,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-29 23:35:39,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:39,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:39,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:35:39,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:35:42,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:35:45,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-29 23:35:45,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-29 23:35:46,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=525733.3333333334, ans=0.07 2023-09-29 23:35:47,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:35:49,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:35:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:35:53,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:35:53,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:35:58,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:36:01,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-29 23:36:02,734 INFO [train.py:1039] (3/4) Epoch 15, batch 4500, loss[loss=0.2022, simple_loss=0.2702, pruned_loss=0.06715, over 23594.00 frames. ], tot_loss[loss=0.1893, simple_loss=0.2626, pruned_loss=0.05801, over 4711361.61 frames. ], batch size: 106, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:36:02,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:36:07,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:08,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-29 23:36:08,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-29 23:36:11,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:15,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:36:15,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:36:17,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.874e+02 2.112e+02 2.381e+02 3.744e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-29 23:36:17,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:36:18,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:36:19,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:36:30,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:36:32,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:36:35,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:36:35,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:36:37,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:36:43,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:36:43,478 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:36:47,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:36:51,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:36:56,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:36:56,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-29 23:36:57,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:36:57,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:36:59,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:00,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:37:02,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:37:02,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-29 23:37:02,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:37:02,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:05,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:37:05,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:37:09,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:12,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:37:12,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:37:13,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-29 23:37:16,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-29 23:37:16,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-29 23:37:18,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-29 23:37:19,610 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.83 vs. limit=15.0 2023-09-29 23:37:23,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-29 23:37:24,418 INFO [train.py:1039] (3/4) Epoch 15, batch 4550, loss[loss=0.1757, simple_loss=0.2207, pruned_loss=0.06529, over 19265.00 frames. ], tot_loss[loss=0.1888, simple_loss=0.2618, pruned_loss=0.0579, over 4696646.40 frames. ], batch size: 388, lr: 6.82e-03, grad_scale: 16.0 2023-09-29 23:37:24,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:26,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=526133.3333333334, ans=0.125 2023-09-29 23:37:29,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:29,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:37:31,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:37,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:37:39,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:37:39,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=3.04 vs. limit=10.0 2023-09-29 23:37:40,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:37:40,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:37:40,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:37:43,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:37:43,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:37:46,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:37:50,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-29 23:37:50,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-29 23:37:51,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:37:53,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-29 23:37:58,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-29 23:37:59,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:04,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-29 23:38:07,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:38:08,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:08,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:10,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:38:11,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-29 23:38:15,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:16,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:18,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:38:18,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:19,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=526333.3333333334, ans=0.09899494936611666 2023-09-29 23:38:21,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-29 23:38:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-29 23:38:21,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:38:22,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-29 23:38:25,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-29 23:38:25,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:38:27,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:27,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:29,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:29,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:38:32,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:38:32,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-29 23:38:34,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:38:34,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:38:36,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-29 23:38:36,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:38:36,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-29 23:38:36,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=526400.0, ans=0.0 2023-09-29 23:38:39,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:38:39,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:38:42,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:38:42,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:38:42,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-29 23:38:45,655 INFO [train.py:1039] (3/4) Epoch 15, batch 4600, loss[loss=0.1766, simple_loss=0.2416, pruned_loss=0.05576, over 23360.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2604, pruned_loss=0.05766, over 4696886.45 frames. ], batch size: 285, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:38:45,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:38:47,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:38:50,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:38:51,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:38:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:38:55,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:38:55,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:38:56,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-29 23:38:58,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:38:59,841 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.932e+02 2.167e+02 2.436e+02 3.970e+02, threshold=4.334e+02, percent-clipped=0.0 2023-09-29 23:39:02,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:39:02,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=526533.3333333334, ans=0.125 2023-09-29 23:39:04,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:06,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:12,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-29 23:39:14,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:20,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:39:20,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:23,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-29 23:39:23,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:39:25,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:39:27,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=526600.0, ans=0.125 2023-09-29 23:39:31,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:33,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:39:33,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:39:38,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-29 23:39:39,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:39:44,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:45,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:39:48,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-29 23:39:49,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:39:49,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-29 23:39:49,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:51,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:51,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:39:51,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=526733.3333333334, ans=0.2 2023-09-29 23:39:52,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:39:52,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:39:54,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-29 23:39:54,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-29 23:39:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-29 23:39:54,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:56,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:39:57,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:39:57,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:40:09,564 INFO [train.py:1039] (3/4) Epoch 15, batch 4650, loss[loss=0.1655, simple_loss=0.2372, pruned_loss=0.04692, over 18418.00 frames. ], tot_loss[loss=0.1869, simple_loss=0.2592, pruned_loss=0.0573, over 4691263.19 frames. ], batch size: 40, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:40:11,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:40:12,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:15,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:15,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:40:15,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:40:15,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:40:20,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-29 23:40:23,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:40:24,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-29 23:40:24,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:40:26,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-29 23:40:26,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:40:27,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-29 23:40:27,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-29 23:40:27,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:40:32,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:40:33,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:33,759 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-29 23:40:34,356 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.02 vs. limit=6.0 2023-09-29 23:40:36,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:40:36,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-29 23:40:40,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:40:40,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:40:42,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-29 23:40:45,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:40:47,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:40:52,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:40:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:00,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:00,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:41:03,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-29 23:41:04,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-29 23:41:06,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-29 23:41:06,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-29 23:41:07,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:15,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:41:15,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:15,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-29 23:41:16,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:17,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:41:21,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:41:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:41:23,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:41:25,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:41:28,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:28,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:41:30,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:41:30,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-29 23:41:30,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-29 23:41:32,036 INFO [train.py:1039] (3/4) Epoch 15, batch 4700, loss[loss=0.1715, simple_loss=0.2516, pruned_loss=0.04571, over 24300.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.26, pruned_loss=0.05713, over 4700239.49 frames. ], batch size: 61, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:41:32,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=527133.3333333334, ans=0.0 2023-09-29 23:41:33,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-29 23:41:40,565 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.78 vs. limit=22.5 2023-09-29 23:41:41,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:41:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:41:44,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:41:44,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-29 23:41:48,320 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 1.872e+02 2.063e+02 2.349e+02 3.516e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-29 23:41:50,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-29 23:41:51,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-29 23:41:53,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:41:55,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:41:55,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:41:59,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:06,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:42:07,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-29 23:42:08,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=527266.6666666666, ans=0.125 2023-09-29 23:42:09,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:42:14,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=527266.6666666666, ans=0.07 2023-09-29 23:42:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-29 23:42:15,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:42:17,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=527266.6666666666, ans=0.0 2023-09-29 23:42:18,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:22,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-29 23:42:23,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:42:29,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:42:30,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-29 23:42:32,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:32,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:34,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:42:36,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:42:36,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-29 23:42:37,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-29 23:42:39,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:42:40,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:40,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-29 23:42:41,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:42:44,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-29 23:42:47,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:42:48,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,239 INFO [train.py:1039] (3/4) Epoch 15, batch 4750, loss[loss=0.1735, simple_loss=0.2437, pruned_loss=0.05161, over 21870.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2611, pruned_loss=0.05746, over 4700705.70 frames. ], batch size: 48, lr: 6.81e-03, grad_scale: 8.0 2023-09-29 23:42:53,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:42:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:42:57,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-29 23:42:58,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:03,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-29 23:43:06,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:43:06,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:43:08,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:14,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-29 23:43:18,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:43:19,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.93 vs. limit=15.0 2023-09-29 23:43:21,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-29 23:43:22,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:43:22,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=527533.3333333334, ans=0.125 2023-09-29 23:43:24,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:24,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:43:25,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:25,885 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-29 23:43:25,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-29 23:43:33,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-29 23:43:33,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=527600.0, ans=0.125 2023-09-29 23:43:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:38,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:43:42,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:43:42,125 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-29 23:43:42,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:43:42,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=527666.6666666666, ans=0.125 2023-09-29 23:43:45,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:43:48,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:43:50,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-29 23:43:50,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-29 23:43:50,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:43:52,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:43:52,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:43:53,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:43:53,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-29 23:43:56,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-29 23:43:59,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:02,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:44:02,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-29 23:44:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:04,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:04,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=527733.3333333334, ans=0.1 2023-09-29 23:44:05,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:44:07,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:07,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-29 23:44:12,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:12,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-29 23:44:14,006 INFO [train.py:1039] (3/4) Epoch 15, batch 4800, loss[loss=0.2041, simple_loss=0.2647, pruned_loss=0.07179, over 23852.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2616, pruned_loss=0.05745, over 4717557.23 frames. ], batch size: 195, lr: 6.81e-03, grad_scale: 16.0 2023-09-29 23:44:14,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-29 23:44:15,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-29 23:44:18,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:44:19,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:44:19,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-29 23:44:27,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:27,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:30,101 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.901e+02 2.248e+02 2.763e+02 5.522e+02, threshold=4.496e+02, percent-clipped=3.0 2023-09-29 23:44:31,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:44:32,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=527866.6666666666, ans=0.125 2023-09-29 23:44:33,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:33,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:44:34,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-29 23:44:34,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:44:34,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:44:37,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:44:41,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:44:42,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:44:44,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:44,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-29 23:44:44,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:46,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:44:49,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:44:53,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:44:55,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:44:56,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-29 23:44:58,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:01,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-29 23:45:01,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-29 23:45:03,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:03,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:45:03,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:45:03,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:03,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:45:06,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:45:06,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:45:09,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:11,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:12,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:16,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=528000.0, ans=0.125 2023-09-29 23:45:17,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-29 23:45:17,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:17,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:18,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:45:19,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:24,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:45:24,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:45:24,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:26,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:45:26,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:45:28,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:45:32,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:32,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:32,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:45:34,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=528133.3333333334, ans=0.125 2023-09-29 23:45:36,015 INFO [train.py:1039] (3/4) Epoch 15, batch 4850, loss[loss=0.1794, simple_loss=0.2635, pruned_loss=0.04767, over 24064.00 frames. ], tot_loss[loss=0.1891, simple_loss=0.2622, pruned_loss=0.05802, over 4714400.72 frames. ], batch size: 80, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:45:36,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-29 23:45:37,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-29 23:45:37,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:37,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:45:40,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:45:40,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:45:42,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:45:48,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-29 23:45:52,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:45:52,885 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.28 vs. limit=15.0 2023-09-29 23:45:57,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:45:57,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=528200.0, ans=0.0 2023-09-29 23:45:58,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-29 23:45:58,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:02,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:46:02,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:46:04,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:46:04,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-29 23:46:09,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:46:11,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:46:11,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-29 23:46:12,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-29 23:46:12,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-29 23:46:13,328 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.71 vs. limit=15.0 2023-09-29 23:46:15,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:46:15,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:19,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:19,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-29 23:46:19,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-29 23:46:20,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:46:28,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:46:29,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-29 23:46:31,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:46:31,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:46:32,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:46:35,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-29 23:46:35,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:35,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-29 23:46:35,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:46:37,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:46:38,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-29 23:46:39,071 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=528333.3333333334, ans=0.0 2023-09-29 23:46:48,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:46:54,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:46:54,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:46:58,247 INFO [train.py:1039] (3/4) Epoch 15, batch 4900, loss[loss=0.2086, simple_loss=0.2878, pruned_loss=0.0647, over 24053.00 frames. ], tot_loss[loss=0.1885, simple_loss=0.2617, pruned_loss=0.05764, over 4721141.11 frames. ], batch size: 80, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:46:58,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-29 23:46:58,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:47:05,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:07,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:07,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:47:10,649 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=528466.6666666666, ans=0.0 2023-09-29 23:47:11,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-29 23:47:14,803 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.640e+02 1.963e+02 2.199e+02 2.506e+02 3.437e+02, threshold=4.398e+02, percent-clipped=0.0 2023-09-29 23:47:16,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-29 23:47:19,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.41 vs. limit=22.5 2023-09-29 23:47:21,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-29 23:47:23,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-29 23:47:23,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:23,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:47:23,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:47:23,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:23,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:47:24,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-29 23:47:28,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-29 23:47:28,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:47:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:47:30,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:47:33,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:47:33,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:33,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:33,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-29 23:47:37,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:47:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:47:39,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-29 23:47:39,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-29 23:47:44,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-29 23:47:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:47:47,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:47:47,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:47:47,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:47:47,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-29 23:47:49,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:47:49,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-29 23:47:51,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:47:54,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:47:55,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:47:55,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=528666.6666666666, ans=0.2 2023-09-29 23:48:00,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-29 23:48:02,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:48:02,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-29 23:48:03,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-29 23:48:10,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:12,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:14,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-29 23:48:14,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:14,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:48:17,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:20,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:20,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:48:20,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:48:20,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-29 23:48:21,876 INFO [train.py:1039] (3/4) Epoch 15, batch 4950, loss[loss=0.1593, simple_loss=0.2385, pruned_loss=0.04001, over 24593.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2601, pruned_loss=0.05711, over 4723466.29 frames. ], batch size: 60, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:48:22,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:48:25,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:25,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-29 23:48:28,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-29 23:48:28,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-29 23:48:30,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:48:30,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-29 23:48:31,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:31,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:48:31,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-29 23:48:31,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:35,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:48:35,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:48:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:48:38,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:48:41,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:41,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:48:46,292 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.75 vs. limit=10.0 2023-09-29 23:48:46,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-29 23:48:52,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:52,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=528866.6666666666, ans=0.1 2023-09-29 23:48:53,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:48:55,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:48:55,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:48:55,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=528933.3333333334, ans=0.125 2023-09-29 23:48:56,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:48:57,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-29 23:48:58,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-29 23:49:01,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:03,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:49:03,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:49:05,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:05,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:49:07,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-29 23:49:08,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:10,912 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.25 vs. limit=15.0 2023-09-29 23:49:11,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:49:14,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:49:15,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:16,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:16,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-29 23:49:18,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:49:19,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-29 23:49:25,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:49:26,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:49:26,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:49:26,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:28,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:49:28,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:49:31,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:49:31,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:49:31,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:49:32,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-29 23:49:34,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:49:41,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-29 23:49:41,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-29 23:49:44,497 INFO [train.py:1039] (3/4) Epoch 15, batch 5000, loss[loss=0.1601, simple_loss=0.2356, pruned_loss=0.04237, over 24415.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2593, pruned_loss=0.05693, over 4707768.78 frames. ], batch size: 58, lr: 6.80e-03, grad_scale: 16.0 2023-09-29 23:49:48,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:49:48,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:49:49,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-29 23:49:51,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-29 23:49:51,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:49:55,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-29 23:49:55,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-29 23:49:55,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-29 23:49:55,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=529133.3333333334, ans=0.125 2023-09-29 23:49:57,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-29 23:49:57,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:49:58,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:49:59,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-29 23:49:59,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:01,014 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.874e+02 2.133e+02 2.483e+02 3.662e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-29 23:50:01,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:02,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-29 23:50:02,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-29 23:50:04,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:50:04,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-29 23:50:04,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:50:04,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:05,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:50:05,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-29 23:50:05,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-29 23:50:06,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=529200.0, ans=0.125 2023-09-29 23:50:07,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-29 23:50:07,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:07,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:08,333 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.32 vs. limit=15.0 2023-09-29 23:50:09,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-29 23:50:09,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:50:13,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:14,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:50:16,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-29 23:50:17,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-29 23:50:17,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:50:21,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:50:21,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=529266.6666666666, ans=0.125 2023-09-29 23:50:26,026 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-29 23:50:29,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-29 23:50:31,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:50:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:50:33,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-29 23:50:33,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:50:33,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:50:35,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:50:36,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-29 23:50:38,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-29 23:50:42,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:50:47,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-29 23:50:54,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:50:56,952 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.08 vs. limit=8.0 2023-09-29 23:50:57,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=529400.0, ans=0.125 2023-09-29 23:50:57,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=529400.0, ans=0.0 2023-09-29 23:50:57,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=529400.0, ans=0.2 2023-09-29 23:51:02,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:03,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:03,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:51:03,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:03,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:51:03,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-29 23:51:05,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:07,340 INFO [train.py:1039] (3/4) Epoch 15, batch 5050, loss[loss=0.1895, simple_loss=0.2779, pruned_loss=0.05056, over 24570.00 frames. ], tot_loss[loss=0.188, simple_loss=0.2607, pruned_loss=0.05768, over 4700499.56 frames. ], batch size: 71, lr: 6.80e-03, grad_scale: 8.0 2023-09-29 23:51:11,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:51:11,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-29 23:51:12,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:51:16,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:51:16,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=529466.6666666666, ans=0.0 2023-09-29 23:51:17,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:51:17,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-29 23:51:19,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:19,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:51:21,486 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.10 vs. limit=15.0 2023-09-29 23:51:22,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-29 23:51:24,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-29 23:51:24,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:51:34,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-29 23:51:34,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-29 23:51:36,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:36,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-29 23:51:36,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:51:37,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:39,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:51:39,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:51:39,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-29 23:51:40,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-29 23:51:40,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:44,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:51:47,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:51:47,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-29 23:51:49,587 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:51:50,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:51:52,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-29 23:51:56,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:51:56,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:51:56,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:51:56,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:51:59,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:01,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:52:02,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:02,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:52:02,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:52:03,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-29 23:52:04,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-29 23:52:06,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-29 23:52:08,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.70 vs. limit=15.0 2023-09-29 23:52:10,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:52:10,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-29 23:52:10,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:52:11,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:12,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:12,594 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-29 23:52:13,096 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=529733.3333333334, ans=0.2 2023-09-29 23:52:14,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:14,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-29 23:52:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:19,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:19,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-29 23:52:21,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-29 23:52:24,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:24,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:24,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-29 23:52:27,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-29 23:52:29,659 INFO [train.py:1039] (3/4) Epoch 15, batch 5100, loss[loss=0.1883, simple_loss=0.253, pruned_loss=0.06179, over 22753.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2609, pruned_loss=0.05729, over 4717084.22 frames. ], batch size: 322, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:52:32,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:52:35,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-29 23:52:35,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-29 23:52:38,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:39,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:52:41,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:52:42,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-29 23:52:42,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-29 23:52:44,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=529866.6666666666, ans=0.1 2023-09-29 23:52:46,429 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-29 23:52:47,405 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.887e+02 2.076e+02 2.309e+02 3.546e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-29 23:52:47,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:52:47,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:52:48,785 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=11.35 vs. limit=15.0 2023-09-29 23:52:52,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:52:54,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-29 23:52:55,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:52:57,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:52:57,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-29 23:53:00,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:00,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:02,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-29 23:53:06,414 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-29 23:53:07,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-29 23:53:07,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-29 23:53:10,363 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=8.71 vs. limit=15.0 2023-09-29 23:53:13,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:53:23,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:53:26,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-29 23:53:27,511 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-29 23:53:27,528 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-29 23:53:28,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.94 vs. limit=15.0 2023-09-29 23:53:29,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-29 23:53:29,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:53:32,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-29 23:53:35,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-29 23:53:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-29 23:53:39,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-29 23:53:40,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-29 23:53:42,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:53:42,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-29 23:53:44,367 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.49 vs. limit=15.0 2023-09-29 23:53:49,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:53:49,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:53:49,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:53:49,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=530066.6666666666, ans=0.1 2023-09-29 23:53:50,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:53:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-29 23:53:52,310 INFO [train.py:1039] (3/4) Epoch 15, batch 5150, loss[loss=0.1978, simple_loss=0.2617, pruned_loss=0.06697, over 23823.00 frames. ], tot_loss[loss=0.1887, simple_loss=0.2621, pruned_loss=0.05758, over 4710514.74 frames. ], batch size: 164, lr: 6.79e-03, grad_scale: 8.0 2023-09-29 23:53:52,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:53:53,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-29 23:53:53,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-29 23:53:54,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=530133.3333333334, ans=0.1 2023-09-29 23:53:55,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-29 23:53:55,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-29 23:53:55,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-29 23:53:55,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=530133.3333333334, ans=0.0 2023-09-29 23:53:58,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:00,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-29 23:54:01,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:03,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-29 23:54:06,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-29 23:54:06,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=530200.0, ans=0.125 2023-09-29 23:54:08,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:09,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-29 23:54:11,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-29 23:54:11,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:11,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:12,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:54:12,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:54:13,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-29 23:54:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:54:16,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:18,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-29 23:54:19,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-29 23:54:19,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-29 23:54:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-29 23:54:27,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-29 23:54:31,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:54:37,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:54:37,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:54:42,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=530333.3333333334, ans=0.125 2023-09-29 23:54:43,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:43,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:43,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=530333.3333333334, ans=0.0 2023-09-29 23:54:45,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-29 23:54:50,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:54:51,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-29 23:54:51,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-29 23:54:55,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:54:56,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:54:58,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-29 23:55:05,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:07,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-29 23:55:10,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:55:10,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:55:11,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-29 23:55:11,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-29 23:55:11,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-29 23:55:12,011 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=530400.0, ans=0.1 2023-09-29 23:55:13,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:55:14,501 INFO [train.py:1039] (3/4) Epoch 15, batch 5200, loss[loss=0.1837, simple_loss=0.2558, pruned_loss=0.0558, over 22000.00 frames. ], tot_loss[loss=0.1886, simple_loss=0.262, pruned_loss=0.05759, over 4716711.15 frames. ], batch size: 48, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:55:16,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:55:17,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:55:19,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:24,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-29 23:55:26,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:55:27,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:31,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:31,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-29 23:55:32,511 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.846e+02 2.066e+02 2.366e+02 4.637e+02, threshold=4.132e+02, percent-clipped=1.0 2023-09-29 23:55:32,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:34,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-29 23:55:36,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-29 23:55:36,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:38,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-29 23:55:42,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-29 23:55:44,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-29 23:55:44,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-29 23:55:44,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-29 23:55:45,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.71 vs. limit=15.0 2023-09-29 23:55:47,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-29 23:55:48,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=530600.0, ans=0.125 2023-09-29 23:55:49,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:55:49,509 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-29 23:55:49,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:55:51,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:55:51,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:55:52,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-29 23:55:52,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=530600.0, ans=0.0 2023-09-29 23:55:53,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:55:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:55:59,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-29 23:55:59,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-29 23:55:59,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-29 23:56:07,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-29 23:56:09,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-29 23:56:14,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:56:14,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:15,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-29 23:56:15,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:56:16,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-29 23:56:16,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:17,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:56:21,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:22,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-29 23:56:25,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:56:27,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:27,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:30,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:32,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-29 23:56:33,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-29 23:56:33,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:56:35,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:56:36,905 INFO [train.py:1039] (3/4) Epoch 15, batch 5250, loss[loss=0.1865, simple_loss=0.2727, pruned_loss=0.05015, over 24297.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.261, pruned_loss=0.05721, over 4707626.13 frames. ], batch size: 74, lr: 6.79e-03, grad_scale: 16.0 2023-09-29 23:56:37,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-29 23:56:37,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-29 23:56:40,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=530800.0, ans=0.0 2023-09-29 23:56:42,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:56:44,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:56:44,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=530800.0, ans=0.125 2023-09-29 23:56:45,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:56:47,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:56:52,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:56:54,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=530866.6666666666, ans=0.2 2023-09-29 23:56:55,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-29 23:56:57,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:56:58,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-29 23:57:00,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=530866.6666666666, ans=0.1 2023-09-29 23:57:01,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-29 23:57:01,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:57:03,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:57:13,679 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=530933.3333333334, ans=0.2 2023-09-29 23:57:46,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=531066.6666666666, ans=0.1 2023-09-29 23:57:52,127 INFO [train.py:1039] (3/4) Epoch 15, batch 5300, loss[loss=0.1743, simple_loss=0.2555, pruned_loss=0.04649, over 24455.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2599, pruned_loss=0.05729, over 4707017.69 frames. ], batch size: 63, lr: 6.78e-03, grad_scale: 16.0 2023-09-29 23:57:52,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=531133.3333333334, ans=0.1 2023-09-29 23:58:06,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.927e+02 2.161e+02 2.637e+02 4.366e+02, threshold=4.323e+02, percent-clipped=1.0 2023-09-29 23:58:06,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-29 23:58:07,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-29 23:58:07,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-29 23:58:07,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:07,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:07,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:07,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:07,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:07,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:07,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-29 23:58:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:58:08,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-29 23:58:08,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-29 23:58:08,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-29 23:58:08,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-29 23:58:09,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-29 23:58:09,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-29 23:58:09,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:10,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:10,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:10,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:10,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-29 23:58:11,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:11,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:58:11,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:11,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:58:11,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-29 23:58:11,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-29 23:58:11,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:11,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-29 23:58:12,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-29 23:58:12,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-29 23:58:13,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-29 23:58:13,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-29 23:58:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-29 23:58:13,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-29 23:58:13,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:13,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-29 23:58:14,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-29 23:58:14,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:14,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-29 23:58:15,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-29 23:58:15,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-29 23:58:15,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-29 23:58:15,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-29 23:58:15,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:58:15,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-29 23:58:15,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-29 23:58:16,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-29 23:58:16,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-29 23:58:24,977 INFO [train.py:1039] (3/4) Epoch 16, batch 0, loss[loss=0.1904, simple_loss=0.2657, pruned_loss=0.0575, over 24639.00 frames. ], tot_loss[loss=0.1904, simple_loss=0.2657, pruned_loss=0.0575, over 24639.00 frames. ], batch size: 65, lr: 6.57e-03, grad_scale: 32.0 2023-09-29 23:58:24,977 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-29 23:58:41,245 INFO [train.py:1071] (3/4) Epoch 16, validation: loss=0.3148, simple_loss=0.2815, pruned_loss=0.174, over 1125622.00 frames. 2023-09-29 23:58:41,246 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-29 23:58:41,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-29 23:58:42,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-29 23:58:44,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-29 23:58:50,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=531213.3333333334, ans=0.04949747468305833 2023-09-29 23:58:51,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:58:53,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-29 23:58:53,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:53,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=531213.3333333334, ans=0.0 2023-09-29 23:58:54,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-29 23:58:57,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-29 23:58:57,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:59,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:58:59,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=531280.0, ans=0.125 2023-09-29 23:59:02,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-29 23:59:02,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:02,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-29 23:59:04,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:04,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-29 23:59:08,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-29 23:59:15,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-29 23:59:15,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-29 23:59:21,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-29 23:59:21,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-29 23:59:22,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:27,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-29 23:59:32,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-29 23:59:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-29 23:59:40,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=531413.3333333334, ans=0.125 2023-09-29 23:59:42,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-29 23:59:44,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-29 23:59:44,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:44,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-29 23:59:45,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-29 23:59:47,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-29 23:59:50,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:50,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-29 23:59:50,834 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.09 vs. limit=15.0 2023-09-29 23:59:55,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-29 23:59:57,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=531480.0, ans=0.2 2023-09-29 23:59:59,034 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 00:00:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:00:03,436 INFO [train.py:1039] (3/4) Epoch 16, batch 50, loss[loss=0.2021, simple_loss=0.2679, pruned_loss=0.06809, over 23483.00 frames. ], tot_loss[loss=0.1909, simple_loss=0.2655, pruned_loss=0.05819, over 1074148.38 frames. ], batch size: 119, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:00:03,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:05,528 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=531546.6666666666, ans=0.125 2023-09-30 00:00:06,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:06,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 00:00:08,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:00:09,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:00:11,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:14,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:00:15,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:00:19,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 00:00:19,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:25,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:00:27,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 00:00:29,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 00:00:31,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:00:34,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:00:34,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:34,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:00:36,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:00:37,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:00:37,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:00:44,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:00:47,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:00:47,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:00:48,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 00:00:49,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=531680.0, ans=0.1 2023-09-30 00:00:50,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:00:51,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:00:51,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 00:00:54,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:00:55,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 00:01:03,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:03,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:01:04,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:08,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:08,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:13,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 00:01:13,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 00:01:14,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:14,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:01:17,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:01:17,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:01:19,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 00:01:19,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 00:01:22,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 00:01:23,585 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.948e+02 2.203e+02 2.562e+02 3.872e+02, threshold=4.407e+02, percent-clipped=0.0 2023-09-30 00:01:23,647 INFO [train.py:1039] (3/4) Epoch 16, batch 100, loss[loss=0.1721, simple_loss=0.254, pruned_loss=0.0451, over 24283.00 frames. ], tot_loss[loss=0.1882, simple_loss=0.2631, pruned_loss=0.05662, over 1879439.13 frames. ], batch size: 61, lr: 6.56e-03, grad_scale: 16.0 2023-09-30 00:01:23,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:23,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:01:25,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 00:01:25,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 00:01:25,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:27,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:28,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:01:28,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:01:33,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:01:35,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:01:38,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:38,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 00:01:38,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:01:43,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:01:43,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:43,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:01:43,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:01:45,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:01:45,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 00:01:49,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:01:49,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:49,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:49,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:01:52,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 00:01:54,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:01:56,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:01:57,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:01:59,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:01:59,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532013.3333333334, ans=0.1 2023-09-30 00:02:02,168 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 00:02:02,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 00:02:04,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:04,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:02:08,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:02:10,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:02:11,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:12,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.14 vs. limit=12.0 2023-09-30 00:02:15,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=532080.0, ans=0.125 2023-09-30 00:02:18,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:18,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 00:02:22,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:02:25,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:02:27,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:02:28,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:33,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:34,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:36,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:02:40,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:40,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:41,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:41,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:02:41,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:02:43,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 00:02:43,370 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 00:02:43,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:02:43,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:02:43,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:43,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:43,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 00:02:45,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:02:45,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:02:45,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:46,470 INFO [train.py:1039] (3/4) Epoch 16, batch 150, loss[loss=0.1736, simple_loss=0.243, pruned_loss=0.05206, over 23592.00 frames. ], tot_loss[loss=0.1884, simple_loss=0.2628, pruned_loss=0.05703, over 2515677.64 frames. ], batch size: 149, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:02:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:02:48,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:48,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:02:49,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:02:51,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:02:54,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:02:54,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:02:56,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:02:59,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:00,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:03,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:03:04,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:07,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 00:03:07,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 00:03:07,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 00:03:10,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:03:10,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:03:11,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:03:12,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:03:12,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:13,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:13,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:03:14,672 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 00:03:17,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:03:22,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:26,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:03:28,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 00:03:31,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:03:31,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:03:31,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:03:33,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:03:36,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:03:37,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:03:38,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:38,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=532413.3333333334, ans=0.07 2023-09-30 00:03:39,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 00:03:44,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:46,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:03:46,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:03:46,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:03:49,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:03:52,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 00:03:55,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:03:55,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532480.0, ans=0.1 2023-09-30 00:03:58,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:04:00,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:03,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:04:04,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 00:04:05,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:04:05,045 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 00:04:08,510 INFO [train.py:1039] (3/4) Epoch 16, batch 200, loss[loss=0.1942, simple_loss=0.265, pruned_loss=0.06169, over 23448.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2634, pruned_loss=0.05791, over 3001179.20 frames. ], batch size: 134, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:04:10,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.410e+02 1.995e+02 2.387e+02 2.784e+02 4.621e+02, threshold=4.773e+02, percent-clipped=1.0 2023-09-30 00:04:10,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:12,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:04:13,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:04:17,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 00:04:18,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:18,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:22,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 00:04:22,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=532546.6666666666, ans=0.1 2023-09-30 00:04:23,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:04:23,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:28,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:04:28,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:04:30,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:04:35,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.10 vs. limit=12.0 2023-09-30 00:04:49,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:04:49,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:04:50,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:04:52,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:04:52,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:04:52,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:04:55,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:04:57,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:04:59,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:04:59,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:00,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 00:05:02,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:05:02,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:06,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:05:07,748 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.73 vs. limit=15.0 2023-09-30 00:05:10,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.whiten.whitening_limit, batch_count=532746.6666666666, ans=12.0 2023-09-30 00:05:12,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:05:18,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:18,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:05:27,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:30,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 00:05:32,398 INFO [train.py:1039] (3/4) Epoch 16, batch 250, loss[loss=0.194, simple_loss=0.2775, pruned_loss=0.05521, over 23992.00 frames. ], tot_loss[loss=0.1896, simple_loss=0.2635, pruned_loss=0.05788, over 3389318.16 frames. ], batch size: 86, lr: 6.56e-03, grad_scale: 8.0 2023-09-30 00:05:32,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:32,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:05:32,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:05:32,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=532880.0, ans=0.1 2023-09-30 00:05:33,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:05:34,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 00:05:34,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:05:34,280 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 00:05:37,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:38,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:05:39,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=532880.0, ans=0.2 2023-09-30 00:05:40,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:41,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:05:44,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:05:45,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:05:47,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:05:49,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=9.81 vs. limit=15.0 2023-09-30 00:05:50,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:05:50,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=532946.6666666666, ans=0.2 2023-09-30 00:05:57,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=532946.6666666666, ans=0.0 2023-09-30 00:06:03,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:05,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:05,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:06:07,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=533013.3333333334, ans=0.125 2023-09-30 00:06:12,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:06:12,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:06:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:06:15,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:06:15,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:06:16,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:06:19,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:06:20,286 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:06:22,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 00:06:23,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:06:25,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:06:25,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:06:25,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:06:25,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:27,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:06:27,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:06:30,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:06:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:37,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.23 vs. limit=15.0 2023-09-30 00:06:38,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:06:40,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:06:42,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=533146.6666666666, ans=0.2 2023-09-30 00:06:44,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:06:46,077 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=533146.6666666666, ans=0.0 2023-09-30 00:06:48,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:06:50,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:06:53,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=533213.3333333334, ans=0.0 2023-09-30 00:06:53,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=533213.3333333334, ans=0.2 2023-09-30 00:06:54,774 INFO [train.py:1039] (3/4) Epoch 16, batch 300, loss[loss=0.1701, simple_loss=0.2297, pruned_loss=0.05519, over 22603.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2605, pruned_loss=0.05704, over 3670689.48 frames. ], batch size: 322, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:06:54,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 00:06:55,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:06:55,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:06:56,944 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.879e+02 2.129e+02 2.398e+02 3.317e+02, threshold=4.257e+02, percent-clipped=0.0 2023-09-30 00:06:57,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 00:06:58,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:06:58,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=533213.3333333334, ans=0.0 2023-09-30 00:07:00,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:07:00,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 00:07:05,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:07:05,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=533213.3333333334, ans=0.1 2023-09-30 00:07:07,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:10,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:07:10,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 00:07:12,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:07:14,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:07:14,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 00:07:15,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:18,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:07:19,425 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:07:23,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:07:23,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 00:07:23,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=533280.0, ans=0.125 2023-09-30 00:07:30,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 00:07:30,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:34,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:07:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 00:07:35,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:07:37,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:07:39,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:07:41,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:07:47,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:07:47,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 00:07:48,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:07:52,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:07:53,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 00:07:54,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:07:59,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:02,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:08:02,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 00:08:06,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:06,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:08:08,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:08:11,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 00:08:11,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:08:13,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:15,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 00:08:17,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:08:18,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:20,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:20,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:20,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:22,265 INFO [train.py:1039] (3/4) Epoch 16, batch 350, loss[loss=0.1902, simple_loss=0.2504, pruned_loss=0.06502, over 22874.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2583, pruned_loss=0.05614, over 3893788.03 frames. ], batch size: 322, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:08:25,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=533546.6666666666, ans=0.1 2023-09-30 00:08:26,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:26,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 00:08:28,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:34,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:08:37,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:08:39,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:39,734 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.81 vs. limit=15.0 2023-09-30 00:08:40,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 00:08:42,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:08:42,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 00:08:46,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:08:46,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 00:08:48,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:51,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 00:08:53,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:08:55,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:08:57,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:08:58,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:08:58,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:00,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:00,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:01,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:09:03,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:03,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:09,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:09,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:09:10,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:09:10,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:17,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 00:09:17,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:09:22,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:09:22,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:22,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:09:24,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 00:09:26,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:27,589 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 00:09:28,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 00:09:28,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:31,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:09:31,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 00:09:34,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:37,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:09:39,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:40,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:40,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:42,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:09:43,382 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=4.73 vs. limit=12.0 2023-09-30 00:09:45,709 INFO [train.py:1039] (3/4) Epoch 16, batch 400, loss[loss=0.206, simple_loss=0.2839, pruned_loss=0.06407, over 23238.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2576, pruned_loss=0.05588, over 4066653.48 frames. ], batch size: 93, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:09:45,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:09:47,266 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.835e+02 1.997e+02 2.324e+02 4.354e+02, threshold=3.993e+02, percent-clipped=1.0 2023-09-30 00:09:47,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:09:48,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 00:09:48,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:09:49,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:09:51,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:09:52,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:56,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:09:57,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:09:59,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 00:10:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 00:10:01,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:03,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 00:10:03,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:08,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:10:08,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:08,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 00:10:10,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:10:10,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:10:10,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:11,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:10:13,164 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 00:10:14,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 00:10:19,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:10:20,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:22,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 00:10:22,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=534013.3333333334, ans=0.125 2023-09-30 00:10:23,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 00:10:27,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:10:31,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:38,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 00:10:44,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:10:44,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 00:10:45,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:10:47,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:10:47,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 00:10:52,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:10:52,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=534146.6666666666, ans=0.125 2023-09-30 00:10:55,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:10:56,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:10:58,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:10:58,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 00:10:58,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=534146.6666666666, ans=0.125 2023-09-30 00:11:00,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:11:00,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=534146.6666666666, ans=0.125 2023-09-30 00:11:01,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 00:11:03,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:11:05,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:11:06,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 00:11:08,092 INFO [train.py:1039] (3/4) Epoch 16, batch 450, loss[loss=0.1866, simple_loss=0.2539, pruned_loss=0.0596, over 23714.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2579, pruned_loss=0.05555, over 4214416.60 frames. ], batch size: 135, lr: 6.55e-03, grad_scale: 16.0 2023-09-30 00:11:09,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:11:09,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:11:09,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:11:10,433 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.78 vs. limit=15.0 2023-09-30 00:11:12,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 00:11:12,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:11:14,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:11:14,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:11:15,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 00:11:15,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:11:15,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=534213.3333333334, ans=0.0 2023-09-30 00:11:15,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=534213.3333333334, ans=0.0 2023-09-30 00:11:16,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:11:19,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:11:29,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:29,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:11:30,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 00:11:30,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 00:11:36,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:11:37,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:40,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:43,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:11:45,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 00:11:47,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 00:11:49,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 00:11:49,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:11:51,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:11:52,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:11:54,562 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 00:11:54,581 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 00:11:54,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:11:56,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:11:57,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:12:00,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:12:00,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:12:00,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:12:02,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 00:12:04,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.60 vs. limit=15.0 2023-09-30 00:12:05,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:07,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:12:07,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:12:09,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 00:12:14,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:12:14,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=534480.0, ans=0.1 2023-09-30 00:12:15,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 00:12:15,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 00:12:17,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:12:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:12:26,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:27,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:12:27,716 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 00:12:30,616 INFO [train.py:1039] (3/4) Epoch 16, batch 500, loss[loss=0.1981, simple_loss=0.288, pruned_loss=0.05411, over 24331.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2593, pruned_loss=0.05587, over 4330985.92 frames. ], batch size: 74, lr: 6.55e-03, grad_scale: 8.0 2023-09-30 00:12:32,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:33,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:12:35,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.824e+02 2.052e+02 2.354e+02 3.367e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 00:12:35,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:35,176 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 00:12:36,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 00:12:36,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:12:39,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:12:44,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 00:12:44,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:12:48,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:12:48,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:12:49,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:03,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:03,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:13:03,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:13:03,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:04,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 00:13:04,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:13:05,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=534680.0, ans=0.0 2023-09-30 00:13:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:13:08,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:13:08,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:13:08,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:13:09,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 00:13:12,785 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 00:13:14,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:14,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:17,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:13:20,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 00:13:24,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:13:25,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:35,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:13:40,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=534813.3333333334, ans=0.1 2023-09-30 00:13:41,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:44,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 00:13:44,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:44,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:13:47,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 00:13:47,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:13:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:13:51,996 INFO [train.py:1039] (3/4) Epoch 16, batch 550, loss[loss=0.1923, simple_loss=0.2751, pruned_loss=0.05475, over 24403.00 frames. ], tot_loss[loss=0.1877, simple_loss=0.2609, pruned_loss=0.05723, over 4411759.30 frames. ], batch size: 77, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:13:55,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 00:13:57,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 00:13:57,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:57,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 00:13:59,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:13:59,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:13:59,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:01,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:14:01,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:14:04,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:14:05,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 00:14:06,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:14:11,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:11,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:14,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:15,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:20,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 00:14:20,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 00:14:22,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=534946.6666666666, ans=0.125 2023-09-30 00:14:23,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:14:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:14:30,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:31,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:14:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:34,905 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 00:14:35,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:14:36,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:14:39,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:14:39,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:14:39,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:14:40,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:42,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 00:14:43,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 00:14:45,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:14:45,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:14:45,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:14:45,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:14:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:14:51,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:14:53,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=535080.0, ans=0.125 2023-09-30 00:14:54,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:14:55,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:14:56,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 00:14:58,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:15:00,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:02,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:15:03,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:05,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:15:05,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:15:12,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 00:15:14,319 INFO [train.py:1039] (3/4) Epoch 16, batch 600, loss[loss=0.1878, simple_loss=0.2712, pruned_loss=0.05226, over 24661.00 frames. ], tot_loss[loss=0.1872, simple_loss=0.2614, pruned_loss=0.05649, over 4501559.77 frames. ], batch size: 68, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:15:15,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 00:15:16,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:15:17,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=535213.3333333334, ans=0.0 2023-09-30 00:15:18,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:15:18,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:15:18,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=535213.3333333334, ans=0.125 2023-09-30 00:15:19,621 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.903e+02 2.137e+02 2.465e+02 5.407e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 00:15:26,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:15:28,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:15:29,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 00:15:31,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:15:34,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:15:37,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:38,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 00:15:40,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:15:43,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 00:15:45,286 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=535280.0, ans=0.2 2023-09-30 00:15:49,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:15:49,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:15:50,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:15:56,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:15:56,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:15:57,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:01,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=535346.6666666666, ans=0.0 2023-09-30 00:16:04,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:16:08,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:08,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:16:08,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:16:16,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 00:16:18,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=535413.3333333334, ans=0.0 2023-09-30 00:16:21,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:16:21,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:16:23,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=535480.0, ans=0.0 2023-09-30 00:16:26,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 00:16:26,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:16:29,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 00:16:29,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:16:29,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:16:37,770 INFO [train.py:1039] (3/4) Epoch 16, batch 650, loss[loss=0.1871, simple_loss=0.2699, pruned_loss=0.05216, over 24590.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2604, pruned_loss=0.05582, over 4546317.70 frames. ], batch size: 68, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:16:37,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:16:41,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:16:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:16:43,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:44,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:16:46,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:16:46,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=535546.6666666666, ans=0.0 2023-09-30 00:16:49,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 00:16:49,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:16:55,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:16:55,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:16:56,253 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=535613.3333333334, ans=0.125 2023-09-30 00:16:58,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:02,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 00:17:04,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:05,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:09,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:09,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:17:12,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten.whitening_limit, batch_count=535680.0, ans=15.0 2023-09-30 00:17:12,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:14,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:16,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:17:16,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:18,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:17:20,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:17:20,112 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 00:17:20,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:20,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:24,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:26,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:26,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:27,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:17:28,108 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=535746.6666666666, ans=0.125 2023-09-30 00:17:29,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 00:17:29,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:17:29,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:17:31,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:17:31,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:17:32,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:17:34,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 00:17:35,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 00:17:35,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:35,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:17:36,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:17:36,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:17:39,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:17:44,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=535813.3333333334, ans=0.0 2023-09-30 00:17:47,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:17:47,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:17:49,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:17:53,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:53,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:17:54,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:17:59,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:17:59,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:17:59,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:17:59,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:01,161 INFO [train.py:1039] (3/4) Epoch 16, batch 700, loss[loss=0.1622, simple_loss=0.2351, pruned_loss=0.04467, over 21580.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2586, pruned_loss=0.05529, over 4579398.00 frames. ], batch size: 47, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:18:05,521 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.452e+02 1.867e+02 2.174e+02 2.485e+02 3.899e+02, threshold=4.348e+02, percent-clipped=0.0 2023-09-30 00:18:05,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 00:18:07,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 00:18:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 00:18:11,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:12,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:18:14,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 00:18:19,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:18:22,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:18:23,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=535946.6666666666, ans=0.0 2023-09-30 00:18:25,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:25,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:18:27,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:18:29,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:18:31,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:18:31,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:18:32,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 00:18:37,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 00:18:41,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:18:41,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:18:44,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:18:45,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=536013.3333333334, ans=0.125 2023-09-30 00:18:49,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:18:49,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 00:18:54,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:18:56,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:18:56,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 00:19:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:19:01,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:04,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:08,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.82 vs. limit=15.0 2023-09-30 00:19:10,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:19:10,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 00:19:12,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=536146.6666666666, ans=0.0 2023-09-30 00:19:15,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 00:19:15,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 00:19:18,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:20,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:22,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:24,046 INFO [train.py:1039] (3/4) Epoch 16, batch 750, loss[loss=0.1939, simple_loss=0.2743, pruned_loss=0.05678, over 23564.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2588, pruned_loss=0.05546, over 4616707.94 frames. ], batch size: 94, lr: 6.54e-03, grad_scale: 8.0 2023-09-30 00:19:24,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:24,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 00:19:28,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 00:19:28,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 00:19:28,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 00:19:31,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 00:19:31,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 00:19:31,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:19:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 00:19:32,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:19:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:19:36,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:39,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:39,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:19:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:19:41,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:19:41,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=536280.0, ans=0.125 2023-09-30 00:19:42,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:19:42,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=536280.0, ans=0.125 2023-09-30 00:19:45,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:19:48,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:19:48,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:19:48,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 00:19:50,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:19:52,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:53,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:19:54,818 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.55 vs. limit=15.0 2023-09-30 00:19:55,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:19:57,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 00:19:57,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:19:59,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 00:19:59,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 00:20:00,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 00:20:00,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:20:02,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 00:20:05,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:20:11,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:20:12,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:12,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:20:14,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:20:17,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 00:20:17,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:20:19,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 00:20:19,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=536413.3333333334, ans=0.05 2023-09-30 00:20:20,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:20:23,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:20:23,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 00:20:25,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:30,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:20:32,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:20:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:34,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:20:38,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 00:20:38,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:40,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:40,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.30 vs. limit=15.0 2023-09-30 00:20:41,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:20:41,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=536480.0, ans=0.0 2023-09-30 00:20:43,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:44,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:46,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:20:47,959 INFO [train.py:1039] (3/4) Epoch 16, batch 800, loss[loss=0.1861, simple_loss=0.2553, pruned_loss=0.05847, over 23372.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2591, pruned_loss=0.0558, over 4636198.90 frames. ], batch size: 119, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:20:52,608 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.946e+02 2.133e+02 2.496e+02 4.467e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 00:20:54,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:20:54,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:55,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:20:55,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:20:57,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:20:58,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:20:59,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:02,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=536613.3333333334, ans=0.0 2023-09-30 00:21:04,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:05,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:21:09,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 00:21:10,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:13,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:21:13,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:21:13,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:13,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 00:21:13,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:15,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 00:21:17,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.74 vs. limit=22.5 2023-09-30 00:21:19,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:20,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.40 vs. limit=15.0 2023-09-30 00:21:22,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:21:25,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:21:26,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:21:28,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:28,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:32,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:21:32,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:21:34,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 00:21:37,368 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 00:21:37,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 00:21:37,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:21:37,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:21:39,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:21:39,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:21:45,772 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 00:21:45,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 00:21:48,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:21:51,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:21:53,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:21:58,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:21:59,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 00:21:59,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:22:02,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 00:22:06,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=536813.3333333334, ans=0.125 2023-09-30 00:22:08,785 INFO [train.py:1039] (3/4) Epoch 16, batch 850, loss[loss=0.1972, simple_loss=0.2769, pruned_loss=0.05877, over 24361.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2595, pruned_loss=0.05555, over 4673947.81 frames. ], batch size: 77, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:22:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:12,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:22:13,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 00:22:14,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:22:15,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:16,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=536880.0, ans=0.1 2023-09-30 00:22:17,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 00:22:17,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:18,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:22:20,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:20,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:22:24,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:22:25,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 00:22:25,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 00:22:25,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 00:22:27,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:22:28,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:22:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:30,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:22:30,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:22:35,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:35,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:22:35,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 00:22:38,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 00:22:40,783 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=537013.3333333334, ans=0.0 2023-09-30 00:22:43,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:22:44,223 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.95 vs. limit=15.0 2023-09-30 00:22:44,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 00:22:47,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 00:22:50,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 00:22:53,517 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 00:22:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:22:53,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:22:53,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:22:53,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=537013.3333333334, ans=0.0 2023-09-30 00:22:57,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:22:59,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 00:23:00,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:23:02,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:03,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:23:03,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:23:05,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:23:06,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 00:23:07,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 00:23:12,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:23:12,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:23:13,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:14,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:23:17,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:23:19,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:23:21,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:23:23,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:23,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:23:29,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:23:31,173 INFO [train.py:1039] (3/4) Epoch 16, batch 900, loss[loss=0.1664, simple_loss=0.2389, pruned_loss=0.0469, over 24328.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2605, pruned_loss=0.05673, over 4685544.99 frames. ], batch size: 56, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:23:31,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:23:31,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 00:23:31,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:31,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:23:33,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 00:23:36,675 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.611e+02 2.050e+02 2.390e+02 2.977e+02 4.145e+02, threshold=4.781e+02, percent-clipped=0.0 2023-09-30 00:23:38,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:23:41,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:23:42,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 00:23:46,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:23:46,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 00:23:48,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 00:23:48,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=537280.0, ans=0.1 2023-09-30 00:23:49,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:23:49,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:23:49,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:23:49,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:24:02,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:02,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:24:02,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:24:05,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:11,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 00:24:14,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:24:18,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:24:18,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:24:20,071 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 00:24:21,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 00:24:27,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:24:27,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:24:30,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:24:36,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:36,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:24:39,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 00:24:39,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:24:41,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 00:24:43,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:24:43,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:45,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:24:45,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:24:50,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 00:24:50,615 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 00:24:52,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:24:52,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 00:24:54,970 INFO [train.py:1039] (3/4) Epoch 16, batch 950, loss[loss=0.1795, simple_loss=0.264, pruned_loss=0.04751, over 24687.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2611, pruned_loss=0.05701, over 4682401.99 frames. ], batch size: 73, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:24:55,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:24:58,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=537546.6666666666, ans=0.125 2023-09-30 00:24:59,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 00:25:04,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:08,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:08,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:09,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:25:12,914 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 00:25:15,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:16,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:17,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:17,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:25:17,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 00:25:19,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:25:19,492 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:25:21,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:21,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 00:25:21,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:27,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:25:27,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:25:29,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 00:25:30,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 00:25:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:25:32,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=537680.0, ans=0.015 2023-09-30 00:25:34,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=537680.0, ans=0.125 2023-09-30 00:25:35,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:25:40,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:25:40,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:25:45,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 00:25:46,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:25:46,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:25:46,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=537746.6666666666, ans=0.0 2023-09-30 00:25:48,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:48,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:25:51,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=537746.6666666666, ans=0.125 2023-09-30 00:25:53,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 00:25:54,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:25:58,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:25:59,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:25:59,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 00:25:59,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:25:59,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:25:59,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 00:26:04,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:26:06,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=537813.3333333334, ans=0.0 2023-09-30 00:26:07,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:26:07,949 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:26:09,597 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=537813.3333333334, ans=0.125 2023-09-30 00:26:10,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 00:26:13,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 00:26:16,017 INFO [train.py:1039] (3/4) Epoch 16, batch 1000, loss[loss=0.2093, simple_loss=0.2717, pruned_loss=0.07341, over 23770.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2599, pruned_loss=0.05668, over 4689141.44 frames. ], batch size: 150, lr: 6.53e-03, grad_scale: 16.0 2023-09-30 00:26:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:26:20,694 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 2.021e+02 2.226e+02 2.513e+02 3.322e+02, threshold=4.453e+02, percent-clipped=0.0 2023-09-30 00:26:20,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 00:26:20,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:24,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:26:26,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 00:26:26,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 00:26:26,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=537880.0, ans=0.0 2023-09-30 00:26:31,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:31,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:26:33,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:34,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 00:26:39,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=537946.6666666666, ans=0.0 2023-09-30 00:26:40,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 00:26:42,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 00:26:43,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:26:46,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 00:26:46,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 00:26:46,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 00:26:49,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:26:50,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:26:57,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:26:59,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:27:01,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:01,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 00:27:01,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:02,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:27:03,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:27:04,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 00:27:06,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 00:27:07,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 00:27:09,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 00:27:09,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=538080.0, ans=0.125 2023-09-30 00:27:10,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:27:17,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:17,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:27:19,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:20,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:27:21,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 00:27:23,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:27:23,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 00:27:24,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 00:27:26,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:27:26,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:27:27,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:27:31,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:27:33,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:27:36,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:27:37,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:27:39,077 INFO [train.py:1039] (3/4) Epoch 16, batch 1050, loss[loss=0.1944, simple_loss=0.2593, pruned_loss=0.06472, over 23208.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2583, pruned_loss=0.05624, over 4689655.68 frames. ], batch size: 119, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:27:39,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:27:40,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:27:43,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:27:47,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:27:48,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:27:52,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:27:53,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:27:54,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:27:55,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:27:55,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 00:27:55,880 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=538280.0, ans=0.5 2023-09-30 00:27:57,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:27:58,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 00:27:58,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=538280.0, ans=0.2 2023-09-30 00:28:00,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:28:01,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 00:28:01,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:28:07,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:28:09,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:28:09,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:28:12,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 00:28:12,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 00:28:12,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:28:15,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 00:28:19,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 00:28:19,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:23,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:28:27,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:28:27,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:28:28,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:28:31,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:28:36,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 00:28:37,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 00:28:38,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 00:28:38,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:39,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:28:40,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 00:28:40,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=538413.3333333334, ans=0.1 2023-09-30 00:28:43,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=538480.0, ans=0.125 2023-09-30 00:28:45,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:28:47,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:28:47,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:28:49,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:49,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:53,537 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.87 vs. limit=22.5 2023-09-30 00:28:54,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:28:54,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 00:28:54,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.19 vs. limit=22.5 2023-09-30 00:28:55,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:28:55,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 00:28:57,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 00:28:57,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:29:00,863 INFO [train.py:1039] (3/4) Epoch 16, batch 1100, loss[loss=0.1583, simple_loss=0.2327, pruned_loss=0.04198, over 24338.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2573, pruned_loss=0.05539, over 4693617.80 frames. ], batch size: 56, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:29:01,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:06,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.606e+02 1.914e+02 2.113e+02 2.523e+02 4.579e+02, threshold=4.227e+02, percent-clipped=1.0 2023-09-30 00:29:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:29:14,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:29:15,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:29:15,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:17,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 00:29:19,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:29:21,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 00:29:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:29:25,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:29:25,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 00:29:25,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=538613.3333333334, ans=0.2 2023-09-30 00:29:27,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:29:27,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:29:29,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:29:29,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=538613.3333333334, ans=0.125 2023-09-30 00:29:31,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:29:34,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:29:36,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=538680.0, ans=0.125 2023-09-30 00:29:39,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:29:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 00:29:44,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=538680.0, ans=0.0 2023-09-30 00:29:45,120 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=10.20 vs. limit=15.0 2023-09-30 00:29:45,783 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 00:29:45,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:47,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:49,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:29:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:29:51,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 00:29:52,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:29:52,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:29:52,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:29:52,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:29:54,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 00:29:59,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:29:59,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 00:29:59,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=538746.6666666666, ans=0.1 2023-09-30 00:30:02,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:30:08,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:30:12,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 00:30:12,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 00:30:14,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:15,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:15,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:18,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 00:30:19,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:30:19,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:30:20,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.33 vs. limit=22.5 2023-09-30 00:30:21,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 00:30:21,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:30:22,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 00:30:24,581 INFO [train.py:1039] (3/4) Epoch 16, batch 1150, loss[loss=0.183, simple_loss=0.2639, pruned_loss=0.05106, over 24447.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2587, pruned_loss=0.05568, over 4702166.81 frames. ], batch size: 69, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:30:24,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:30:24,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:30:25,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=538880.0, ans=0.0 2023-09-30 00:30:26,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:30:31,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:34,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:30:36,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:30:36,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:30:36,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 00:30:36,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:30:39,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 00:30:39,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:39,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:30:46,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 00:30:47,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:30:53,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:30:53,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=538946.6666666666, ans=0.0 2023-09-30 00:30:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:30:54,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 00:30:56,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:30:56,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:31:01,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 00:31:01,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:02,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:31:13,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:13,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.min_positive, batch_count=539080.0, ans=0.05 2023-09-30 00:31:19,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:31:19,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 00:31:21,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:21,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 00:31:29,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:31:29,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=539146.6666666666, ans=0.125 2023-09-30 00:31:37,370 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 00:31:41,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:42,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:31:42,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:31:44,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:31:46,372 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.95 vs. limit=15.0 2023-09-30 00:31:47,051 INFO [train.py:1039] (3/4) Epoch 16, batch 1200, loss[loss=0.1913, simple_loss=0.2552, pruned_loss=0.06374, over 23440.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2599, pruned_loss=0.05617, over 4701987.94 frames. ], batch size: 285, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:31:47,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.97 vs. limit=15.0 2023-09-30 00:31:48,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:31:53,134 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.828e+02 2.089e+02 2.357e+02 3.548e+02, threshold=4.177e+02, percent-clipped=0.0 2023-09-30 00:31:55,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:31:55,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:31:57,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:31:57,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:31:58,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:32:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:32:01,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=539213.3333333334, ans=0.125 2023-09-30 00:32:02,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:32:03,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:03,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:04,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=539280.0, ans=0.1 2023-09-30 00:32:07,046 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 00:32:10,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 00:32:13,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:32:16,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:32:19,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:20,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=539346.6666666666, ans=0.025 2023-09-30 00:32:22,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:32:22,140 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 00:32:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:27,539 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.67 vs. limit=15.0 2023-09-30 00:32:28,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 00:32:29,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:32:29,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 00:32:29,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:32:31,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=539346.6666666666, ans=0.125 2023-09-30 00:32:34,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 00:32:38,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 00:32:40,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:32:40,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=539413.3333333334, ans=0.2 2023-09-30 00:32:41,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:32:43,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=539413.3333333334, ans=0.0 2023-09-30 00:32:44,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:44,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:32:46,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:32:46,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:32:47,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:32:48,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 00:32:48,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:32:48,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:32:48,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:32:50,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:32:51,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:32:51,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=539413.3333333334, ans=0.07 2023-09-30 00:32:55,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:32:57,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:32:57,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=539480.0, ans=0.0 2023-09-30 00:33:01,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 00:33:04,977 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 00:33:09,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:10,642 INFO [train.py:1039] (3/4) Epoch 16, batch 1250, loss[loss=0.1632, simple_loss=0.2367, pruned_loss=0.04488, over 24276.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2598, pruned_loss=0.0556, over 4709969.87 frames. ], batch size: 56, lr: 6.52e-03, grad_scale: 16.0 2023-09-30 00:33:12,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:33:13,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:33:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:33:17,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 00:33:22,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:33:22,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:22,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 00:33:22,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=9.05 vs. limit=15.0 2023-09-30 00:33:23,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:33:25,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:33:30,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 00:33:30,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:32,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:33:32,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:35,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:33:35,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=539613.3333333334, ans=0.125 2023-09-30 00:33:36,866 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=539613.3333333334, ans=0.125 2023-09-30 00:33:38,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 00:33:38,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:33:38,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:33:42,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:33:43,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:46,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:33:48,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:33:52,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 00:33:53,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:33:55,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:33:56,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 00:33:58,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:33:58,298 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 00:33:58,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:33:58,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:01,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:01,825 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=539746.6666666666, ans=0.1 2023-09-30 00:34:05,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=539746.6666666666, ans=0.125 2023-09-30 00:34:06,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:34:06,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:34:08,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 00:34:08,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 00:34:08,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 00:34:11,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:11,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=539746.6666666666, ans=0.0 2023-09-30 00:34:14,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 00:34:14,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:14,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=539746.6666666666, ans=0.125 2023-09-30 00:34:15,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:34:15,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:34:18,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 00:34:18,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 00:34:18,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:34:19,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 00:34:20,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:34:23,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 00:34:27,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:27,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=539813.3333333334, ans=0.125 2023-09-30 00:34:28,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:34:29,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:34:31,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:34:33,155 INFO [train.py:1039] (3/4) Epoch 16, batch 1300, loss[loss=0.1654, simple_loss=0.2418, pruned_loss=0.04451, over 19921.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2605, pruned_loss=0.05609, over 4705688.55 frames. ], batch size: 43, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:34:36,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:34:36,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 00:34:39,913 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.902e+02 2.089e+02 2.370e+02 3.462e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 00:34:41,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:34:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 00:34:43,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:34:46,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:34:48,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:34:48,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 00:34:53,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:34:53,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:34:55,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 00:35:00,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:35:03,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:05,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:06,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:35:08,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:08,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:35:09,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 00:35:09,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 00:35:11,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=540013.3333333334, ans=0.125 2023-09-30 00:35:16,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:35:17,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:35:19,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 00:35:19,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 00:35:22,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:35:25,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:35:26,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 00:35:26,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:26,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 00:35:27,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=540080.0, ans=0.125 2023-09-30 00:35:28,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:35:31,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=540080.0, ans=0.0 2023-09-30 00:35:33,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:35:33,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:35:36,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 00:35:38,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 00:35:39,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 00:35:42,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:35:45,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 00:35:47,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:35:51,388 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=8.15 vs. limit=15.0 2023-09-30 00:35:52,994 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=540146.6666666666, ans=0.125 2023-09-30 00:35:54,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 00:35:56,317 INFO [train.py:1039] (3/4) Epoch 16, batch 1350, loss[loss=0.1808, simple_loss=0.2671, pruned_loss=0.04721, over 24421.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2591, pruned_loss=0.05583, over 4698414.59 frames. ], batch size: 69, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:35:59,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:02,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:06,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:36:07,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:08,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:36:09,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:12,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:36:12,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=540280.0, ans=0.0 2023-09-30 00:36:13,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 00:36:15,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:16,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:36:18,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 00:36:19,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:36:21,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:36:21,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 00:36:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 00:36:26,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=540346.6666666666, ans=0.125 2023-09-30 00:36:28,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 00:36:29,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:29,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 00:36:42,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:44,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=540413.3333333334, ans=0.1 2023-09-30 00:36:50,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=540413.3333333334, ans=0.0 2023-09-30 00:36:52,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:36:52,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:52,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 00:36:55,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:36:58,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 00:36:58,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 00:36:58,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:36:58,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=540480.0, ans=0.0 2023-09-30 00:37:02,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:37:04,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 00:37:07,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:37:13,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 00:37:14,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 00:37:17,855 INFO [train.py:1039] (3/4) Epoch 16, batch 1400, loss[loss=0.1792, simple_loss=0.2679, pruned_loss=0.04529, over 24423.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2588, pruned_loss=0.05501, over 4714628.11 frames. ], batch size: 69, lr: 6.51e-03, grad_scale: 16.0 2023-09-30 00:37:19,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 00:37:22,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:37:23,989 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.643e+02 1.842e+02 1.998e+02 2.370e+02 3.291e+02, threshold=3.996e+02, percent-clipped=0.0 2023-09-30 00:37:24,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:37:24,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:37:31,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 00:37:33,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 00:37:44,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:37:45,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:37:49,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:37:49,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 00:37:54,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:37:55,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 00:38:03,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:04,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:09,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 00:38:10,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:38:12,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:38:12,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:38:13,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:15,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:38:15,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:38:15,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:38:15,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=540746.6666666666, ans=0.07 2023-09-30 00:38:16,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 00:38:16,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:38:18,177 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-09-30 00:38:19,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=540746.6666666666, ans=0.125 2023-09-30 00:38:22,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:25,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:38:33,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 00:38:33,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 00:38:34,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:38:36,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 00:38:38,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:39,954 INFO [train.py:1039] (3/4) Epoch 16, batch 1450, loss[loss=0.1769, simple_loss=0.2613, pruned_loss=0.04629, over 24491.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2582, pruned_loss=0.05488, over 4705411.39 frames. ], batch size: 66, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:38:40,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:38:43,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:38:45,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:38:45,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:45,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 00:38:47,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=540880.0, ans=0.1 2023-09-30 00:38:50,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:38:51,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:38:52,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=540880.0, ans=0.0 2023-09-30 00:38:53,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:38:54,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 00:38:55,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:38:57,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 00:38:57,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:38:58,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:38:58,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 00:39:00,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:00,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:39:01,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 00:39:01,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:03,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:39:04,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:07,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:11,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:39:11,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:39:13,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:39:13,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:14,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:39:14,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:39:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:39:16,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:19,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=541013.3333333334, ans=0.0 2023-09-30 00:39:21,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 00:39:24,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:39:28,506 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 00:39:30,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:30,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:39:31,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:33,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 00:39:37,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:39,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 00:39:40,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 00:39:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:39:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:39:44,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=541146.6666666666, ans=0.2 2023-09-30 00:39:45,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:39:48,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 00:39:48,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.min_positive, batch_count=541146.6666666666, ans=0.025 2023-09-30 00:39:51,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 00:39:51,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 00:39:52,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:39:55,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:40:02,489 INFO [train.py:1039] (3/4) Epoch 16, batch 1500, loss[loss=0.2113, simple_loss=0.2851, pruned_loss=0.06875, over 23391.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2582, pruned_loss=0.05501, over 4708928.32 frames. ], batch size: 105, lr: 6.51e-03, grad_scale: 8.0 2023-09-30 00:40:06,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 00:40:07,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:40:07,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:40:09,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:10,586 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.398e+02 1.908e+02 2.053e+02 2.386e+02 4.299e+02, threshold=4.105e+02, percent-clipped=2.0 2023-09-30 00:40:10,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:10,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:40:12,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 00:40:13,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:40:13,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:40:13,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:40:15,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:40:18,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:40:18,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:40:23,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 00:40:24,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=541280.0, ans=0.125 2023-09-30 00:40:25,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:40:26,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:40:26,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:29,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 00:40:35,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 00:40:37,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:40:39,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 00:40:41,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:40:43,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:40:45,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:40:45,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:40:45,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=541346.6666666666, ans=0.125 2023-09-30 00:40:46,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 00:40:47,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:40:47,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:48,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 00:40:48,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:40:50,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=541413.3333333334, ans=0.0 2023-09-30 00:40:54,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:40:54,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 00:41:00,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=541413.3333333334, ans=0.1 2023-09-30 00:41:01,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 00:41:03,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:41:08,736 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 00:41:08,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:08,853 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 00:41:10,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:11,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:13,786 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 00:41:15,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:41:18,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 00:41:19,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:22,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:22,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:23,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:41:24,351 INFO [train.py:1039] (3/4) Epoch 16, batch 1550, loss[loss=0.2612, simple_loss=0.3162, pruned_loss=0.1031, over 19090.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2585, pruned_loss=0.05521, over 4706002.74 frames. ], batch size: 388, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:41:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:41:24,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:41:26,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 00:41:26,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 00:41:26,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:41:27,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 00:41:27,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 00:41:30,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:32,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:34,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:41:34,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:41:35,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:35,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:41:39,573 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 00:41:39,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:41,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:41:41,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 00:41:42,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=541613.3333333334, ans=0.125 2023-09-30 00:41:42,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.35 vs. limit=6.0 2023-09-30 00:41:44,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:41:44,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 00:41:46,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:41:46,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 00:41:47,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 00:41:47,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 00:41:49,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:41:49,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:41:50,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=541613.3333333334, ans=0.1 2023-09-30 00:41:54,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:41:55,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 00:41:55,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 00:42:06,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:10,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:42:10,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 00:42:10,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:42:12,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 00:42:17,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 00:42:18,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:22,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:42:25,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:42:25,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:42:25,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 00:42:27,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:28,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:42:28,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:30,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:42:30,379 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 00:42:33,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:38,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 00:42:44,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:45,942 INFO [train.py:1039] (3/4) Epoch 16, batch 1600, loss[loss=0.1736, simple_loss=0.2604, pruned_loss=0.04333, over 24624.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2596, pruned_loss=0.05582, over 4713457.77 frames. ], batch size: 68, lr: 6.50e-03, grad_scale: 16.0 2023-09-30 00:42:46,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:42:46,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 00:42:46,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:42:48,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:42:48,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:42:48,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:42:49,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:42:53,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:42:54,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.627e+02 1.801e+02 1.974e+02 2.195e+02 3.172e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 00:42:54,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 00:42:56,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 00:42:57,108 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.88 vs. limit=15.0 2023-09-30 00:42:59,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 00:43:02,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:04,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 00:43:04,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:07,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:43:11,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:43:13,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 00:43:16,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:43:17,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 00:43:17,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:19,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 00:43:25,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 00:43:33,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:33,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 00:43:34,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:43:35,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:43:35,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:43:38,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 00:43:41,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 00:43:42,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:43:44,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:44,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:43:47,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:43:48,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:43:50,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:43:57,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:43:59,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:43:59,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=542146.6666666666, ans=0.1 2023-09-30 00:44:02,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 00:44:02,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:44:03,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 00:44:06,867 INFO [train.py:1039] (3/4) Epoch 16, batch 1650, loss[loss=0.1841, simple_loss=0.2577, pruned_loss=0.05527, over 24455.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2611, pruned_loss=0.05681, over 4709682.70 frames. ], batch size: 63, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:44:10,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:11,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:11,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:44:11,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 00:44:11,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 00:44:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 00:44:11,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 00:44:14,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:44:15,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:16,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:44:16,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:44:19,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:21,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 00:44:22,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.93 vs. limit=15.0 2023-09-30 00:44:22,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:44:24,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:44:24,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:44:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:44:27,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 00:44:27,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 00:44:27,949 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=10.11 vs. limit=15.0 2023-09-30 00:44:31,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=542280.0, ans=0.5 2023-09-30 00:44:32,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:44:35,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 00:44:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 00:44:42,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:42,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=542346.6666666666, ans=0.1 2023-09-30 00:44:45,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 00:44:46,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=14.85 vs. limit=15.0 2023-09-30 00:44:47,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:44:50,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:44:51,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:44:51,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:44:53,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:44:53,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:56,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:44:56,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:44:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:44:58,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:00,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:01,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:45:05,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:45:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 00:45:07,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:45:08,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 00:45:11,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 00:45:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 00:45:11,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:45:12,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:45:12,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:12,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:45:12,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 00:45:17,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:45:18,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:45:18,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:21,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 00:45:26,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:45:26,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:45:26,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 00:45:28,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:28,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:45:28,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:30,572 INFO [train.py:1039] (3/4) Epoch 16, batch 1700, loss[loss=0.1826, simple_loss=0.2272, pruned_loss=0.06894, over 19483.00 frames. ], tot_loss[loss=0.1875, simple_loss=0.2615, pruned_loss=0.05672, over 4695900.11 frames. ], batch size: 388, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:45:32,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:45:33,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:45:33,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 00:45:37,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:45:40,402 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.927e+02 2.213e+02 2.603e+02 4.204e+02, threshold=4.426e+02, percent-clipped=1.0 2023-09-30 00:45:45,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:45:48,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:45:53,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:45:53,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:45:55,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:45:55,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:45:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 00:46:01,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:46:01,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:02,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:46:03,203 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:46:04,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 00:46:07,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 00:46:08,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 00:46:09,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=542680.0, ans=0.0 2023-09-30 00:46:10,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:12,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 00:46:13,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:46:20,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:22,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:22,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:46:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 00:46:24,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 00:46:24,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:46:27,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:27,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 00:46:28,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:46:28,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:28,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:46:28,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:28,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=542746.6666666666, ans=10.0 2023-09-30 00:46:31,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_na.min_abs, batch_count=542746.6666666666, ans=0.02 2023-09-30 00:46:32,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:46:32,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:46:33,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:33,265 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:46:35,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:46:35,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:40,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:42,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 00:46:45,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:46:45,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:46:47,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 00:46:51,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=542880.0, ans=0.125 2023-09-30 00:46:53,071 INFO [train.py:1039] (3/4) Epoch 16, batch 1750, loss[loss=0.1728, simple_loss=0.2568, pruned_loss=0.04441, over 24503.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2608, pruned_loss=0.05638, over 4707204.38 frames. ], batch size: 66, lr: 6.50e-03, grad_scale: 8.0 2023-09-30 00:46:53,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:46:58,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:46:58,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 00:46:58,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 00:46:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:47:01,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:47:01,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:06,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 00:47:06,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=542880.0, ans=0.125 2023-09-30 00:47:08,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:10,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 00:47:10,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:11,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:47:15,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:47:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 00:47:18,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:47:19,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 00:47:28,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:47:32,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:47:32,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:35,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:35,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:47:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:47:38,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:43,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:43,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:47:44,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 00:47:47,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:47:50,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 00:47:50,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:47:51,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:47:53,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:47:57,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:47:57,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 00:47:57,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:47:59,925 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.80 vs. limit=15.0 2023-09-30 00:48:00,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:48:01,623 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.91 vs. limit=22.5 2023-09-30 00:48:03,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:05,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=543146.6666666666, ans=0.125 2023-09-30 00:48:06,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:08,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:48:08,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 00:48:08,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:10,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 00:48:10,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:10,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 00:48:10,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:48:11,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:48:14,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:48:16,441 INFO [train.py:1039] (3/4) Epoch 16, batch 1800, loss[loss=0.1858, simple_loss=0.2666, pruned_loss=0.05249, over 24479.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2597, pruned_loss=0.05591, over 4719190.49 frames. ], batch size: 63, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:48:17,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:48:19,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:48:21,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:48:22,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.34 vs. limit=12.0 2023-09-30 00:48:26,084 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.882e+02 2.134e+02 2.523e+02 4.257e+02, threshold=4.267e+02, percent-clipped=0.0 2023-09-30 00:48:26,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 00:48:26,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:48:26,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=543213.3333333334, ans=0.0 2023-09-30 00:48:28,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=543213.3333333334, ans=0.2 2023-09-30 00:48:31,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:34,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:36,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:48:39,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:48:39,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 00:48:41,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:43,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.07 vs. limit=15.0 2023-09-30 00:48:44,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:48:47,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=543346.6666666666, ans=0.1 2023-09-30 00:48:48,724 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 00:48:50,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 00:48:50,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 00:48:50,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:48:52,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:48:52,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:48:52,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:49:00,850 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 00:49:03,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:49:05,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:08,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 00:49:08,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 00:49:08,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 00:49:09,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:49:11,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:49:12,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=543413.3333333334, ans=0.05 2023-09-30 00:49:12,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=543413.3333333334, ans=0.0 2023-09-30 00:49:16,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 00:49:23,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:49:23,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=543480.0, ans=0.0 2023-09-30 00:49:24,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 00:49:24,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:49:24,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:24,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:49:26,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 00:49:27,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=543480.0, ans=0.1 2023-09-30 00:49:29,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:49:29,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:49:34,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 00:49:34,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:49:37,837 INFO [train.py:1039] (3/4) Epoch 16, batch 1850, loss[loss=0.1964, simple_loss=0.2612, pruned_loss=0.0658, over 22825.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2599, pruned_loss=0.05553, over 4726957.59 frames. ], batch size: 322, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:49:37,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:37,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:49:37,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:49:39,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:49:42,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:49:42,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:49:46,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:49:48,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:49:48,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=543546.6666666666, ans=0.0 2023-09-30 00:49:55,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:49:55,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 00:49:59,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 00:50:02,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 00:50:06,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:06,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 00:50:06,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 00:50:18,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:50:20,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 00:50:22,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=543680.0, ans=0.0 2023-09-30 00:50:23,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:50:24,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:28,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 00:50:29,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:29,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:50:29,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:50:30,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 00:50:33,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:50:37,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:50:37,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:38,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 00:50:38,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:50:40,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:42,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:50:45,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 00:50:46,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:50:50,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:50:50,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 00:50:50,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 00:50:50,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 00:50:52,879 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 00:50:54,880 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 00:50:56,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:50:56,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:50:56,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:50:57,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:50:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 00:50:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:50:59,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:00,951 INFO [train.py:1039] (3/4) Epoch 16, batch 1900, loss[loss=0.1626, simple_loss=0.2368, pruned_loss=0.04422, over 24318.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2603, pruned_loss=0.05578, over 4717800.04 frames. ], batch size: 56, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:51:01,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:51:02,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:51:02,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:51:02,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 00:51:05,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:05,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 00:51:05,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 00:51:07,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:10,289 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.936e+02 2.154e+02 2.566e+02 3.893e+02, threshold=4.308e+02, percent-clipped=0.0 2023-09-30 00:51:12,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:51:15,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 00:51:16,984 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 00:51:17,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 00:51:19,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 00:51:20,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:51:20,689 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 00:51:22,080 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 00:51:25,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 00:51:27,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:51:31,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 00:51:34,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 00:51:36,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.46 vs. limit=15.0 2023-09-30 00:51:45,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 00:51:46,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 00:51:46,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:51:48,325 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 00:51:48,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 00:51:48,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 00:51:49,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 00:51:49,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:51:54,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 00:51:58,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:52:00,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:00,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 00:52:02,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:52:05,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 00:52:05,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:09,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=544146.6666666666, ans=0.0 2023-09-30 00:52:12,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 00:52:12,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:52:12,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:52:12,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:52:13,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 00:52:13,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 00:52:15,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:52:18,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:18,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:21,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:52:21,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:52:21,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 00:52:23,322 INFO [train.py:1039] (3/4) Epoch 16, batch 1950, loss[loss=0.1936, simple_loss=0.2728, pruned_loss=0.05717, over 24121.00 frames. ], tot_loss[loss=0.1866, simple_loss=0.2609, pruned_loss=0.0562, over 4718844.36 frames. ], batch size: 80, lr: 6.49e-03, grad_scale: 8.0 2023-09-30 00:52:23,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:52:26,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:30,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:52:30,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:30,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:52:31,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 00:52:33,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:52:33,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:35,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:37,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:52:37,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:37,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:40,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:52:45,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:52:45,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:52:45,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 00:52:45,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:47,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=544280.0, ans=0.0 2023-09-30 00:52:49,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:52:53,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:52:53,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:52:54,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 00:52:54,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 00:52:54,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:52:55,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:52:56,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:52:59,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:01,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=544346.6666666666, ans=0.0 2023-09-30 00:53:02,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:53:09,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 00:53:14,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:53:15,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:15,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 00:53:16,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:19,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:53:21,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:53:22,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:29,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:30,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:32,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:34,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:37,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:53:37,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:53:39,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 00:53:39,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 00:53:41,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:53:41,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 00:53:44,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:53:46,234 INFO [train.py:1039] (3/4) Epoch 16, batch 2000, loss[loss=0.1588, simple_loss=0.238, pruned_loss=0.03984, over 24486.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2614, pruned_loss=0.05659, over 4706330.37 frames. ], batch size: 63, lr: 6.49e-03, grad_scale: 16.0 2023-09-30 00:53:47,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:53:49,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:53:49,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:53:51,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:53:54,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:53:55,958 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.874e+02 2.052e+02 2.476e+02 4.888e+02, threshold=4.104e+02, percent-clipped=2.0 2023-09-30 00:53:57,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 00:53:57,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 00:53:58,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=544546.6666666666, ans=0.125 2023-09-30 00:54:00,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:54:03,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 00:54:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 00:54:05,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:54:08,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:54:10,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 00:54:12,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:13,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:16,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 00:54:16,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 00:54:17,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 00:54:17,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:17,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=544680.0, ans=0.125 2023-09-30 00:54:20,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:54:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 00:54:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:22,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:24,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 00:54:26,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 00:54:26,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:54:26,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:29,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=544680.0, ans=0.2 2023-09-30 00:54:33,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:35,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:54:35,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:35,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=544746.6666666666, ans=0.1 2023-09-30 00:54:36,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:54:39,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:54:41,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:41,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:54:41,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:54:42,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:54:45,606 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=7.02 vs. limit=15.0 2023-09-30 00:54:47,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:54:47,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 00:54:52,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:54:52,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:54:57,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:55:02,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:02,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:02,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=544813.3333333334, ans=0.125 2023-09-30 00:55:03,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:55:03,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 00:55:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:07,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:08,441 INFO [train.py:1039] (3/4) Epoch 16, batch 2050, loss[loss=0.1738, simple_loss=0.2373, pruned_loss=0.05515, over 23594.00 frames. ], tot_loss[loss=0.1867, simple_loss=0.2598, pruned_loss=0.0568, over 4689940.29 frames. ], batch size: 256, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:55:09,700 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.73 vs. limit=5.0 2023-09-30 00:55:10,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:55:11,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:17,618 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.21 vs. limit=15.0 2023-09-30 00:55:18,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:55:21,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:55:23,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:55:23,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:55:23,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=544946.6666666666, ans=0.0 2023-09-30 00:55:24,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 00:55:24,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:55:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:55:26,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:55:32,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=544946.6666666666, ans=0.0 2023-09-30 00:55:38,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:38,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:39,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 00:55:40,017 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 00:55:42,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:55:42,999 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=545013.3333333334, ans=0.0 2023-09-30 00:55:44,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 00:55:44,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:55:47,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:49,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:55:49,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=545013.3333333334, ans=0.0 2023-09-30 00:55:51,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 00:55:52,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:55:54,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:55:54,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:55:54,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 00:56:00,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:01,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 00:56:03,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 00:56:04,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:08,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:13,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:56:14,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 00:56:19,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:21,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:56:23,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 00:56:26,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 00:56:31,384 INFO [train.py:1039] (3/4) Epoch 16, batch 2100, loss[loss=0.1928, simple_loss=0.2662, pruned_loss=0.05966, over 23385.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.259, pruned_loss=0.05619, over 4709367.69 frames. ], batch size: 106, lr: 6.48e-03, grad_scale: 16.0 2023-09-30 00:56:31,440 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 00:56:31,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:31,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:56:33,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:34,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:56:34,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 00:56:34,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 00:56:37,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 00:56:41,271 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.889e+02 2.067e+02 2.438e+02 3.667e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 00:56:41,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 00:56:41,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:56:42,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.99 vs. limit=22.5 2023-09-30 00:56:44,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:56:46,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:56:46,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 00:56:47,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 00:56:47,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 00:56:47,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 00:56:49,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:56:50,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:56:50,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 00:56:50,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 00:56:55,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 00:56:55,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 00:56:58,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:56:58,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:57:04,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:57:05,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 00:57:06,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:06,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 00:57:08,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 00:57:09,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:09,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 00:57:11,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 00:57:11,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 00:57:11,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=545346.6666666666, ans=0.125 2023-09-30 00:57:14,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:57:16,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:57:17,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.91 vs. limit=15.0 2023-09-30 00:57:19,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:20,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 00:57:22,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:22,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=545413.3333333334, ans=0.0 2023-09-30 00:57:23,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:23,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 00:57:23,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:23,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:24,762 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.62 vs. limit=15.0 2023-09-30 00:57:25,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:25,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 00:57:26,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 00:57:28,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 00:57:30,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 00:57:34,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 00:57:34,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 00:57:38,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=545480.0, ans=0.0 2023-09-30 00:57:41,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:44,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 00:57:46,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:57:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:57:46,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 00:57:46,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:57:48,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:57:48,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:57:49,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 00:57:50,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:57:50,439 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=545480.0, ans=0.125 2023-09-30 00:57:51,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 00:57:53,083 INFO [train.py:1039] (3/4) Epoch 16, batch 2150, loss[loss=0.1912, simple_loss=0.2608, pruned_loss=0.06076, over 23404.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2571, pruned_loss=0.05504, over 4710448.51 frames. ], batch size: 285, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:57:53,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 00:57:53,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:57:56,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=545546.6666666666, ans=0.05 2023-09-30 00:57:57,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:57:57,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 00:57:57,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 00:57:58,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:58:05,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 00:58:06,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:08,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:09,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:58:09,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:11,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:58:15,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:15,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 00:58:15,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 00:58:18,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:18,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 00:58:23,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:25,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 00:58:28,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:28,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:28,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:29,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:58:29,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:58:29,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 00:58:29,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 00:58:31,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 00:58:32,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 00:58:33,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:33,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.16 vs. limit=22.5 2023-09-30 00:58:34,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:35,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 00:58:37,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 00:58:38,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:58:39,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 00:58:40,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:58:40,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 00:58:40,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 00:58:44,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:44,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:46,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 00:58:46,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=545746.6666666666, ans=0.0 2023-09-30 00:58:48,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 00:58:49,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:51,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:58:51,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 00:58:52,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 00:58:52,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 00:58:54,090 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 00:58:54,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:58:54,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:58:55,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 00:58:55,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 00:58:55,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 00:58:55,800 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 00:58:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 00:58:57,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 00:58:59,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:01,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 00:59:01,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 00:59:02,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:03,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 00:59:04,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:04,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=545813.3333333334, ans=0.0 2023-09-30 00:59:05,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:13,272 INFO [train.py:1039] (3/4) Epoch 16, batch 2200, loss[loss=0.1984, simple_loss=0.2677, pruned_loss=0.06458, over 23625.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2577, pruned_loss=0.05514, over 4712118.97 frames. ], batch size: 256, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 00:59:13,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 00:59:13,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 00:59:17,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 00:59:24,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:24,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 00:59:24,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 00:59:25,922 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.877e+02 2.145e+02 2.548e+02 4.503e+02, threshold=4.290e+02, percent-clipped=1.0 2023-09-30 00:59:26,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 00:59:27,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.18 vs. limit=15.0 2023-09-30 00:59:27,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 00:59:29,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 00:59:29,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 00:59:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 00:59:36,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 00:59:42,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 00:59:44,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:45,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 00:59:45,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 00:59:49,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 00:59:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 00:59:53,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 00:59:53,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=546013.3333333334, ans=0.125 2023-09-30 00:59:54,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 00:59:56,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 00:59:59,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:00:00,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:03,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:00:04,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:04,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=546080.0, ans=0.1 2023-09-30 01:00:07,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 01:00:09,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:09,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 01:00:09,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=546080.0, ans=10.0 2023-09-30 01:00:10,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:12,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:00:12,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:13,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:00:13,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:00:13,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:00:15,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:00:16,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:00:18,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:00:22,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 01:00:23,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:00:26,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:00:28,099 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 01:00:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:00:31,166 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 01:00:32,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:00:32,911 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 01:00:34,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:34,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:00:37,314 INFO [train.py:1039] (3/4) Epoch 16, batch 2250, loss[loss=0.1595, simple_loss=0.2431, pruned_loss=0.03794, over 24549.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.258, pruned_loss=0.05567, over 4705781.88 frames. ], batch size: 60, lr: 6.48e-03, grad_scale: 8.0 2023-09-30 01:00:37,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:00:39,526 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 01:00:41,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:00:41,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=546213.3333333334, ans=0.0 2023-09-30 01:00:42,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:48,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:00:51,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:00:53,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:00:55,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:00:55,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:00:58,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 01:00:58,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:00:58,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:01:02,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 01:01:03,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:01:03,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:04,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:01:08,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:10,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:01:10,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:01:12,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 01:01:14,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:01:15,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:01:20,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:21,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:01:24,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:01:24,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:01:26,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:01:27,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:01:32,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:01:34,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:01:40,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:01:40,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:01:40,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:01:50,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:01:52,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:01:52,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 01:01:52,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:52,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:01:55,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 01:01:57,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:01:57,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:01:59,851 INFO [train.py:1039] (3/4) Epoch 16, batch 2300, loss[loss=0.2218, simple_loss=0.2856, pruned_loss=0.07897, over 22965.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2598, pruned_loss=0.05611, over 4715051.09 frames. ], batch size: 323, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:02:06,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:06,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:02:10,579 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 01:02:11,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.854e+02 2.032e+02 2.223e+02 2.869e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 01:02:12,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:18,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=546613.3333333334, ans=0.125 2023-09-30 01:02:18,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=546613.3333333334, ans=0.125 2023-09-30 01:02:19,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:02:19,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:02:19,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:20,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:20,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 01:02:21,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:02:23,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:24,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:02:27,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.63 vs. limit=15.0 2023-09-30 01:02:29,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:02:31,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:02:34,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:40,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:02:41,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:02:43,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:02:46,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:02:50,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:02:51,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:02:52,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:02:52,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 01:02:57,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:02:57,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:02:57,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:02:57,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:02:57,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:02:59,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:02:59,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:02:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 01:03:01,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:03:01,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:01,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 01:03:06,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:03:09,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:03:14,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:03:14,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:03:16,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:03:18,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:03:18,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:03:19,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:03:21,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 01:03:22,718 INFO [train.py:1039] (3/4) Epoch 16, batch 2350, loss[loss=0.2045, simple_loss=0.2773, pruned_loss=0.06585, over 23342.00 frames. ], tot_loss[loss=0.1879, simple_loss=0.2611, pruned_loss=0.05736, over 4690115.12 frames. ], batch size: 119, lr: 6.47e-03, grad_scale: 8.0 2023-09-30 01:03:26,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:03:26,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 01:03:30,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 01:03:33,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:03:37,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:03:37,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:39,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:03:40,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 01:03:45,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:03:51,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 01:03:54,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:03:56,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=547013.3333333334, ans=0.125 2023-09-30 01:03:59,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:03:59,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:04:00,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:04:01,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 01:04:02,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:04:05,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:04:05,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:05,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:04:09,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:04:12,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 01:04:12,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:04:15,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:04:15,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:04:16,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 01:04:18,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:04:22,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 01:04:22,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:04:27,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 01:04:31,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 01:04:32,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:04:32,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:04:32,716 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 01:04:32,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 01:04:37,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 01:04:40,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:04:43,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=547213.3333333334, ans=0.2 2023-09-30 01:04:44,706 INFO [train.py:1039] (3/4) Epoch 16, batch 2400, loss[loss=0.1848, simple_loss=0.253, pruned_loss=0.05829, over 23551.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2599, pruned_loss=0.05642, over 4708000.11 frames. ], batch size: 134, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:04:44,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:04:48,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:04:51,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:04:52,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=547213.3333333334, ans=12.0 2023-09-30 01:04:53,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 01:04:53,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 01:04:56,004 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.435e+02 1.899e+02 2.114e+02 2.474e+02 3.602e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 01:04:59,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=547213.3333333334, ans=0.125 2023-09-30 01:05:00,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:05:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:03,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 01:05:03,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:05:05,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:05,911 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.63 vs. limit=12.0 2023-09-30 01:05:07,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 01:05:09,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=547280.0, ans=10.0 2023-09-30 01:05:10,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:12,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 01:05:18,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:05:21,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 01:05:23,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:05:25,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:05:28,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:05:31,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 01:05:31,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:05:41,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:41,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=547413.3333333334, ans=0.125 2023-09-30 01:05:42,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:05:46,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=547413.3333333334, ans=0.125 2023-09-30 01:05:46,257 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.55 vs. limit=15.0 2023-09-30 01:05:47,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:05:49,842 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.71 vs. limit=22.5 2023-09-30 01:05:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:05:50,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:05:50,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:05:50,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:05:50,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:05:50,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:05:57,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:05:58,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:05:58,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 01:06:00,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 01:06:01,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:06:01,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:06:01,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 01:06:03,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 01:06:03,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 01:06:03,293 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 01:06:04,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 01:06:06,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:06:08,338 INFO [train.py:1039] (3/4) Epoch 16, batch 2450, loss[loss=0.1832, simple_loss=0.2599, pruned_loss=0.05324, over 24336.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2582, pruned_loss=0.05561, over 4709866.19 frames. ], batch size: 61, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:06:08,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:08,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:08,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=547546.6666666666, ans=0.0 2023-09-30 01:06:10,606 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 01:06:10,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:12,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:06:15,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:06:15,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:06:18,509 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.21 vs. limit=15.0 2023-09-30 01:06:19,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:19,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:20,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 01:06:22,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=547546.6666666666, ans=0.1 2023-09-30 01:06:26,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:06:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:30,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:06:30,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:06:30,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:06:31,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 01:06:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:36,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:06:38,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:06:39,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=547613.3333333334, ans=10.0 2023-09-30 01:06:44,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:06:44,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,249 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.88 vs. limit=5.0 2023-09-30 01:06:45,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:06:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:06:48,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 01:06:48,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:06:57,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:06:57,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=547746.6666666666, ans=0.04949747468305833 2023-09-30 01:06:58,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:06:58,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:00,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:07:00,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:00,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:07:01,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 01:07:05,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:07:06,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:07:09,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:07:09,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:15,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=547813.3333333334, ans=0.05 2023-09-30 01:07:16,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:07:16,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 01:07:18,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:07:20,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:07:20,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 01:07:22,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:07:22,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:07:26,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:07:26,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=547813.3333333334, ans=0.025 2023-09-30 01:07:27,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:07:29,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:07:30,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 01:07:32,333 INFO [train.py:1039] (3/4) Epoch 16, batch 2500, loss[loss=0.1771, simple_loss=0.2423, pruned_loss=0.05601, over 23560.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2567, pruned_loss=0.05523, over 4704046.08 frames. ], batch size: 256, lr: 6.47e-03, grad_scale: 16.0 2023-09-30 01:07:32,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:07:37,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:44,700 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.913e+02 2.172e+02 2.502e+02 3.550e+02, threshold=4.344e+02, percent-clipped=0.0 2023-09-30 01:07:47,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:07:47,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:07:49,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:07:49,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 01:07:57,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:07:59,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:01,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:08:01,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:08:01,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 01:08:04,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:04,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:06,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 01:08:06,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:07,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 01:08:07,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:11,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=548013.3333333334, ans=0.0 2023-09-30 01:08:12,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:08:12,988 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.25 vs. limit=22.5 2023-09-30 01:08:14,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:08:17,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:08:17,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 01:08:18,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:08:21,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:24,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:29,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:08:31,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:37,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:08:37,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_positive, batch_count=548146.6666666666, ans=0.05 2023-09-30 01:08:39,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=548146.6666666666, ans=0.125 2023-09-30 01:08:40,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 01:08:40,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:08:40,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:08:41,542 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.75 vs. limit=22.5 2023-09-30 01:08:42,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:08:42,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:08:43,686 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 01:08:43,687 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 01:08:43,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 01:08:48,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:08:50,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 01:08:50,284 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 01:08:51,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:08:51,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 01:08:54,992 INFO [train.py:1039] (3/4) Epoch 16, batch 2550, loss[loss=0.1702, simple_loss=0.2411, pruned_loss=0.0496, over 24410.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2577, pruned_loss=0.05543, over 4714559.86 frames. ], batch size: 58, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:08:55,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 01:08:58,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:00,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:09:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:09:03,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:09:05,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 01:09:06,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:09:09,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 01:09:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:09:12,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:15,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:09:15,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 01:09:17,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:17,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:18,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:20,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:09:20,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 01:09:21,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:09:21,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:21,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 01:09:33,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:09:39,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=548346.6666666666, ans=0.2 2023-09-30 01:09:41,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:41,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:09:41,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:09:43,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:09:48,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=548413.3333333334, ans=0.0 2023-09-30 01:09:51,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:09:54,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:09:54,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:09:54,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:09:54,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:09:54,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:09:58,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:09:58,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:03,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:10:03,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 01:10:04,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:10:05,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:10:06,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:10:06,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:10:07,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=548480.0, ans=0.125 2023-09-30 01:10:10,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:16,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:10:18,263 INFO [train.py:1039] (3/4) Epoch 16, batch 2600, loss[loss=0.1889, simple_loss=0.2597, pruned_loss=0.05908, over 23254.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2585, pruned_loss=0.0557, over 4713098.61 frames. ], batch size: 105, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:10:19,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:20,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=548546.6666666666, ans=0.1 2023-09-30 01:10:24,931 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 01:10:26,542 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 01:10:26,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:10:26,629 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 01:10:28,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 01:10:28,162 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 01:10:28,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=548546.6666666666, ans=0.0 2023-09-30 01:10:31,178 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.861e+02 2.102e+02 2.275e+02 3.590e+02, threshold=4.204e+02, percent-clipped=0.0 2023-09-30 01:10:31,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:10:31,404 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 01:10:32,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 01:10:34,367 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 01:10:37,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:10:40,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 01:10:41,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 01:10:43,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=548613.3333333334, ans=0.2 2023-09-30 01:10:44,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:10:44,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 01:10:48,111 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 01:10:48,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 01:10:54,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:10:54,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:10:54,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:10:54,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 01:10:58,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:11:04,501 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 01:11:09,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:09,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:10,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 01:11:10,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:10,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:11:12,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 01:11:16,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:11:16,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:11:19,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,832 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 01:11:22,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:11:22,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:11:27,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:11:29,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:11:29,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 01:11:31,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:11:32,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:11:33,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=548813.3333333334, ans=0.125 2023-09-30 01:11:33,142 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:11:34,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:11:40,239 INFO [train.py:1039] (3/4) Epoch 16, batch 2650, loss[loss=0.1903, simple_loss=0.2571, pruned_loss=0.0617, over 23733.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2591, pruned_loss=0.05579, over 4725796.18 frames. ], batch size: 212, lr: 6.46e-03, grad_scale: 4.0 2023-09-30 01:11:40,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 01:11:40,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:42,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=548880.0, ans=0.125 2023-09-30 01:11:43,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:11:48,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 01:11:48,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:49,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:11:49,096 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 01:11:51,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:11:54,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:11:55,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:11:57,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:12:00,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:12:02,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 01:12:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:12:02,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:12:05,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 01:12:07,452 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 01:12:10,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:11,505 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.49 vs. limit=15.0 2023-09-30 01:12:12,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 01:12:13,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:13,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 01:12:17,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:17,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:12:17,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:18,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:23,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 01:12:25,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 01:12:28,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:12:30,015 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=549080.0, ans=0.05 2023-09-30 01:12:32,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 01:12:32,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:12:34,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:34,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:12:34,908 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.82 vs. limit=15.0 2023-09-30 01:12:35,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:35,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:37,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:12:39,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:40,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:12:40,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:12:42,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:12:44,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:44,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:12:46,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:12:49,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:12:52,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:12:52,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:12:52,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:12:53,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 01:12:54,961 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.18 vs. limit=15.0 2023-09-30 01:12:57,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:12:59,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:03,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:04,932 INFO [train.py:1039] (3/4) Epoch 16, batch 2700, loss[loss=0.1884, simple_loss=0.2714, pruned_loss=0.05265, over 24697.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2601, pruned_loss=0.05604, over 4717444.05 frames. ], batch size: 73, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:13:06,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:13:06,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:08,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:08,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 01:13:11,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:13:11,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:13:16,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:13:16,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:16,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:18,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:13:18,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:13:18,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:13:18,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:13:18,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 01:13:19,478 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.941e+02 2.174e+02 2.573e+02 4.504e+02, threshold=4.348e+02, percent-clipped=1.0 2023-09-30 01:13:19,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:13:22,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:13:22,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:13:22,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:13:25,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:13:27,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 01:13:28,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:13:29,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=549280.0, ans=0.0 2023-09-30 01:13:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:13:33,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:13:40,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:13:40,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:13:40,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:13:41,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:13:43,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:13:48,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:13:48,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:13:48,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:13:51,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:13:51,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:13:59,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:14:01,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:02,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:14:02,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:03,751 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.86 vs. limit=10.0 2023-09-30 01:14:08,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:08,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=549480.0, ans=0.0 2023-09-30 01:14:09,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:11,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:14:12,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:14,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:14:14,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:15,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.95 vs. limit=15.0 2023-09-30 01:14:16,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:14:17,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=549480.0, ans=0.04949747468305833 2023-09-30 01:14:18,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:18,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:14:22,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 01:14:22,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:25,728 INFO [train.py:1039] (3/4) Epoch 16, batch 2750, loss[loss=0.2057, simple_loss=0.256, pruned_loss=0.07774, over 19429.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2595, pruned_loss=0.05613, over 4709629.80 frames. ], batch size: 388, lr: 6.46e-03, grad_scale: 8.0 2023-09-30 01:14:25,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:14:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 01:14:28,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 01:14:29,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:33,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:35,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:35,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:14:36,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:14:40,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:14:40,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:14:40,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:40,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 01:14:42,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:14:42,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:14:47,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=549613.3333333334, ans=0.07 2023-09-30 01:14:48,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 01:14:50,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:14:50,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:14:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:14:52,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:14:52,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:14:53,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:14:53,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:53,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:14:58,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:15:00,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:15:00,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:15:01,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:01,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:15:08,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:15:10,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:15:10,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:17,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:15:17,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:15:18,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:15:23,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:15:25,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:15:25,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 01:15:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:32,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 01:15:37,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:15:39,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:15:40,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 01:15:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:15:42,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:15:42,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 01:15:44,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:15:47,547 INFO [train.py:1039] (3/4) Epoch 16, batch 2800, loss[loss=0.1989, simple_loss=0.2519, pruned_loss=0.07295, over 22718.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2583, pruned_loss=0.05564, over 4701508.16 frames. ], batch size: 322, lr: 6.46e-03, grad_scale: 16.0 2023-09-30 01:15:47,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:15:47,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:15:47,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:15:49,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 01:15:49,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:51,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:15:52,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:15:53,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.59 vs. limit=15.0 2023-09-30 01:15:54,419 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 01:15:54,420 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 01:15:57,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:00,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:16:00,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:16:01,254 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.29 vs. limit=10.0 2023-09-30 01:16:01,933 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.870e+02 2.056e+02 2.355e+02 4.086e+02, threshold=4.112e+02, percent-clipped=0.0 2023-09-30 01:16:02,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:16:04,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 01:16:07,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:16:08,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 01:16:10,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:10,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:16:10,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:13,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:13,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=549946.6666666666, ans=0.125 2023-09-30 01:16:15,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:15,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:16:16,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:26,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:16:28,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:16:29,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:31,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:16:31,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:16:36,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:36,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 01:16:36,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:38,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:16:38,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:16:45,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:16:45,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:48,219 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=12.0 2023-09-30 01:16:48,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:16:49,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=550080.0, ans=0.5 2023-09-30 01:16:50,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:16:50,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:16:50,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:16:51,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:16:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:16:54,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:16:54,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 01:16:54,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:56,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:16:56,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:16:58,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 01:16:59,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:01,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:17:01,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:17:04,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 01:17:10,296 INFO [train.py:1039] (3/4) Epoch 16, batch 2850, loss[loss=0.1808, simple_loss=0.2528, pruned_loss=0.0544, over 23735.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2576, pruned_loss=0.05549, over 4698669.02 frames. ], batch size: 164, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:17:10,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:17:10,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:17:10,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:17:12,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:15,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:15,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:17:17,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:17:19,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:20,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:17:22,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:17:23,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 01:17:29,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 01:17:30,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=550280.0, ans=0.1 2023-09-30 01:17:31,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:17:31,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 01:17:33,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:34,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=550280.0, ans=15.0 2023-09-30 01:17:36,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 01:17:37,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 01:17:39,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:17:49,633 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=550346.6666666666, ans=0.0 2023-09-30 01:17:50,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:17:52,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:17:53,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:17:55,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:17:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:17:55,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:17:56,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:17:57,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=550346.6666666666, ans=0.125 2023-09-30 01:17:58,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 01:17:59,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:17:59,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:01,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:05,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:07,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:09,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:18:10,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:18:12,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:14,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:15,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:18:18,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:18:21,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 01:18:22,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 01:18:24,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:18:24,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 01:18:25,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:18:25,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:18:25,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:27,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:18:27,302 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 01:18:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 01:18:27,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:27,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:33,166 INFO [train.py:1039] (3/4) Epoch 16, batch 2900, loss[loss=0.2009, simple_loss=0.2552, pruned_loss=0.07331, over 19190.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.258, pruned_loss=0.05547, over 4702579.81 frames. ], batch size: 388, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:18:33,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:18:34,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:18:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:18:37,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 01:18:38,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=550546.6666666666, ans=0.05 2023-09-30 01:18:41,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:41,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 01:18:43,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 01:18:45,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:18:46,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:18:46,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:18:48,254 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.629e+02 1.959e+02 2.363e+02 2.831e+02 4.091e+02, threshold=4.726e+02, percent-clipped=0.0 2023-09-30 01:18:48,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:18:50,324 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:18:51,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:18:52,311 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.08 vs. limit=15.0 2023-09-30 01:18:52,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:18:56,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:18:56,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 01:18:56,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:18:57,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=550613.3333333334, ans=0.0 2023-09-30 01:18:58,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:18:58,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=550613.3333333334, ans=0.125 2023-09-30 01:19:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 01:19:03,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 01:19:04,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:19:04,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 01:19:04,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:19:07,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:19:07,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 01:19:10,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:19:11,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:14,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:19:19,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:21,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 01:19:21,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 01:19:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:19:27,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:19:28,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 01:19:29,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:19:35,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:19:45,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:19:45,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:19:47,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 01:19:49,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=550813.3333333334, ans=0.0 2023-09-30 01:19:49,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=550813.3333333334, ans=0.2 2023-09-30 01:19:50,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:19:52,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 01:19:52,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:19:52,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:19:55,411 INFO [train.py:1039] (3/4) Epoch 16, batch 2950, loss[loss=0.1844, simple_loss=0.2603, pruned_loss=0.05425, over 24650.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2591, pruned_loss=0.05554, over 4707511.04 frames. ], batch size: 65, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:19:57,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=550880.0, ans=0.125 2023-09-30 01:19:59,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:20:00,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 01:20:02,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:02,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:04,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:06,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:20:07,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 01:20:07,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 01:20:07,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:20:07,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:20:12,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=550946.6666666666, ans=0.0 2023-09-30 01:20:14,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.32 vs. limit=22.5 2023-09-30 01:20:15,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:19,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:20,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:20:20,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:24,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:20:24,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:20:27,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:20:27,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:20:29,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 01:20:34,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 01:20:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 01:20:34,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=551013.3333333334, ans=0.125 2023-09-30 01:20:36,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:20:37,801 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 01:20:39,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 01:20:39,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:20:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:20:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 01:20:41,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:20:41,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=551013.3333333334, ans=0.1 2023-09-30 01:20:44,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 01:20:45,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:20:45,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:20:47,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:49,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:20:49,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:49,948 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 01:20:51,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:20:51,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 01:20:58,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:20:59,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:20:59,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 01:20:59,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:21:01,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 01:21:06,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:08,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:21:09,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:21:12,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:21:12,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:21:14,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:21:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:16,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:21:16,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:21:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:21:19,000 INFO [train.py:1039] (3/4) Epoch 16, batch 3000, loss[loss=0.1847, simple_loss=0.2728, pruned_loss=0.04834, over 24440.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2602, pruned_loss=0.05627, over 4689016.42 frames. ], batch size: 69, lr: 6.45e-03, grad_scale: 16.0 2023-09-30 01:21:19,000 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 01:21:31,795 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.3.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([1.5188, 2.8344, 2.3640, 2.5858, 2.4423, 2.8022, 2.0462, 2.7591], device='cuda:3') 2023-09-30 01:21:34,556 INFO [train.py:1071] (3/4) Epoch 16, validation: loss=0.3091, simple_loss=0.2818, pruned_loss=0.1682, over 1125622.00 frames. 2023-09-30 01:21:34,557 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 01:21:34,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:21:36,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:36,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 01:21:37,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:21:39,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:21:39,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:21:43,153 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 01:21:44,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 01:21:47,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:21:47,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:21:47,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 01:21:47,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:21:49,607 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.865e+02 2.031e+02 2.277e+02 3.298e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 01:21:55,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:22:05,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:22:13,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 01:22:14,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:22:16,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:22:16,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:22:16,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=551346.6666666666, ans=0.125 2023-09-30 01:22:18,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:22:20,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:20,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 01:22:23,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 01:22:25,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:22:25,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:22:28,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:22:30,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:31,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:31,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:22:35,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:22:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:22:35,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:22:38,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:22:40,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 01:22:40,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=551480.0, ans=0.2 2023-09-30 01:22:42,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:22:43,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:22:43,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:22:45,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:22:48,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 01:22:48,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 01:22:49,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:22:49,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 01:22:50,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:22:52,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 01:22:53,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:22:55,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:22:56,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 01:22:57,407 INFO [train.py:1039] (3/4) Epoch 16, batch 3050, loss[loss=0.1702, simple_loss=0.2497, pruned_loss=0.04538, over 24344.00 frames. ], tot_loss[loss=0.1874, simple_loss=0.2612, pruned_loss=0.0568, over 4689903.95 frames. ], batch size: 61, lr: 6.45e-03, grad_scale: 8.0 2023-09-30 01:22:57,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 01:22:57,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:22:59,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:23:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:23:01,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:23:01,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:01,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:23:04,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 01:23:05,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:08,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:10,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:23:15,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:19,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 01:23:23,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 01:23:23,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 01:23:23,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:28,852 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.70 vs. limit=12.0 2023-09-30 01:23:29,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:23:31,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:33,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:33,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:23:38,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:23:38,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:38,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:23:38,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:40,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:41,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:23:45,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:23:45,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 01:23:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:23:46,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:23:50,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:23:50,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:23:51,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:23:51,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:23:56,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:23:58,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:23:59,125 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.78 vs. limit=15.0 2023-09-30 01:24:03,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:05,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:05,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:05,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=551813.3333333334, ans=0.125 2023-09-30 01:24:06,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:08,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:24:08,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:24:08,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 01:24:10,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:24:10,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:10,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=551813.3333333334, ans=0.125 2023-09-30 01:24:12,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 01:24:12,824 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.52 vs. limit=15.0 2023-09-30 01:24:15,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:20,273 INFO [train.py:1039] (3/4) Epoch 16, batch 3100, loss[loss=0.1893, simple_loss=0.2658, pruned_loss=0.05638, over 23318.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2607, pruned_loss=0.05672, over 4698679.73 frames. ], batch size: 93, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:24:20,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:24:22,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:24:25,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:24:26,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 01:24:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 01:24:31,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 01:24:33,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:24:35,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:24:36,434 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.437e+02 1.867e+02 2.041e+02 2.309e+02 3.619e+02, threshold=4.081e+02, percent-clipped=0.0 2023-09-30 01:24:36,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:38,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:24:43,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:24:50,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 01:24:54,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:24:55,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:24:56,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:24:56,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:24:57,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:24:59,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:24:59,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 01:24:59,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:25:00,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:00,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=552013.3333333334, ans=0.1 2023-09-30 01:25:02,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 01:25:03,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:06,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:25:07,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 01:25:08,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 01:25:10,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:11,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:25:13,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:13,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:13,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:25:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:25:15,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:25:18,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:25:18,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:25:18,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:18,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:25:23,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:25:23,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 01:25:25,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:25:27,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 01:25:27,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:27,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:27,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 01:25:39,607 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.44 vs. limit=6.0 2023-09-30 01:25:40,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 01:25:41,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:43,543 INFO [train.py:1039] (3/4) Epoch 16, batch 3150, loss[loss=0.1669, simple_loss=0.2388, pruned_loss=0.04749, over 20459.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2588, pruned_loss=0.056, over 4703427.36 frames. ], batch size: 44, lr: 6.44e-03, grad_scale: 8.0 2023-09-30 01:25:43,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:25:44,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=552213.3333333334, ans=0.2 2023-09-30 01:25:45,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:25:45,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:25:46,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 01:25:47,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:25:47,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:25:49,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 01:25:50,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:25:52,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 01:25:54,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.41 vs. limit=6.0 2023-09-30 01:25:56,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 01:25:56,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:25:57,002 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 01:25:59,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 01:25:59,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 01:26:00,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 01:26:00,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 01:26:00,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:00,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:02,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:26:02,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=552280.0, ans=0.5 2023-09-30 01:26:05,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 01:26:05,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:06,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.82 vs. limit=15.0 2023-09-30 01:26:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:26:07,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:09,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:26:13,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 01:26:15,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:26:16,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=552346.6666666666, ans=0.125 2023-09-30 01:26:18,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:26:19,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:26:19,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=552346.6666666666, ans=0.2 2023-09-30 01:26:20,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 01:26:23,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 01:26:25,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:26:27,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:26:27,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:26:27,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:26:28,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:26:28,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:26:30,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 01:26:31,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:26:31,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:33,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:26:33,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:26:35,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 01:26:35,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:38,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 01:26:38,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:40,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 01:26:42,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 01:26:43,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:26:43,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:26:45,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 01:26:45,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 01:26:46,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:26:50,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:26:51,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:26:51,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:26:58,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:26:59,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:00,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 01:27:07,218 INFO [train.py:1039] (3/4) Epoch 16, batch 3200, loss[loss=0.1899, simple_loss=0.2579, pruned_loss=0.06095, over 23589.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2578, pruned_loss=0.05565, over 4709003.76 frames. ], batch size: 256, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:27:07,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:27:07,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 01:27:10,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:11,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:27:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 01:27:15,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:27:19,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:27:19,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=552546.6666666666, ans=0.0 2023-09-30 01:27:23,444 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.941e+02 2.200e+02 2.639e+02 4.791e+02, threshold=4.401e+02, percent-clipped=2.0 2023-09-30 01:27:23,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:27:27,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=552613.3333333334, ans=0.09899494936611666 2023-09-30 01:27:32,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:27:43,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 01:27:44,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:27:47,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 01:27:48,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:27:51,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:27:51,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:27:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:27:55,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=552680.0, ans=0.0 2023-09-30 01:27:56,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 01:27:58,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 01:28:01,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 01:28:03,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 01:28:05,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:28:10,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=552746.6666666666, ans=0.0 2023-09-30 01:28:11,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:11,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:28:11,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:14,139 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 01:28:14,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:28:14,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=552813.3333333334, ans=0.125 2023-09-30 01:28:18,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:20,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 01:28:20,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 01:28:20,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 01:28:23,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 01:28:25,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:28:27,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:28:27,198 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 01:28:27,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:28:27,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:30,168 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 01:28:31,918 INFO [train.py:1039] (3/4) Epoch 16, batch 3250, loss[loss=0.1902, simple_loss=0.2656, pruned_loss=0.05739, over 23363.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2579, pruned_loss=0.05575, over 4720719.10 frames. ], batch size: 93, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:28:32,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:28:35,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:28:45,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:28:45,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 01:28:48,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:28:49,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:28:49,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:28:50,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:28:50,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:28:53,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:53,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:28:55,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:28:55,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:28:55,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:28:55,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=552946.6666666666, ans=0.0 2023-09-30 01:28:58,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:00,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:29:00,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=552946.6666666666, ans=0.125 2023-09-30 01:29:02,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:03,155 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.71 vs. limit=6.0 2023-09-30 01:29:03,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:29:05,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:29:05,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:29:05,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:12,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 01:29:12,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:29:12,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:29:13,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:15,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:29:22,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:29:29,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.54 vs. limit=15.0 2023-09-30 01:29:30,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:30,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:30,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 01:29:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:29:30,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:29:31,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:29:34,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 01:29:35,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 01:29:35,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:29:37,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:29:38,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:40,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 01:29:40,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:29:44,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:29:44,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:29:45,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=553146.6666666666, ans=0.125 2023-09-30 01:29:46,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 01:29:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:29:49,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:29:49,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 01:29:53,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:29:53,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 01:29:54,965 INFO [train.py:1039] (3/4) Epoch 16, batch 3300, loss[loss=0.1985, simple_loss=0.261, pruned_loss=0.06796, over 22740.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2589, pruned_loss=0.05547, over 4728366.28 frames. ], batch size: 322, lr: 6.44e-03, grad_scale: 16.0 2023-09-30 01:29:55,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 01:29:56,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 01:29:56,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:01,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:30:03,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:30:03,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:05,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 01:30:05,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:30:06,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:30:11,912 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.919e+02 2.093e+02 2.361e+02 4.091e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 01:30:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 01:30:13,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:13,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:16,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:16,563 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 01:30:18,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:30:19,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:30:19,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:30:19,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:30:19,712 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 01:30:25,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:25,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:30:25,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=553280.0, ans=0.125 2023-09-30 01:30:28,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:28,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 01:30:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 01:30:30,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:32,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:30:33,690 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 01:30:35,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 01:30:35,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:30:39,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 01:30:40,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:30:42,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=553346.6666666666, ans=0.1 2023-09-30 01:30:44,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:30:45,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:30:49,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:30:50,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:30:50,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:30:50,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:30:53,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:30:53,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:30:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:30:55,240 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 01:30:56,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 01:30:59,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:31:00,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:00,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:31:03,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:03,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:31:05,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:05,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:31:06,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:08,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:31:10,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 01:31:12,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:14,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:15,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:31:15,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:31:15,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:19,257 INFO [train.py:1039] (3/4) Epoch 16, batch 3350, loss[loss=0.2025, simple_loss=0.2676, pruned_loss=0.06871, over 23404.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2598, pruned_loss=0.0561, over 4717213.53 frames. ], batch size: 285, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:31:19,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:31:19,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:21,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:31:21,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=553546.6666666666, ans=0.125 2023-09-30 01:31:22,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:24,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:31:27,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:28,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:31:30,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:31,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:31:32,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=553546.6666666666, ans=0.0 2023-09-30 01:31:33,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 01:31:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 01:31:34,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:31:38,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 01:31:38,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 01:31:39,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:31:40,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:31:40,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:31:42,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 01:31:42,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:42,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:31:45,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:47,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:47,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:31:49,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:31:50,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:55,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:31:55,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:31:58,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:31:59,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:01,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:01,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:03,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:06,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 01:32:06,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:32:07,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 01:32:07,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:32:09,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 01:32:10,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:12,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:32:20,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:21,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 01:32:23,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:25,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:32:25,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:32:31,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:34,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 01:32:34,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:32:34,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:32:37,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:32:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 01:32:38,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:32:39,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 01:32:40,393 INFO [train.py:1039] (3/4) Epoch 16, batch 3400, loss[loss=0.1574, simple_loss=0.2419, pruned_loss=0.03641, over 24467.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2608, pruned_loss=0.05664, over 4719410.81 frames. ], batch size: 63, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:32:40,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:40,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:32:42,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:32:42,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:32:42,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 01:32:43,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.69 vs. limit=10.0 2023-09-30 01:32:46,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 01:32:46,926 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 01:32:46,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:32:52,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:32:53,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:32:54,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:32:56,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:32:58,066 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.849e+02 2.111e+02 2.348e+02 3.492e+02, threshold=4.221e+02, percent-clipped=0.0 2023-09-30 01:33:01,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:01,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=553946.6666666666, ans=0.125 2023-09-30 01:33:04,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 01:33:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:33:11,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:11,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:13,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 01:33:13,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=554013.3333333334, ans=0.2 2023-09-30 01:33:19,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:33:25,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 01:33:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:32,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:33:33,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 01:33:33,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:33:35,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:33:36,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:33:36,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:33:40,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:33:43,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:33:43,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:33:43,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=554080.0, ans=0.035 2023-09-30 01:33:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:33:52,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 01:33:54,550 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=554146.6666666666, ans=0.0 2023-09-30 01:33:56,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:34:01,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 01:34:03,388 INFO [train.py:1039] (3/4) Epoch 16, batch 3450, loss[loss=0.211, simple_loss=0.2646, pruned_loss=0.07873, over 19897.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2602, pruned_loss=0.05617, over 4724523.55 frames. ], batch size: 389, lr: 6.43e-03, grad_scale: 16.0 2023-09-30 01:34:06,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 01:34:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:09,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:34:09,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 01:34:10,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:34:15,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:34:15,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=554213.3333333334, ans=0.125 2023-09-30 01:34:16,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=554213.3333333334, ans=0.125 2023-09-30 01:34:21,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:34:21,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:34:21,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:24,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:28,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=554280.0, ans=0.2 2023-09-30 01:34:29,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 01:34:36,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 01:34:36,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 01:34:36,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:34:38,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:34:44,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 01:34:44,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:34:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:34:49,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:34:50,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:34:52,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:34:53,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 01:34:53,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:34:55,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:34:59,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 01:35:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:35:11,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:35:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:17,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:22,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:35:22,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:35:22,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:35:25,298 INFO [train.py:1039] (3/4) Epoch 16, batch 3500, loss[loss=0.1856, simple_loss=0.2717, pruned_loss=0.04975, over 24552.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2586, pruned_loss=0.05564, over 4724367.05 frames. ], batch size: 71, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:35:27,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:30,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:35:30,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 01:35:30,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=554546.6666666666, ans=0.125 2023-09-30 01:35:34,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 01:35:35,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=554546.6666666666, ans=0.125 2023-09-30 01:35:36,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:35:39,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:35:39,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 01:35:39,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=554613.3333333334, ans=0.1 2023-09-30 01:35:43,267 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 1.893e+02 2.078e+02 2.352e+02 3.454e+02, threshold=4.155e+02, percent-clipped=0.0 2023-09-30 01:35:45,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:35:47,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:35:47,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:35:47,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:35:49,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:35:49,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:51,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:35:51,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 01:35:53,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:35:53,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:35:56,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:00,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:00,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 01:36:00,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:36:04,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:36:05,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:36:05,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:06,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=554680.0, ans=0.07 2023-09-30 01:36:08,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:36:08,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:11,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 01:36:13,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 01:36:13,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 01:36:13,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:36:16,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:17,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:36:22,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:36:22,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:36:26,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:36:26,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=554746.6666666666, ans=0.05 2023-09-30 01:36:28,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 01:36:28,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 01:36:28,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:36:31,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:32,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:33,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.16 vs. limit=15.0 2023-09-30 01:36:34,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:37,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 01:36:37,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:36:40,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:36:41,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 01:36:43,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 01:36:46,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:36:48,095 INFO [train.py:1039] (3/4) Epoch 16, batch 3550, loss[loss=0.1897, simple_loss=0.2567, pruned_loss=0.06137, over 23521.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2577, pruned_loss=0.05525, over 4721528.54 frames. ], batch size: 256, lr: 6.43e-03, grad_scale: 8.0 2023-09-30 01:36:48,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:36:48,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:36:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:36:52,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:36:59,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:00,112 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=554880.0, ans=0.125 2023-09-30 01:37:01,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 01:37:01,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:03,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:37:03,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:04,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:37:04,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:37:09,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:09,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:37:10,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:10,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:37:12,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:37:18,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:37:18,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:37:20,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:20,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:37:20,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=555013.3333333334, ans=0.0 2023-09-30 01:37:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:37:21,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 01:37:21,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:24,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 01:37:31,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:37:33,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:37:33,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=555013.3333333334, ans=0.125 2023-09-30 01:37:34,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.67 vs. limit=10.0 2023-09-30 01:37:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 01:37:36,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:37:38,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 01:37:39,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:37:41,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:37:41,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:37:43,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=555080.0, ans=0.0 2023-09-30 01:37:44,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 01:37:44,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:46,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.86 vs. limit=6.0 2023-09-30 01:37:50,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:37:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 01:37:52,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:37:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:37:58,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 01:38:07,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=555146.6666666666, ans=0.1 2023-09-30 01:38:08,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 01:38:09,414 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.46 vs. limit=22.5 2023-09-30 01:38:10,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:10,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:38:11,656 INFO [train.py:1039] (3/4) Epoch 16, batch 3600, loss[loss=0.1908, simple_loss=0.2613, pruned_loss=0.06019, over 23158.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2573, pruned_loss=0.0547, over 4727965.89 frames. ], batch size: 105, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:38:11,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:13,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:38:14,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:38:18,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:18,444 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=555213.3333333334, ans=0.0 2023-09-30 01:38:19,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:21,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:38:22,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:38:22,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:22,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 01:38:25,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:38:27,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:28,709 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.483e+02 2.004e+02 2.343e+02 2.780e+02 3.954e+02, threshold=4.687e+02, percent-clipped=0.0 2023-09-30 01:38:31,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:35,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:37,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:38:39,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:38:39,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 01:38:39,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:38:42,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:38:43,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:38:43,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:38:45,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=555346.6666666666, ans=0.125 2023-09-30 01:38:47,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:38:47,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:38:48,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 01:38:54,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:38:55,576 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=6.04 vs. limit=10.0 2023-09-30 01:38:56,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:38:56,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 01:39:01,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:39:06,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:09,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:15,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:39:15,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:39:15,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 01:39:17,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 01:39:17,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 01:39:20,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:39:20,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:39:22,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 01:39:23,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:23,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:39:23,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:25,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 01:39:26,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 01:39:29,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:39:30,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 01:39:32,461 INFO [train.py:1039] (3/4) Epoch 16, batch 3650, loss[loss=0.1884, simple_loss=0.2731, pruned_loss=0.05187, over 24610.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2582, pruned_loss=0.05476, over 4731306.89 frames. ], batch size: 68, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:39:36,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 01:39:38,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:39:45,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 01:39:46,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 01:39:49,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:39:49,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:39:49,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:39:55,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 01:39:55,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:39:57,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 01:39:57,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:39:57,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:39:57,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 01:39:59,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:40:00,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:00,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:00,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:40:03,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 01:40:06,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 01:40:06,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:40:08,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 01:40:10,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:10,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:40:15,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:40:17,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:17,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:40:19,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:40:20,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:40:20,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:40:25,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:27,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:27,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:40:28,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:40:28,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:40:30,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:36,382 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 01:40:39,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:40:41,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:40:41,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:40:42,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:42,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:40:44,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:44,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 01:40:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:49,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:40:53,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:40:53,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:40:53,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=555880.0, ans=0.0 2023-09-30 01:40:54,850 INFO [train.py:1039] (3/4) Epoch 16, batch 3700, loss[loss=0.1761, simple_loss=0.2533, pruned_loss=0.04947, over 21235.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2601, pruned_loss=0.05567, over 4700827.13 frames. ], batch size: 46, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:40:56,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:40:56,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 01:40:56,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:40:58,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:40:59,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 01:41:01,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 01:41:05,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:05,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:08,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:41:08,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:41:08,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:41:11,793 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.470e+02 1.928e+02 2.143e+02 2.451e+02 3.411e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 01:41:12,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:12,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=555946.6666666666, ans=0.1 2023-09-30 01:41:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 01:41:21,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:41:24,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:41:24,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:41:24,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 01:41:24,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:24,926 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=7.40 vs. limit=15.0 2023-09-30 01:41:30,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:31,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 01:41:32,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:34,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:41:34,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=556013.3333333334, ans=0.05 2023-09-30 01:41:36,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=8.18 vs. limit=15.0 2023-09-30 01:41:37,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:41:37,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:41:39,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:41:43,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:41:43,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 01:41:44,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:41:45,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 01:41:48,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:41:48,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:41:51,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:53,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 01:41:54,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:41:54,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:41:54,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:41:55,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:41:58,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:42:01,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 01:42:02,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 01:42:04,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:42:04,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:05,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:42:06,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=556146.6666666666, ans=0.125 2023-09-30 01:42:07,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:42:07,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=556146.6666666666, ans=0.125 2023-09-30 01:42:10,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:42:11,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:42:11,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:42:13,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 01:42:16,455 INFO [train.py:1039] (3/4) Epoch 16, batch 3750, loss[loss=0.1782, simple_loss=0.2495, pruned_loss=0.05339, over 20627.00 frames. ], tot_loss[loss=0.1873, simple_loss=0.2616, pruned_loss=0.05649, over 4705150.81 frames. ], batch size: 45, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:42:16,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:42:19,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:42:21,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 01:42:21,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:42:22,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:23,649 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.57 vs. limit=15.0 2023-09-30 01:42:24,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:42:25,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:42:30,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:34,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=556280.0, ans=0.125 2023-09-30 01:42:36,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 01:42:36,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:42:39,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:42:44,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:42:44,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 01:42:46,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:42:46,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=556280.0, ans=10.0 2023-09-30 01:42:46,425 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=556280.0, ans=0.1 2023-09-30 01:42:47,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:42:49,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:42:53,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 01:42:57,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 01:42:59,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:43:00,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:43:01,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:01,907 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=556346.6666666666, ans=0.125 2023-09-30 01:43:06,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:08,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 01:43:10,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=556413.3333333334, ans=0.125 2023-09-30 01:43:13,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 01:43:16,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:20,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:43:22,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:43:25,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:43:27,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=556480.0, ans=0.0 2023-09-30 01:43:28,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 01:43:30,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 01:43:30,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.85 vs. limit=15.0 2023-09-30 01:43:31,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:43:34,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:43:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 01:43:37,529 INFO [train.py:1039] (3/4) Epoch 16, batch 3800, loss[loss=0.1739, simple_loss=0.2598, pruned_loss=0.04399, over 24482.00 frames. ], tot_loss[loss=0.187, simple_loss=0.2615, pruned_loss=0.05625, over 4707735.95 frames. ], batch size: 66, lr: 6.42e-03, grad_scale: 16.0 2023-09-30 01:43:41,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=556546.6666666666, ans=0.125 2023-09-30 01:43:43,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:43:49,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:49,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 01:43:50,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 01:43:52,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=556546.6666666666, ans=0.0 2023-09-30 01:43:53,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:53,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:43:55,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:43:56,511 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.836e+02 2.013e+02 2.222e+02 3.108e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 01:43:56,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 01:43:56,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:43:58,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:43:58,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:43:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:43:59,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:01,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 01:44:01,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=556613.3333333334, ans=0.2 2023-09-30 01:44:04,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 01:44:06,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:44:07,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:10,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:44:12,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:44:12,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:44:12,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:17,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:17,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:44:23,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:44:23,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 01:44:25,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:30,157 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=556746.6666666666, ans=0.2 2023-09-30 01:44:32,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:44:39,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:44:40,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 01:44:43,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 01:44:45,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:44:45,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:44:45,591 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=556813.3333333334, ans=0.1 2023-09-30 01:44:46,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:48,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 01:44:53,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 01:44:53,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 01:44:54,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:44:55,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:44:59,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=556880.0, ans=0.04949747468305833 2023-09-30 01:45:01,127 INFO [train.py:1039] (3/4) Epoch 16, batch 3850, loss[loss=0.1572, simple_loss=0.2325, pruned_loss=0.04095, over 24405.00 frames. ], tot_loss[loss=0.1862, simple_loss=0.2605, pruned_loss=0.056, over 4698111.63 frames. ], batch size: 58, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:45:02,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:45:02,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:45:08,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:45:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 01:45:10,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:45:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:15,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 01:45:18,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:18,836 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.05 vs. limit=15.0 2023-09-30 01:45:19,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 01:45:21,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 01:45:28,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:30,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:45:34,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:35,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:45:38,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:40,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:45:40,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:45:40,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:45:41,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:45:43,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:43,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=557013.3333333334, ans=0.0 2023-09-30 01:45:44,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:45:44,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 01:45:45,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 01:45:46,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:45:46,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:49,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:45:49,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 01:45:52,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 01:45:54,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:45:56,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 01:45:59,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 01:46:04,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:07,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:46:10,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:10,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 01:46:13,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 01:46:15,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:16,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:19,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:46:19,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 01:46:19,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:21,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:46:21,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 01:46:22,640 INFO [train.py:1039] (3/4) Epoch 16, batch 3900, loss[loss=0.1766, simple_loss=0.2498, pruned_loss=0.05173, over 19420.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2582, pruned_loss=0.0555, over 4683438.38 frames. ], batch size: 42, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:46:22,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:46:24,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 01:46:24,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=557213.3333333334, ans=0.125 2023-09-30 01:46:25,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:25,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:27,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:46:27,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:27,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:46:28,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:46:28,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:46:29,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:30,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 01:46:30,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:34,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:36,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:36,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=557213.3333333334, ans=0.1 2023-09-30 01:46:38,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:46:38,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:46:42,006 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.908e+02 2.170e+02 2.558e+02 5.090e+02, threshold=4.341e+02, percent-clipped=1.0 2023-09-30 01:46:42,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:46:42,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:46:45,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 01:46:45,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:46:46,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 01:46:46,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:46:48,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 01:46:48,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 01:46:53,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:46:53,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:46:53,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:46:55,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:46:58,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:47:01,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:47:03,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:47:03,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:05,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:47:12,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:12,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:47:19,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 01:47:21,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:47:27,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=557413.3333333334, ans=0.125 2023-09-30 01:47:33,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:47:34,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:36,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 01:47:36,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 01:47:37,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 01:47:40,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 01:47:40,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=557480.0, ans=0.0 2023-09-30 01:47:42,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:47:43,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 01:47:46,588 INFO [train.py:1039] (3/4) Epoch 16, batch 3950, loss[loss=0.1768, simple_loss=0.2411, pruned_loss=0.05623, over 23215.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2579, pruned_loss=0.05525, over 4684696.99 frames. ], batch size: 119, lr: 6.41e-03, grad_scale: 16.0 2023-09-30 01:47:46,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=557546.6666666666, ans=0.0 2023-09-30 01:47:52,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:47:53,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 01:47:53,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:47:58,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:48:00,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:48:05,170 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 01:48:06,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:06,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 01:48:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 01:48:08,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:09,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:09,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:48:09,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:48:11,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 01:48:15,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:48:16,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:48:16,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:48:17,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:48:18,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 01:48:25,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=557680.0, ans=0.09899494936611666 2023-09-30 01:48:31,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:48:31,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:48:32,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=557680.0, ans=0.1 2023-09-30 01:48:36,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 01:48:38,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=557746.6666666666, ans=0.2 2023-09-30 01:48:41,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=557746.6666666666, ans=0.0 2023-09-30 01:48:43,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 01:48:43,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 01:48:43,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:48:46,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:48:50,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=557746.6666666666, ans=0.0 2023-09-30 01:48:50,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=557746.6666666666, ans=0.0 2023-09-30 01:48:53,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:48:53,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 01:48:53,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:48:53,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:48:55,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 01:48:58,757 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:49:00,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:49:00,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:49:02,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=557813.3333333334, ans=0.125 2023-09-30 01:49:03,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=557813.3333333334, ans=0.2 2023-09-30 01:49:06,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.86 vs. limit=10.0 2023-09-30 01:49:07,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 01:49:10,298 INFO [train.py:1039] (3/4) Epoch 16, batch 4000, loss[loss=0.1937, simple_loss=0.2822, pruned_loss=0.05258, over 24466.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2583, pruned_loss=0.0555, over 4694127.65 frames. ], batch size: 69, lr: 6.41e-03, grad_scale: 32.0 2023-09-30 01:49:15,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:19,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.91 vs. limit=6.0 2023-09-30 01:49:21,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:27,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:28,524 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.918e+02 2.123e+02 2.513e+02 3.159e+02, threshold=4.246e+02, percent-clipped=0.0 2023-09-30 01:49:28,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:49:28,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:49:28,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 01:49:30,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:49:31,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 01:49:31,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:49:31,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 01:49:33,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:49:37,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:49:37,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:49:37,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:49:37,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:37,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 01:49:40,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:49:40,242 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 01:49:40,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:49:42,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:49:43,880 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 01:49:45,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:49:45,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:49:49,437 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=4.89 vs. limit=10.0 2023-09-30 01:49:51,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 01:49:53,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:49:54,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=558013.3333333334, ans=0.1 2023-09-30 01:49:55,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:49:57,465 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 01:49:57,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=558013.3333333334, ans=0.125 2023-09-30 01:49:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:49:59,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 01:49:59,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:00,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:02,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:50:03,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:50:03,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:50:03,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:50:06,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 01:50:07,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:50:10,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 01:50:15,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 01:50:18,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 01:50:20,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:50:21,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:22,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:50:23,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:50:31,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:50:31,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 01:50:33,779 INFO [train.py:1039] (3/4) Epoch 16, batch 4050, loss[loss=0.2421, simple_loss=0.2953, pruned_loss=0.0945, over 19312.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.259, pruned_loss=0.05606, over 4688876.32 frames. ], batch size: 388, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:50:35,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:50:35,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:50:36,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 01:50:37,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:38,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:43,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:50:46,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:50:48,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 01:50:49,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:50:51,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:50:53,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:50:55,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:50:58,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=558280.0, ans=0.0 2023-09-30 01:50:59,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 01:51:03,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 01:51:03,917 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 01:51:06,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:51:11,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=558346.6666666666, ans=0.125 2023-09-30 01:51:14,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 01:51:15,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:19,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:20,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=558413.3333333334, ans=0.125 2023-09-30 01:51:22,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:51:22,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=558413.3333333334, ans=0.0 2023-09-30 01:51:23,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:51:23,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:51:25,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:51:30,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 01:51:30,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:51:32,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:34,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 01:51:37,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=558480.0, ans=0.125 2023-09-30 01:51:40,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:51:47,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 01:51:48,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:51:48,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:51:49,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=558480.0, ans=0.2 2023-09-30 01:51:50,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 01:51:50,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 01:51:50,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:51:54,816 INFO [train.py:1039] (3/4) Epoch 16, batch 4100, loss[loss=0.2135, simple_loss=0.2797, pruned_loss=0.07366, over 23376.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2599, pruned_loss=0.05632, over 4703523.88 frames. ], batch size: 285, lr: 6.41e-03, grad_scale: 8.0 2023-09-30 01:51:54,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:51:55,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:51:55,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:51:57,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=558546.6666666666, ans=0.0 2023-09-30 01:52:03,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 01:52:06,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 01:52:07,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 01:52:09,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 01:52:09,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:10,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=558613.3333333334, ans=0.125 2023-09-30 01:52:11,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:11,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:52:13,060 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 01:52:14,559 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.72 vs. limit=22.5 2023-09-30 01:52:15,807 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=558613.3333333334, ans=0.0 2023-09-30 01:52:16,682 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.940e+02 2.145e+02 2.448e+02 4.292e+02, threshold=4.289e+02, percent-clipped=1.0 2023-09-30 01:52:16,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:18,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:52:18,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:52:20,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:52:23,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 01:52:23,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:52:23,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:52:23,995 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:52:25,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 01:52:25,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:25,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:52:25,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:25,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:52:27,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 01:52:30,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 01:52:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:52:36,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:52:36,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 01:52:39,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:52:39,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:52:39,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 01:52:41,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 01:52:43,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 01:52:43,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 01:52:46,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 01:52:46,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:52:48,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:52:50,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:52:56,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:52:59,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:00,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:53:08,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=558813.3333333334, ans=0.125 2023-09-30 01:53:10,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:10,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:53:13,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 01:53:16,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:53:16,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=558880.0, ans=0.0 2023-09-30 01:53:18,245 INFO [train.py:1039] (3/4) Epoch 16, batch 4150, loss[loss=0.1791, simple_loss=0.2564, pruned_loss=0.05085, over 21132.00 frames. ], tot_loss[loss=0.186, simple_loss=0.2603, pruned_loss=0.05583, over 4702442.95 frames. ], batch size: 46, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:53:20,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:53:21,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:53:21,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:53:21,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:25,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 01:53:25,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:26,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 01:53:27,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 01:53:27,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 01:53:29,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:53:33,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:53:33,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:37,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:53:39,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:53:39,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 01:53:39,749 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=558946.6666666666, ans=0.0 2023-09-30 01:53:42,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 01:53:42,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 01:53:42,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 01:53:42,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=558946.6666666666, ans=0.07 2023-09-30 01:53:49,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:53:51,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=559013.3333333334, ans=0.125 2023-09-30 01:53:52,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:53:54,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 01:53:55,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 01:53:55,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 01:53:57,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 01:53:57,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:53:58,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:02,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:08,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 01:54:10,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:11,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:13,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 01:54:13,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:54:13,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=559080.0, ans=0.0 2023-09-30 01:54:15,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 01:54:19,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:54:20,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:54:22,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:22,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 01:54:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:22,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 01:54:23,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 01:54:26,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 01:54:26,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:26,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 01:54:26,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 01:54:28,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 01:54:28,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:54:28,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 01:54:30,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:54:32,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:54:32,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 01:54:32,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 01:54:32,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=559146.6666666666, ans=0.1 2023-09-30 01:54:38,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:54:40,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 01:54:41,597 INFO [train.py:1039] (3/4) Epoch 16, batch 4200, loss[loss=0.1587, simple_loss=0.236, pruned_loss=0.04071, over 24696.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.258, pruned_loss=0.05532, over 4701361.06 frames. ], batch size: 65, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:54:41,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:54:43,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:54:44,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 01:54:46,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:46,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:54:49,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 01:54:53,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 01:54:53,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:54:54,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:54:55,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=559213.3333333334, ans=0.125 2023-09-30 01:54:58,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:55:03,002 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.901e+02 2.194e+02 2.609e+02 4.040e+02, threshold=4.389e+02, percent-clipped=0.0 2023-09-30 01:55:03,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 01:55:04,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:05,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:05,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 01:55:05,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 01:55:06,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:08,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:55:08,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 01:55:09,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 01:55:11,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 01:55:11,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:55:11,828 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=559280.0, ans=0.125 2023-09-30 01:55:16,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 01:55:16,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:55:19,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:55:19,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:55:22,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:55:22,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 01:55:24,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:24,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:55:24,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=559346.6666666666, ans=0.125 2023-09-30 01:55:31,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 01:55:32,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:55:39,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:55:41,638 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=559413.3333333334, ans=0.125 2023-09-30 01:55:42,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 01:55:45,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:55:50,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 01:55:51,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.76 vs. limit=15.0 2023-09-30 01:55:52,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:55:53,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 01:55:56,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 01:56:01,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 01:56:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 01:56:03,852 INFO [train.py:1039] (3/4) Epoch 16, batch 4250, loss[loss=0.2125, simple_loss=0.2862, pruned_loss=0.0694, over 24376.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2568, pruned_loss=0.05499, over 4700351.14 frames. ], batch size: 77, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:56:04,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:09,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 01:56:09,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 01:56:09,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:56:13,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:19,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=15.0 2023-09-30 01:56:21,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:21,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:24,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 01:56:24,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:56:25,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=559613.3333333334, ans=0.1 2023-09-30 01:56:26,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:27,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:30,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:30,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=559613.3333333334, ans=0.1 2023-09-30 01:56:31,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:56:33,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:35,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 01:56:36,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=559680.0, ans=0.125 2023-09-30 01:56:40,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 01:56:40,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:40,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=559680.0, ans=0.125 2023-09-30 01:56:42,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:56:42,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:56:43,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 01:56:43,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:56:43,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:56:48,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 01:56:49,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 01:56:53,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:56:54,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:56:56,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 01:56:56,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 01:56:56,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 01:56:56,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=559746.6666666666, ans=0.0 2023-09-30 01:56:56,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=559746.6666666666, ans=0.125 2023-09-30 01:56:57,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 01:57:00,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:57:02,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:02,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:57:04,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 01:57:06,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 01:57:06,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 01:57:11,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:57:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:16,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 01:57:19,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:57:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:21,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:57:21,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:57:21,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 01:57:23,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=559813.3333333334, ans=0.1 2023-09-30 01:57:24,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:26,118 INFO [train.py:1039] (3/4) Epoch 16, batch 4300, loss[loss=0.2161, simple_loss=0.2655, pruned_loss=0.08333, over 19356.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2566, pruned_loss=0.05505, over 4699783.19 frames. ], batch size: 388, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:57:29,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:57:30,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:57:33,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:57:34,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=559880.0, ans=0.1 2023-09-30 01:57:40,728 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.38 vs. limit=15.0 2023-09-30 01:57:42,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:57:43,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 01:57:43,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 01:57:45,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:57:46,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 01:57:47,243 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.556e+02 1.838e+02 2.077e+02 2.423e+02 4.089e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 01:57:47,374 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 01:57:49,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 01:57:51,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:57:59,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 01:57:59,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 01:57:59,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 01:58:02,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 01:58:03,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 01:58:07,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 01:58:07,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 01:58:07,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 01:58:09,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:09,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 01:58:09,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 01:58:10,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 01:58:12,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 01:58:16,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 01:58:17,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:17,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 01:58:17,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 01:58:17,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 01:58:18,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 01:58:20,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:20,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 01:58:20,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 01:58:25,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:27,438 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 01:58:27,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 01:58:29,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:29,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:58:32,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 01:58:33,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 01:58:33,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:58:35,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:36,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 01:58:37,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=560146.6666666666, ans=0.1 2023-09-30 01:58:38,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:58:41,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:58:41,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:58:43,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 01:58:47,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 01:58:49,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 01:58:49,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=560213.3333333334, ans=0.125 2023-09-30 01:58:50,638 INFO [train.py:1039] (3/4) Epoch 16, batch 4350, loss[loss=0.1993, simple_loss=0.2742, pruned_loss=0.06225, over 23722.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2582, pruned_loss=0.05567, over 4709339.21 frames. ], batch size: 85, lr: 6.40e-03, grad_scale: 8.0 2023-09-30 01:58:52,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=560213.3333333334, ans=0.125 2023-09-30 01:58:54,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:58:56,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:00,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 01:59:00,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 01:59:06,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=12.71 vs. limit=15.0 2023-09-30 01:59:06,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 01:59:07,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=560280.0, ans=0.1 2023-09-30 01:59:07,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=560280.0, ans=0.0 2023-09-30 01:59:10,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 01:59:11,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=560280.0, ans=0.125 2023-09-30 01:59:13,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 01:59:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:16,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 01:59:20,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 01:59:22,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 01:59:27,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 01:59:27,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:28,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:36,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:38,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=560346.6666666666, ans=0.125 2023-09-30 01:59:39,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 01:59:41,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:43,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 01:59:48,172 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 01:59:51,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:51,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 01:59:51,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=560413.3333333334, ans=0.0 2023-09-30 01:59:52,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 01:59:52,997 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 01:59:53,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:53,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 01:59:54,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 01:59:54,758 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 01:59:55,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 01:59:56,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 01:59:56,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 01:59:59,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 01:59:59,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 01:59:59,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 01:59:59,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:00,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 02:00:02,425 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 02:00:02,432 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 02:00:02,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 02:00:07,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:00:07,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:00:07,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:08,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=560480.0, ans=0.125 2023-09-30 02:00:09,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:00:11,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 02:00:13,438 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 02:00:14,780 INFO [train.py:1039] (3/4) Epoch 16, batch 4400, loss[loss=0.1771, simple_loss=0.2585, pruned_loss=0.0478, over 24299.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2592, pruned_loss=0.05568, over 4717807.11 frames. ], batch size: 61, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:00:14,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:19,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:22,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:00:24,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 02:00:24,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 02:00:24,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 02:00:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 02:00:26,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:00:26,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:00:26,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=560546.6666666666, ans=0.125 2023-09-30 02:00:27,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 02:00:29,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:30,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:30,914 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 02:00:35,163 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.842e+02 2.058e+02 2.254e+02 3.655e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:00:35,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:35,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 02:00:36,810 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 02:00:38,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 02:00:40,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 02:00:40,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 02:00:40,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:41,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:41,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:00:43,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:00:45,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 02:00:45,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 02:00:47,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:49,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:00:49,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:00:51,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:00:53,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:00:53,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 02:00:53,258 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 02:00:57,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:04,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:01:04,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=560746.6666666666, ans=0.1 2023-09-30 02:01:07,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 02:01:08,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=560746.6666666666, ans=0.2 2023-09-30 02:01:10,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:01:13,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:16,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:01:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 02:01:18,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:01:18,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:01:18,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:01:18,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:01:24,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 02:01:28,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 02:01:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 02:01:29,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:01:29,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 02:01:31,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:01:34,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:01:36,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 02:01:37,435 INFO [train.py:1039] (3/4) Epoch 16, batch 4450, loss[loss=0.2166, simple_loss=0.2893, pruned_loss=0.07195, over 23536.00 frames. ], tot_loss[loss=0.1868, simple_loss=0.2607, pruned_loss=0.05641, over 4712754.22 frames. ], batch size: 106, lr: 6.39e-03, grad_scale: 16.0 2023-09-30 02:01:40,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:01:43,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:45,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:01:51,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:01:51,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:01:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:01:57,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=560946.6666666666, ans=0.1 2023-09-30 02:02:00,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:02:02,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:02:02,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:02,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 02:02:02,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:04,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:05,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:05,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:02:09,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:02:13,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:15,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:02:15,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:02:17,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=14.06 vs. limit=22.5 2023-09-30 02:02:18,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:02:23,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:02:24,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 02:02:24,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 02:02:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:02:26,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 02:02:31,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=561080.0, ans=0.1 2023-09-30 02:02:32,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:02:37,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:37,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 02:02:37,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:37,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:37,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:02:37,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:02:39,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:02:39,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=561080.0, ans=0.125 2023-09-30 02:02:43,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:02:44,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 02:02:46,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:02:47,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:02:50,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:02:52,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:02:52,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:02:52,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=561146.6666666666, ans=0.125 2023-09-30 02:02:55,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:02:58,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 02:03:00,043 INFO [train.py:1039] (3/4) Epoch 16, batch 4500, loss[loss=0.2264, simple_loss=0.2755, pruned_loss=0.08864, over 19571.00 frames. ], tot_loss[loss=0.1871, simple_loss=0.2614, pruned_loss=0.05634, over 4722126.16 frames. ], batch size: 388, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:03:00,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:03:05,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:05,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 02:03:05,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 02:03:05,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=561213.3333333334, ans=0.0 2023-09-30 02:03:06,226 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.55 vs. limit=22.5 2023-09-30 02:03:08,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:12,986 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.01 vs. limit=22.5 2023-09-30 02:03:14,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:03:15,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:03:15,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:03:17,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:03:17,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:18,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:03:20,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=561280.0, ans=0.125 2023-09-30 02:03:23,482 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.923e+02 2.201e+02 2.757e+02 3.678e+02, threshold=4.403e+02, percent-clipped=0.0 2023-09-30 02:03:29,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:03:31,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:03:33,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:03:33,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:03:35,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:03:43,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:03:43,884 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=561346.6666666666, ans=0.0 2023-09-30 02:03:48,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:03:51,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.63 vs. limit=15.0 2023-09-30 02:03:52,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:03:54,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=561413.3333333334, ans=0.0 2023-09-30 02:03:55,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:03:57,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 02:03:58,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:03:58,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:04:03,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:04:03,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 02:04:03,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:04:03,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:08,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:04:08,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:04:11,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:14,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:04:14,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:04:15,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 02:04:17,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 02:04:17,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 02:04:22,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 02:04:24,034 INFO [train.py:1039] (3/4) Epoch 16, batch 4550, loss[loss=0.187, simple_loss=0.2648, pruned_loss=0.05462, over 24500.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.2603, pruned_loss=0.05625, over 4723337.72 frames. ], batch size: 63, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:04:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 02:04:27,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:32,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:32,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:04:35,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:38,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:04:40,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:04:41,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:04:41,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:04:41,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:04:46,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:04:46,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:04:50,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:04:50,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=561613.3333333334, ans=0.125 2023-09-30 02:04:54,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 02:04:54,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 02:04:55,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:04:57,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 02:05:00,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 02:05:01,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:05,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 02:05:06,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:05:11,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:11,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:05:13,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 02:05:16,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:18,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:19,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:05:19,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:22,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 02:05:22,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 02:05:22,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:05:22,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 02:05:25,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 02:05:25,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:05:26,154 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=561746.6666666666, ans=0.125 2023-09-30 02:05:27,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:27,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:28,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:28,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:05:31,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:05:31,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 02:05:34,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:05:34,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:05:34,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 02:05:34,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:05:37,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 02:05:37,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=561813.3333333334, ans=0.125 2023-09-30 02:05:40,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:05:40,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:05:43,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:05:43,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:05:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:05:46,086 INFO [train.py:1039] (3/4) Epoch 16, batch 4600, loss[loss=0.1882, simple_loss=0.2643, pruned_loss=0.05607, over 23451.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2587, pruned_loss=0.05568, over 4717422.09 frames. ], batch size: 93, lr: 6.39e-03, grad_scale: 8.0 2023-09-30 02:05:46,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:05:47,019 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.84 vs. limit=6.0 2023-09-30 02:05:47,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:05:49,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:05:51,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:05:55,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:05:55,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:05:55,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:05:56,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 02:05:58,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:05:59,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=561880.0, ans=0.07 2023-09-30 02:06:02,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:06:02,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:05,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:09,848 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.564e+02 1.941e+02 2.143e+02 2.418e+02 3.430e+02, threshold=4.285e+02, percent-clipped=0.0 2023-09-30 02:06:12,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 02:06:13,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:15,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:18,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:06:18,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:25,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 02:06:25,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:06:25,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:06:33,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:34,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:06:35,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:06:40,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 02:06:40,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:06:45,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:47,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:06:49,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=562080.0, ans=0.125 2023-09-30 02:06:50,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:50,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 02:06:50,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:06:50,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=562080.0, ans=0.0 2023-09-30 02:06:51,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 02:06:51,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:53,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:54,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:06:54,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:06:58,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:06:58,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 02:06:58,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 02:06:58,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 02:06:59,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:01,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:01,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:03,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:07:09,712 INFO [train.py:1039] (3/4) Epoch 16, batch 4650, loss[loss=0.1944, simple_loss=0.2548, pruned_loss=0.067, over 23800.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.258, pruned_loss=0.05583, over 4710081.04 frames. ], batch size: 212, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:07:13,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:07:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:16,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:16,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:07:16,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:07:16,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:07:20,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:07:21,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 02:07:26,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:07:28,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 02:07:29,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:07:29,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 02:07:29,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:07:30,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 02:07:30,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 02:07:30,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:33,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:07:36,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:07:38,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:38,344 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 02:07:41,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:07:44,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 02:07:46,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:07:46,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:07:48,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 02:07:48,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:07:51,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:07:56,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:01,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:06,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:08:06,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:08:10,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 02:08:11,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 02:08:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 02:08:13,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 02:08:14,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:21,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:08:21,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:23,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 02:08:23,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:23,606 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:08:24,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:24,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:08:26,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:08:30,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:08:30,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:08:32,591 INFO [train.py:1039] (3/4) Epoch 16, batch 4700, loss[loss=0.1776, simple_loss=0.2647, pruned_loss=0.04522, over 24542.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2584, pruned_loss=0.05552, over 4721135.64 frames. ], batch size: 71, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:08:32,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:08:35,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:35,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:08:35,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:08:37,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:08:38,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:08:39,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 02:08:39,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=562546.6666666666, ans=0.125 2023-09-30 02:08:39,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.65 vs. limit=15.0 2023-09-30 02:08:49,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:08:50,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:08:50,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:08:52,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:08:53,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:08:55,749 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.574e+02 1.890e+02 2.173e+02 2.583e+02 3.915e+02, threshold=4.346e+02, percent-clipped=0.0 2023-09-30 02:08:59,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 02:08:59,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 02:08:59,679 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.80 vs. limit=12.0 2023-09-30 02:09:02,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:02,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:09:03,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:09:07,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:15,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:09:15,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=562680.0, ans=0.125 2023-09-30 02:09:16,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 02:09:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:09:22,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=562746.6666666666, ans=0.125 2023-09-30 02:09:24,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 02:09:26,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:09:27,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:29,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=562746.6666666666, ans=0.2 2023-09-30 02:09:34,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 02:09:35,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:09:39,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:09:39,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 02:09:39,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=562813.3333333334, ans=0.125 2023-09-30 02:09:40,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:40,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:09:44,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:09:44,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 02:09:47,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 02:09:48,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:09:51,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:51,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 02:09:53,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:09:54,807 INFO [train.py:1039] (3/4) Epoch 16, batch 4750, loss[loss=0.1661, simple_loss=0.2511, pruned_loss=0.04058, over 24664.00 frames. ], tot_loss[loss=0.1846, simple_loss=0.2587, pruned_loss=0.05525, over 4723180.12 frames. ], batch size: 73, lr: 6.38e-03, grad_scale: 8.0 2023-09-30 02:09:55,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 02:09:58,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:09:58,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:09:59,518 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.73 vs. limit=15.0 2023-09-30 02:10:02,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:04,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:10:04,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 02:10:05,022 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.18 vs. limit=15.0 2023-09-30 02:10:05,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:08,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 02:10:10,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:10:10,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:10:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:18,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 02:10:23,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:10:26,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 02:10:27,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:10:29,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:10:29,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:31,453 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 02:10:31,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 02:10:38,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 02:10:39,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:40,687 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-09-30 02:10:42,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:10:44,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:10:44,453 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 02:10:44,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:10:47,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:10:49,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:10:50,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 02:10:50,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 02:10:53,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:10:53,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:10:54,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:10:54,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:10:56,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 02:10:59,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 02:11:02,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:07,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:11:07,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 02:11:07,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:08,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:08,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=563146.6666666666, ans=0.125 2023-09-30 02:11:11,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:11:12,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:13,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:11:16,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:17,326 INFO [train.py:1039] (3/4) Epoch 16, batch 4800, loss[loss=0.1855, simple_loss=0.26, pruned_loss=0.05548, over 23444.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.26, pruned_loss=0.05589, over 4726397.44 frames. ], batch size: 134, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:11:17,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 02:11:17,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 02:11:19,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 02:11:22,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:11:22,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:11:23,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 02:11:28,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:33,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:11:35,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:35,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:11:36,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 02:11:37,558 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=563280.0, ans=0.125 2023-09-30 02:11:38,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:11:38,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:11:40,584 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.887e+02 2.070e+02 2.375e+02 3.869e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 02:11:42,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:11:46,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:11:47,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:47,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:11:48,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:48,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 02:11:49,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:50,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:11:54,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:11:55,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:11:57,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:11:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:11:59,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:02,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 02:12:02,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 02:12:04,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:04,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:12:05,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:12:05,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:05,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:12:07,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:12:07,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:12:11,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:16,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:17,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:21,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.76 vs. limit=15.0 2023-09-30 02:12:22,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 02:12:22,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:23,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:24,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:12:25,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:27,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=563480.0, ans=0.2 2023-09-30 02:12:29,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:12:30,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:12:30,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:32,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:12:33,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:12:33,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:12:37,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:12:37,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:37,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:12:39,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=14.01 vs. limit=15.0 2023-09-30 02:12:39,690 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.77 vs. limit=22.5 2023-09-30 02:12:40,170 INFO [train.py:1039] (3/4) Epoch 16, batch 4850, loss[loss=0.2034, simple_loss=0.2805, pruned_loss=0.0631, over 23707.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.261, pruned_loss=0.05587, over 4728125.42 frames. ], batch size: 85, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:12:40,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 02:12:41,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 02:12:41,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:41,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:12:43,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:12:43,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:12:46,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:12:53,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 02:12:55,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:00,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:01,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:13:01,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:04,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:13:06,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:13:06,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:13:06,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 02:13:11,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:13:14,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:13:14,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:13:14,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=563680.0, ans=0.2 2023-09-30 02:13:16,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:13:16,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 02:13:19,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:13:19,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:24,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 02:13:25,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 02:13:26,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:13:34,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:13:34,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 02:13:37,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:13:37,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:13:39,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:13:40,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 02:13:40,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:42,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 02:13:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:13:42,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:13:44,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 02:13:47,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=563813.3333333334, ans=0.0 2023-09-30 02:13:54,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:13:59,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:13:59,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:02,880 INFO [train.py:1039] (3/4) Epoch 16, batch 4900, loss[loss=0.1606, simple_loss=0.2358, pruned_loss=0.04271, over 24408.00 frames. ], tot_loss[loss=0.1863, simple_loss=0.2606, pruned_loss=0.05595, over 4717957.62 frames. ], batch size: 58, lr: 6.38e-03, grad_scale: 16.0 2023-09-30 02:14:06,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 02:14:06,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:14:10,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:12,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:12,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:14:13,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=563880.0, ans=0.125 2023-09-30 02:14:15,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 02:14:20,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 02:14:24,979 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.686e+02 1.946e+02 2.133e+02 2.467e+02 3.436e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 02:14:27,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 02:14:27,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 02:14:28,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:28,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:14:28,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:14:28,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:28,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:14:30,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 02:14:31,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=563946.6666666666, ans=0.2 2023-09-30 02:14:31,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.90 vs. limit=15.0 2023-09-30 02:14:32,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=563946.6666666666, ans=0.125 2023-09-30 02:14:34,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 02:14:34,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=564013.3333333334, ans=0.0 2023-09-30 02:14:36,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:14:37,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:14:39,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:14:41,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:14:42,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:42,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:42,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 02:14:44,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:14:44,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:14:46,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 02:14:46,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 02:14:49,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 02:14:50,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.92 vs. limit=6.0 2023-09-30 02:14:50,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:14:52,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:14:52,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:14:52,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:14:53,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:14:53,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:14:54,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 02:14:56,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:14:59,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:15:02,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:15:04,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 02:15:06,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:15:06,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:15:08,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 02:15:11,702 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564146.6666666666, ans=0.1 2023-09-30 02:15:13,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:14,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:15:14,960 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564146.6666666666, ans=0.1 2023-09-30 02:15:16,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 02:15:16,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:16,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:15:16,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=564146.6666666666, ans=0.125 2023-09-30 02:15:18,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.51 vs. limit=8.0 2023-09-30 02:15:19,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:23,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:23,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:15:24,965 INFO [train.py:1039] (3/4) Epoch 16, batch 4950, loss[loss=0.1555, simple_loss=0.2325, pruned_loss=0.03922, over 24354.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2581, pruned_loss=0.05537, over 4723157.27 frames. ], batch size: 56, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:15:25,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:15:25,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 02:15:26,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:15:30,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:30,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 02:15:35,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 02:15:35,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 02:15:35,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:15:37,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 02:15:37,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:37,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:15:38,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:15:38,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:15:42,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:15:42,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:15:44,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:15:46,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:15:46,781 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.51 vs. limit=12.0 2023-09-30 02:15:47,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:47,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:15:50,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:15:52,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=564280.0, ans=0.1 2023-09-30 02:15:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:15:57,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:15:58,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=564346.6666666666, ans=0.125 2023-09-30 02:16:00,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:00,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:02,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:16:02,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 02:16:03,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 02:16:06,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:16:08,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:16:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:16:10,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:16:11,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:16:13,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:17,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:16:20,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:16:22,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:16:23,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:23,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 02:16:23,709 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=564413.3333333334, ans=0.125 2023-09-30 02:16:24,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:16:26,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:16:27,147 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.18 vs. limit=15.0 2023-09-30 02:16:31,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:16:32,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:16:32,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:16:32,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:32,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:16:34,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:16:36,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:16:37,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:16:37,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:16:39,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 02:16:39,561 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=564480.0, ans=0.125 2023-09-30 02:16:45,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:16:47,454 INFO [train.py:1039] (3/4) Epoch 16, batch 5000, loss[loss=0.1966, simple_loss=0.2806, pruned_loss=0.05628, over 23629.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2579, pruned_loss=0.05494, over 4732143.51 frames. ], batch size: 85, lr: 6.37e-03, grad_scale: 16.0 2023-09-30 02:16:50,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 02:16:50,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:16:56,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:16:56,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:16:58,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=564546.6666666666, ans=0.2 2023-09-30 02:16:59,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 02:16:59,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 02:17:02,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:04,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 02:17:04,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:17:05,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:17:06,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=564613.3333333334, ans=0.07 2023-09-30 02:17:07,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 02:17:07,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:08,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:08,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 02:17:08,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:09,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:10,784 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.922e+02 2.167e+02 2.587e+02 4.159e+02, threshold=4.333e+02, percent-clipped=0.0 2023-09-30 02:17:12,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 02:17:12,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 02:17:12,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=564613.3333333334, ans=0.1 2023-09-30 02:17:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:17:13,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 02:17:13,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:17:14,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:15,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:17:15,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 02:17:15,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 02:17:17,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 02:17:17,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:19,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:19,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 02:17:20,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:17:22,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:22,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:17:24,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:17:26,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 02:17:26,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:17:27,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:17:31,562 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 02:17:36,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:17:36,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:17:36,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:17:40,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.99 vs. limit=15.0 2023-09-30 02:17:40,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 02:17:40,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:17:40,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:17:42,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:17:43,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=564746.6666666666, ans=0.0 2023-09-30 02:17:44,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 02:17:44,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=564746.6666666666, ans=0.2 2023-09-30 02:17:46,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:46,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=564746.6666666666, ans=0.125 2023-09-30 02:17:49,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:17:49,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:17:54,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 02:18:02,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:10,335 INFO [train.py:1039] (3/4) Epoch 16, batch 5050, loss[loss=0.1834, simple_loss=0.2612, pruned_loss=0.05278, over 24482.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2579, pruned_loss=0.05469, over 4732357.42 frames. ], batch size: 63, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:18:10,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:18:12,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:12,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:18:12,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:13,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:18:13,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:18:13,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:15,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=564880.0, ans=0.2 2023-09-30 02:18:15,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=564880.0, ans=0.125 2023-09-30 02:18:16,985 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=564880.0, ans=0.1 2023-09-30 02:18:18,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:18:18,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 02:18:20,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:18:23,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:18:24,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:18:24,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 02:18:26,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:26,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:18:30,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:18:30,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=564946.6666666666, ans=0.2 2023-09-30 02:18:31,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:18:31,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:18:41,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 02:18:43,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:18:44,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:18:44,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 02:18:44,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:18:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:46,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:18:46,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:18:46,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 02:18:46,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=565013.3333333334, ans=0.125 2023-09-30 02:18:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 02:18:47,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:51,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=565013.3333333334, ans=0.035 2023-09-30 02:18:51,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=565013.3333333334, ans=0.0 2023-09-30 02:18:52,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:18:54,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:18:55,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 02:18:56,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=565013.3333333334, ans=0.1 2023-09-30 02:18:57,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:00,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 02:19:01,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:19:02,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:19:04,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:04,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:19:05,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:08,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:19:08,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:09,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:19:09,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:19:10,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 02:19:11,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:19:13,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:19:17,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:19:17,131 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 02:19:17,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:19:18,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:20,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:20,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 02:19:23,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:23,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 02:19:23,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:26,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:26,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:19:27,742 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.38 vs. limit=22.5 2023-09-30 02:19:28,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 02:19:28,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 02:19:31,360 INFO [train.py:1039] (3/4) Epoch 16, batch 5100, loss[loss=0.2016, simple_loss=0.2668, pruned_loss=0.06824, over 23723.00 frames. ], tot_loss[loss=0.1848, simple_loss=0.2593, pruned_loss=0.05522, over 4733540.34 frames. ], batch size: 164, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:19:31,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:31,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:19:31,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:19:35,408 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 02:19:38,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:19:42,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 02:19:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 02:19:44,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:45,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:19:48,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:19:50,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 02:19:50,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 02:19:54,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:19:54,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:19:56,001 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.636e+02 1.868e+02 2.082e+02 2.336e+02 3.756e+02, threshold=4.164e+02, percent-clipped=0.0 2023-09-30 02:19:57,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:19:58,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=565280.0, ans=0.125 2023-09-30 02:20:01,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 02:20:01,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:04,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:20:04,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 02:20:08,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:08,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:09,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 02:20:12,024 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 02:20:12,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:14,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 02:20:14,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 02:20:16,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:20:19,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=565346.6666666666, ans=0.125 2023-09-30 02:20:25,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:20:27,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 02:20:28,856 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 02:20:28,869 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 02:20:29,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 02:20:29,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:20:32,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 02:20:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 02:20:37,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 02:20:39,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:20:40,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 02:20:44,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:20:45,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 02:20:48,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=565480.0, ans=0.0 2023-09-30 02:20:49,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=565480.0, ans=0.0 2023-09-30 02:20:51,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:20:51,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:20:51,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:20:53,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:20:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:20:56,253 INFO [train.py:1039] (3/4) Epoch 16, batch 5150, loss[loss=0.1656, simple_loss=0.2408, pruned_loss=0.04518, over 23529.00 frames. ], tot_loss[loss=0.1858, simple_loss=0.2602, pruned_loss=0.05576, over 4735775.76 frames. ], batch size: 106, lr: 6.37e-03, grad_scale: 8.0 2023-09-30 02:20:56,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:20:56,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 02:20:56,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 02:20:57,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 02:20:59,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:20:59,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 02:21:00,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:01,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:21:02,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:04,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:21:09,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=565546.6666666666, ans=0.1 2023-09-30 02:21:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:21:11,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 02:21:12,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:12,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:21:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:21:15,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:15,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:17,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:21:17,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:21:17,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 02:21:19,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:21:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:21:21,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=565613.3333333334, ans=0.125 2023-09-30 02:21:23,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:21:24,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 02:21:24,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=565613.3333333334, ans=0.0 2023-09-30 02:21:27,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:21:32,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:21:35,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 02:21:38,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:21:45,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:21:46,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:21:50,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:21:50,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:21:53,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 02:21:58,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:22:00,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:22:00,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:22:02,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:04,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:06,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 02:22:10,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:10,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:22:14,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:22:14,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:22:14,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:22:15,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:22:15,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:22:15,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:22:19,286 INFO [train.py:1039] (3/4) Epoch 16, batch 5200, loss[loss=0.1797, simple_loss=0.2455, pruned_loss=0.0569, over 23914.00 frames. ], tot_loss[loss=0.1864, simple_loss=0.261, pruned_loss=0.05593, over 4730986.86 frames. ], batch size: 195, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:22:19,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:22:19,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=565880.0, ans=0.1 2023-09-30 02:22:22,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:22:23,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:29,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 02:22:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:22:30,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:31,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=565880.0, ans=0.0 2023-09-30 02:22:32,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:22:34,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:22:34,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:36,355 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.55 vs. limit=15.0 2023-09-30 02:22:39,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 02:22:41,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:22:41,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:44,239 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 1.894e+02 2.058e+02 2.319e+02 3.515e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 02:22:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 02:22:46,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:22:46,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:22:47,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 02:22:49,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 02:22:52,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 02:22:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:22:53,708 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 02:22:53,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:22:55,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:22:55,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:22:55,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 02:22:57,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:22:59,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:02,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 02:23:02,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 02:23:04,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 02:23:09,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 02:23:11,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:23:16,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:23:16,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:16,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=566080.0, ans=0.125 2023-09-30 02:23:18,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 02:23:18,744 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.20 vs. limit=15.0 2023-09-30 02:23:19,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:23:19,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:23:19,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:19,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:23:22,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:22,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:23:27,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:23:29,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:29,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:34,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:35,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 02:23:37,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:23:37,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:23:38,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:23:40,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:23:40,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:23:42,262 INFO [train.py:1039] (3/4) Epoch 16, batch 5250, loss[loss=0.1941, simple_loss=0.2787, pruned_loss=0.05474, over 24384.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2593, pruned_loss=0.05629, over 4712484.52 frames. ], batch size: 77, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:23:45,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:23:50,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:23:50,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:23:52,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:23:52,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=566213.3333333334, ans=0.125 2023-09-30 02:23:58,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:23:58,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:24:01,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:24:03,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:24:05,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 02:24:05,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:24:08,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:24:10,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=566280.0, ans=0.125 2023-09-30 02:24:41,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=566480.0, ans=0.125 2023-09-30 02:24:57,078 INFO [train.py:1039] (3/4) Epoch 16, batch 5300, loss[loss=0.2033, simple_loss=0.2829, pruned_loss=0.06185, over 23573.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2582, pruned_loss=0.05603, over 4691675.02 frames. ], batch size: 85, lr: 6.36e-03, grad_scale: 16.0 2023-09-30 02:25:12,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:25:12,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 02:25:12,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 02:25:12,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:12,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:12,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:12,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:13,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:25:13,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:25:13,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 02:25:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 02:25:13,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 02:25:14,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:25:14,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 02:25:14,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 02:25:14,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:15,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:15,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:15,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:15,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:25:16,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:16,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:25:16,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:16,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:25:16,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:25:16,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:25:16,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:16,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:25:17,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 02:25:17,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:25:18,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:25:18,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 02:25:18,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 02:25:18,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:25:18,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:18,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 02:25:18,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 02:25:19,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:20,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:25:20,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:25:20,481 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 02:25:20,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 02:25:20,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:25:20,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:25:20,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 02:25:20,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 02:25:21,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 02:25:21,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:25:29,948 INFO [train.py:1039] (3/4) Epoch 17, batch 0, loss[loss=0.18, simple_loss=0.2542, pruned_loss=0.05291, over 23668.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2542, pruned_loss=0.05291, over 23668.00 frames. ], batch size: 149, lr: 6.17e-03, grad_scale: 32.0 2023-09-30 02:25:29,948 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 02:25:40,426 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.5.encoder.layers.0.self_attn_weights, attn_weights_entropy = tensor([1.8980, 1.8359, 4.8178, 4.3315], device='cuda:3') 2023-09-30 02:25:43,990 INFO [train.py:1071] (3/4) Epoch 17, validation: loss=0.3013, simple_loss=0.2697, pruned_loss=0.1665, over 1125622.00 frames. 2023-09-30 02:25:43,991 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 02:25:45,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 02:25:47,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:25:49,125 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.597e+02 1.973e+02 2.191e+02 2.524e+02 3.767e+02, threshold=4.382e+02, percent-clipped=0.0 2023-09-30 02:25:49,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:25:56,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:25:56,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:25:56,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:25:58,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 02:26:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 02:26:01,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:01,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=566693.3333333334, ans=0.0 2023-09-30 02:26:02,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:03,616 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.48 vs. limit=15.0 2023-09-30 02:26:05,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:26:05,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:07,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:26:07,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:09,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 02:26:10,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:26:16,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=566760.0, ans=0.125 2023-09-30 02:26:18,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=566760.0, ans=0.2 2023-09-30 02:26:18,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=566760.0, ans=0.95 2023-09-30 02:26:19,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:26:19,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:19,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=566760.0, ans=0.015 2023-09-30 02:26:22,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 02:26:26,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:26:26,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:26:27,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:32,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:26:35,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:26:39,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.27 vs. limit=15.0 2023-09-30 02:26:40,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 02:26:40,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=566826.6666666666, ans=0.125 2023-09-30 02:26:45,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 02:26:45,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:26:45,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:46,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:26:47,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:26:47,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=566826.6666666666, ans=0.125 2023-09-30 02:26:49,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 02:26:52,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:53,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:26:58,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:02,257 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 02:27:03,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:27:07,322 INFO [train.py:1039] (3/4) Epoch 17, batch 50, loss[loss=0.1564, simple_loss=0.2329, pruned_loss=0.03993, over 24306.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2601, pruned_loss=0.0539, over 1077762.21 frames. ], batch size: 56, lr: 6.17e-03, grad_scale: 16.0 2023-09-30 02:27:07,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:10,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:10,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 02:27:10,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:27:12,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:27:13,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:15,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:27:18,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:27:22,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 02:27:22,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:27,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:27:30,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 02:27:31,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 02:27:33,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=567026.6666666666, ans=0.0 2023-09-30 02:27:34,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:27:36,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:27:36,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:36,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:27:38,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:27:38,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:27:38,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:27:44,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=567093.3333333334, ans=0.125 2023-09-30 02:27:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:27:47,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:27:48,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:27:48,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 02:27:49,064 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.21 vs. limit=15.0 2023-09-30 02:27:50,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=567093.3333333334, ans=0.1 2023-09-30 02:27:50,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=567093.3333333334, ans=0.0 2023-09-30 02:27:52,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:27:53,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:27:53,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 02:27:53,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:27:56,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 02:28:05,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:05,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:28:05,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:06,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:06,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:07,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=567160.0, ans=0.125 2023-09-30 02:28:09,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 02:28:09,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 02:28:12,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:12,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:28:13,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:28:13,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:28:15,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 02:28:16,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 02:28:16,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=567226.6666666666, ans=0.125 2023-09-30 02:28:18,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 02:28:18,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:18,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:28:19,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 02:28:19,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 02:28:19,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:21,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:24,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:28:24,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:28:28,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:28:29,442 INFO [train.py:1039] (3/4) Epoch 17, batch 100, loss[loss=0.186, simple_loss=0.2747, pruned_loss=0.04862, over 24452.00 frames. ], tot_loss[loss=0.1859, simple_loss=0.2615, pruned_loss=0.05511, over 1892528.42 frames. ], batch size: 69, lr: 6.16e-03, grad_scale: 16.0 2023-09-30 02:28:31,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:28:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:36,142 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.907e+02 2.184e+02 2.612e+02 4.946e+02, threshold=4.368e+02, percent-clipped=2.0 2023-09-30 02:28:36,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 02:28:36,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:28:39,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:28:39,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:41,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:28:41,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:28:41,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:28:42,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 02:28:44,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:28:44,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:45,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:45,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:28:48,335 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=567360.0, ans=0.125 2023-09-30 02:28:49,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 02:28:51,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:28:52,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:28:54,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:28:55,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:28:59,514 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 02:28:59,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 02:28:59,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:28:59,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:29:02,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:29:05,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:29:07,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 02:29:15,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:29:19,405 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.84 vs. limit=15.0 2023-09-30 02:29:20,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:20,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:29:22,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:27,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:28,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:31,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:29:32,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:34,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:36,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=567560.0, ans=0.2 2023-09-30 02:29:37,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:37,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:29:37,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:29:38,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 02:29:38,682 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 02:29:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:38,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:29:40,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:40,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:40,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 02:29:41,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:29:41,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:29:41,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:41,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:29:44,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:44,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:29:45,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:29:46,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:29:49,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:29:49,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:29:50,101 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=567626.6666666666, ans=0.125 2023-09-30 02:29:51,250 INFO [train.py:1039] (3/4) Epoch 17, batch 150, loss[loss=0.199, simple_loss=0.265, pruned_loss=0.06652, over 23539.00 frames. ], tot_loss[loss=0.1856, simple_loss=0.2607, pruned_loss=0.05523, over 2527085.65 frames. ], batch size: 256, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:29:51,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:53,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:29:55,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:29:57,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:29:58,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:03,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 02:30:03,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 02:30:03,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 02:30:05,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:30:06,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:30:08,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:30:09,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:30:09,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:09,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:11,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:30:12,757 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 02:30:14,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:30:21,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:25,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:30:25,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 02:30:29,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:30:29,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:30:29,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:30:29,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=567760.0, ans=0.125 2023-09-30 02:30:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:30:36,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:30:36,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:30:36,943 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=567760.0, ans=0.125 2023-09-30 02:30:38,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:39,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 02:30:46,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:47,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:30:48,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:30:48,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:30:51,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:30:52,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 02:30:54,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=567893.3333333334, ans=0.125 2023-09-30 02:30:56,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:30:57,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:30:59,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:01,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=567893.3333333334, ans=0.0 2023-09-30 02:31:02,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:31:02,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 02:31:02,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:31:03,818 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 02:31:05,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=567893.3333333334, ans=0.0 2023-09-30 02:31:06,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:12,040 INFO [train.py:1039] (3/4) Epoch 17, batch 200, loss[loss=0.1721, simple_loss=0.2483, pruned_loss=0.04798, over 20266.00 frames. ], tot_loss[loss=0.185, simple_loss=0.2607, pruned_loss=0.05469, over 3020780.04 frames. ], batch size: 44, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:31:12,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:31:12,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:31:14,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=567960.0, ans=0.125 2023-09-30 02:31:15,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 02:31:15,384 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:31:16,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:18,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:19,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=567960.0, ans=0.125 2023-09-30 02:31:20,235 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.902e+02 2.115e+02 2.489e+02 3.841e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 02:31:22,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 02:31:23,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 02:31:23,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:25,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:28,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:31:28,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:31:28,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:31:32,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=568026.6666666666, ans=0.95 2023-09-30 02:31:35,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=568026.6666666666, ans=0.1 2023-09-30 02:31:42,104 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.99 vs. limit=15.0 2023-09-30 02:31:43,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=568093.3333333334, ans=0.0 2023-09-30 02:31:50,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:31:51,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:31:53,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:31:53,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:31:55,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:31:55,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:31:56,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:31:57,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:31:57,689 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.44 vs. limit=15.0 2023-09-30 02:31:58,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:31:58,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:00,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 02:32:00,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 02:32:00,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:03,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:32:09,705 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=16.83 vs. limit=22.5 2023-09-30 02:32:10,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:32:12,579 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=568160.0, ans=0.125 2023-09-30 02:32:12,629 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=568160.0, ans=0.125 2023-09-30 02:32:15,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:15,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:32:22,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:26,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 02:32:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:27,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:32:27,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:32:29,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:32:30,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 02:32:32,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:32:32,396 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 02:32:33,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.01 vs. limit=15.0 2023-09-30 02:32:35,453 INFO [train.py:1039] (3/4) Epoch 17, batch 250, loss[loss=0.1661, simple_loss=0.2408, pruned_loss=0.04571, over 24621.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2606, pruned_loss=0.05584, over 3395753.50 frames. ], batch size: 60, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:32:35,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:37,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:32:39,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:40,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:32:42,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:32:43,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:32:45,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:32:48,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:32:54,035 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.43 vs. limit=15.0 2023-09-30 02:33:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:02,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:33:10,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:33:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:33:14,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:33:14,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:15,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:33:15,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:33:15,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:33:18,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:33:19,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=568426.6666666666, ans=0.125 2023-09-30 02:33:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 02:33:23,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:33:24,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:33:24,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:33:24,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:33:25,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.16 vs. limit=22.5 2023-09-30 02:33:26,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:33:27,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:33:28,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:33:31,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:31,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:33:31,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:35,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:33:40,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:33:43,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:33:50,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:33:51,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:33:54,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 02:33:57,773 INFO [train.py:1039] (3/4) Epoch 17, batch 300, loss[loss=0.182, simple_loss=0.2448, pruned_loss=0.05954, over 23837.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2585, pruned_loss=0.05527, over 3686023.85 frames. ], batch size: 195, lr: 6.16e-03, grad_scale: 8.0 2023-09-30 02:33:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:33:59,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 02:34:00,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 02:34:00,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:34:02,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:34:02,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 02:34:05,240 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.942e+02 2.251e+02 2.659e+02 4.378e+02, threshold=4.502e+02, percent-clipped=1.0 2023-09-30 02:34:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:34:07,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:09,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=568626.6666666666, ans=0.1 2023-09-30 02:34:10,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:34:11,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 02:34:14,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:34:14,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:34:14,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 02:34:14,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:17,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:34:22,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:34:24,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 02:34:24,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=568693.3333333334, ans=0.125 2023-09-30 02:34:27,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 02:34:27,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:29,368 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=568760.0, ans=0.125 2023-09-30 02:34:30,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:34:33,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:33,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 02:34:33,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:34:35,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:34:36,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:34:37,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:34:42,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 02:34:42,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 02:34:44,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:34:44,858 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.12 vs. limit=15.0 2023-09-30 02:34:47,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:34:48,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 02:34:49,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:34:51,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=568826.6666666666, ans=0.125 2023-09-30 02:34:52,209 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.44 vs. limit=15.0 2023-09-30 02:34:54,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:34:59,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:35:00,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 02:35:03,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:04,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:35:06,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:07,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=568893.3333333334, ans=0.0 2023-09-30 02:35:08,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:35:08,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 02:35:08,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:35:08,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:10,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 02:35:13,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:35:13,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:15,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:15,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:15,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:16,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.51 vs. limit=15.0 2023-09-30 02:35:20,310 INFO [train.py:1039] (3/4) Epoch 17, batch 350, loss[loss=0.1695, simple_loss=0.2445, pruned_loss=0.04728, over 23706.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.257, pruned_loss=0.05405, over 3910837.11 frames. ], batch size: 149, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:35:22,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:22,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 02:35:25,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:31,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:35:35,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:35,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:35,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=569026.6666666666, ans=0.0 2023-09-30 02:35:38,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 02:35:41,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:35:41,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 02:35:43,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:35:43,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 02:35:44,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:48,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 02:35:50,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=569026.6666666666, ans=0.125 2023-09-30 02:35:52,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:35:53,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:35:55,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:35:55,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:55,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:35:57,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:35:57,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:35:58,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:36:00,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:00,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:00,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569093.3333333334, ans=0.1 2023-09-30 02:36:07,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:09,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:36:09,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:36:09,915 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.00 vs. limit=15.0 2023-09-30 02:36:10,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:16,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 02:36:16,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:36:20,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:36:20,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:20,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:36:21,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 02:36:23,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 02:36:27,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 02:36:27,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:28,188 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.52 vs. limit=15.0 2023-09-30 02:36:29,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=569226.6666666666, ans=0.0 2023-09-30 02:36:30,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:36:30,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 02:36:32,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:33,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:36:35,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:37,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:37,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:41,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:36:41,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=569226.6666666666, ans=0.07 2023-09-30 02:36:43,979 INFO [train.py:1039] (3/4) Epoch 17, batch 400, loss[loss=0.1729, simple_loss=0.2536, pruned_loss=0.04609, over 24682.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2559, pruned_loss=0.05394, over 4076685.12 frames. ], batch size: 65, lr: 6.15e-03, grad_scale: 16.0 2023-09-30 02:36:44,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:36:45,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:36:47,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 02:36:47,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:36:47,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:36:49,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=569293.3333333334, ans=0.1 2023-09-30 02:36:50,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:36:50,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:51,859 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.862e+02 1.986e+02 2.213e+02 4.165e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 02:36:52,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=569293.3333333334, ans=0.125 2023-09-30 02:36:55,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:36:57,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:36:58,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 02:37:00,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 02:37:00,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:02,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 02:37:04,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:05,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:37:05,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:05,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 02:37:07,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:37:07,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:37:08,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:08,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:37:12,406 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 02:37:12,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=569360.0, ans=0.0 2023-09-30 02:37:13,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 02:37:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:37:18,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:19,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 02:37:22,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 02:37:25,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:37:29,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:35,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 02:37:38,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:37:40,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 02:37:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:37:43,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:37:43,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 02:37:48,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:37:51,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:37:52,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:37:54,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:37:55,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 02:37:56,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=569560.0, ans=0.125 2023-09-30 02:37:57,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:37:58,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 02:37:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:37:59,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:38:02,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 02:38:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:38:05,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:38:07,167 INFO [train.py:1039] (3/4) Epoch 17, batch 450, loss[loss=0.1733, simple_loss=0.2498, pruned_loss=0.04841, over 24571.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2568, pruned_loss=0.05449, over 4217074.89 frames. ], batch size: 60, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:38:07,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:38:07,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 02:38:07,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:38:08,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:38:09,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:38:09,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 02:38:09,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:38:11,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:38:14,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:38:21,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.07 vs. limit=10.0 2023-09-30 02:38:21,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=569626.6666666666, ans=0.125 2023-09-30 02:38:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:24,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:38:25,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.55 vs. limit=12.0 2023-09-30 02:38:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 02:38:28,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 02:38:32,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:38:36,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:37,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:42,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:42,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:38:45,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 02:38:45,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 02:38:47,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 02:38:47,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=569760.0, ans=0.2 2023-09-30 02:38:48,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:38:48,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:38:50,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:38:52,619 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 02:38:52,633 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 02:38:52,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:38:54,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:38:57,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:39:00,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:39:02,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:39:02,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:39:03,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 02:39:05,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:08,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:39:09,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:39:11,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 02:39:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:39:16,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 02:39:18,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 02:39:19,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 02:39:23,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:39:26,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:27,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:39:27,915 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 02:39:29,812 INFO [train.py:1039] (3/4) Epoch 17, batch 500, loss[loss=0.1909, simple_loss=0.2599, pruned_loss=0.06095, over 23477.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2578, pruned_loss=0.05478, over 4323972.82 frames. ], batch size: 285, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:39:33,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:33,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:39:33,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:35,230 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 02:39:36,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 02:39:36,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:39:38,687 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=569960.0, ans=10.0 2023-09-30 02:39:39,659 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.531e+02 1.860e+02 2.180e+02 2.486e+02 3.417e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:39:39,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:39:44,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 02:39:46,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:39:47,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:39:47,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:39:47,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:39:59,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:39:59,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:40:00,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=570026.6666666666, ans=0.07 2023-09-30 02:40:01,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 02:40:01,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:01,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 02:40:01,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 02:40:04,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=570093.3333333334, ans=0.2 2023-09-30 02:40:05,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:40:06,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:40:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:40:06,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:40:07,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 02:40:10,513 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 02:40:13,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:15,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:16,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:17,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:40:20,305 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.33 vs. limit=22.5 2023-09-30 02:40:20,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 02:40:21,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.24 vs. limit=15.0 2023-09-30 02:40:24,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:40:25,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:25,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=570160.0, ans=0.0 2023-09-30 02:40:31,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:34,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:40:40,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:45,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 02:40:45,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:45,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:40:47,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 02:40:48,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:40:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:40:53,241 INFO [train.py:1039] (3/4) Epoch 17, batch 550, loss[loss=0.2669, simple_loss=0.3168, pruned_loss=0.1085, over 19619.00 frames. ], tot_loss[loss=0.1854, simple_loss=0.2595, pruned_loss=0.05571, over 4409226.51 frames. ], batch size: 388, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:40:54,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 02:40:56,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 02:40:56,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:56,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 02:40:57,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:40:57,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:40:59,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:00,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:41:02,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:41:05,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:41:06,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 02:41:06,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:41:08,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=570360.0, ans=0.125 2023-09-30 02:41:11,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:11,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:13,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:14,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:20,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 02:41:20,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 02:41:22,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=570360.0, ans=0.0 2023-09-30 02:41:23,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:41:27,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:41:27,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:29,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:41:34,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:34,630 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 02:41:36,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:41:36,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=570426.6666666666, ans=0.125 2023-09-30 02:41:37,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:41:42,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:41:42,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 02:41:42,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:41:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:45,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 02:41:46,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=570493.3333333334, ans=0.125 2023-09-30 02:41:47,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 02:41:47,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:41:47,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:41:47,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:41:47,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:41:51,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:41:54,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:41:57,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:41:57,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:41:59,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 02:42:00,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:42:02,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:02,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:42:03,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:03,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 02:42:03,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 02:42:12,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 02:42:15,727 INFO [train.py:1039] (3/4) Epoch 17, batch 600, loss[loss=0.2035, simple_loss=0.2703, pruned_loss=0.06832, over 23794.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2595, pruned_loss=0.05589, over 4477341.60 frames. ], batch size: 164, lr: 6.15e-03, grad_scale: 8.0 2023-09-30 02:42:15,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 02:42:16,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:42:17,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:42:17,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:42:26,658 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 1.971e+02 2.265e+02 3.697e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 02:42:26,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:42:28,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 02:42:29,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 02:42:31,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:42:33,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:42:36,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:37,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 02:42:38,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=570693.3333333334, ans=0.125 2023-09-30 02:42:39,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:42:39,916 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.27 vs. limit=22.5 2023-09-30 02:42:45,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 02:42:48,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:42:48,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:42:48,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:42:57,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:42:57,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:42:57,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:03,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=570760.0, ans=0.0 2023-09-30 02:43:04,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:43:09,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:09,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:43:09,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:43:18,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 02:43:24,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:43:24,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:43:29,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 02:43:31,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:43:33,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 02:43:33,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:43:34,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:43:35,631 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:43:39,680 INFO [train.py:1039] (3/4) Epoch 17, batch 650, loss[loss=0.1824, simple_loss=0.2374, pruned_loss=0.06376, over 22729.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2581, pruned_loss=0.05535, over 4534062.28 frames. ], batch size: 322, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:43:42,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 02:43:42,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 02:43:43,420 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=570960.0, ans=0.1 2023-09-30 02:43:44,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:43:46,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:43:49,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:43:49,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=570960.0, ans=0.125 2023-09-30 02:43:51,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 02:43:52,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:43:57,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:43:57,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:00,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=571026.6666666666, ans=0.0 2023-09-30 02:44:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:06,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 02:44:07,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:07,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:11,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:11,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:44:14,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:14,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:14,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:44:16,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:17,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:44:18,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:44:19,455 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 02:44:19,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:19,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:24,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:24,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:26,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:26,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:44:27,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 02:44:29,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:44:29,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:44:30,023 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=5.17 vs. limit=12.0 2023-09-30 02:44:30,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:44:30,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:44:31,131 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=571160.0, ans=0.5 2023-09-30 02:44:32,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 02:44:34,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 02:44:36,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 02:44:37,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:37,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:44:38,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:44:39,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:44:41,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:44:47,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:44:47,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:44:49,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:44:53,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:53,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:44:53,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:44:58,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=571226.6666666666, ans=0.1 2023-09-30 02:45:01,941 INFO [train.py:1039] (3/4) Epoch 17, batch 700, loss[loss=0.1924, simple_loss=0.2763, pruned_loss=0.05426, over 24468.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2571, pruned_loss=0.05479, over 4578179.89 frames. ], batch size: 69, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:45:02,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:45:02,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:02,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:02,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:06,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 02:45:08,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 02:45:12,194 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.891e+02 2.090e+02 2.521e+02 3.567e+02, threshold=4.179e+02, percent-clipped=0.0 2023-09-30 02:45:12,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 02:45:13,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:15,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:45:17,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 02:45:20,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:45:23,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:45:25,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=571360.0, ans=0.0 2023-09-30 02:45:26,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:28,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:45:28,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:45:30,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:45:33,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 02:45:33,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:45:33,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=571426.6666666666, ans=0.125 2023-09-30 02:45:34,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 02:45:38,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 02:45:42,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:45:42,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:45:45,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:45:48,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:45:50,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 02:45:55,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:45:55,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:45:55,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 02:45:58,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:46:00,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:03,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:08,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:46:09,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 02:46:10,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=571560.0, ans=0.0 2023-09-30 02:46:13,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 02:46:13,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 02:46:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:15,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=571560.0, ans=0.2 2023-09-30 02:46:17,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.98 vs. limit=15.0 2023-09-30 02:46:19,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:20,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:46:22,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:22,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 02:46:25,713 INFO [train.py:1039] (3/4) Epoch 17, batch 750, loss[loss=0.1867, simple_loss=0.2679, pruned_loss=0.05278, over 24334.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2567, pruned_loss=0.05477, over 4599351.52 frames. ], batch size: 77, lr: 6.14e-03, grad_scale: 4.0 2023-09-30 02:46:28,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 02:46:28,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 02:46:28,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 02:46:30,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 02:46:30,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 02:46:31,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:46:32,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 02:46:33,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:46:35,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:46:35,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=571626.6666666666, ans=0.125 2023-09-30 02:46:36,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:38,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:46:38,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:46:42,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:46:42,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:46:45,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:46:47,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.48 vs. limit=10.0 2023-09-30 02:46:48,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:46:48,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:46:49,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 02:46:51,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:46:52,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:54,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:46:55,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=571693.3333333334, ans=0.1 2023-09-30 02:46:57,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 02:46:59,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 02:46:59,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:00,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 02:47:00,805 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 02:47:00,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 02:47:00,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:47:00,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 02:47:03,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:47:04,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=571760.0, ans=0.0 2023-09-30 02:47:09,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.91 vs. limit=15.0 2023-09-30 02:47:10,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:47:11,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:11,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:47:13,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:47:17,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:17,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 02:47:17,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:47:18,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 02:47:20,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:47:23,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:47:24,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 02:47:24,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:32,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:47:33,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:47:33,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:47:35,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:47:39,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 02:47:40,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:40,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:47:43,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:46,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:48,749 INFO [train.py:1039] (3/4) Epoch 17, batch 800, loss[loss=0.1907, simple_loss=0.2643, pruned_loss=0.05858, over 23318.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2573, pruned_loss=0.05462, over 4621194.21 frames. ], batch size: 105, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:47:48,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 02:47:50,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=571960.0, ans=0.2 2023-09-30 02:47:55,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:47:55,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:47:56,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:47:56,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:47:58,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:00,275 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.542e+02 1.820e+02 2.048e+02 2.346e+02 3.292e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 02:48:00,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:01,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:05,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:07,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:48:07,588 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=572026.6666666666, ans=0.0 2023-09-30 02:48:10,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 02:48:12,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:12,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=572026.6666666666, ans=0.1 2023-09-30 02:48:13,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:48:15,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:48:15,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:16,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 02:48:16,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:18,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 02:48:21,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:22,132 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.10 vs. limit=15.0 2023-09-30 02:48:24,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:48:26,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:48:26,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:48:28,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:28,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:48:32,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:48:34,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:48:34,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 02:48:36,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 02:48:36,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 02:48:36,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 02:48:36,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:48:39,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:48:40,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:48:44,913 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 02:48:46,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 02:48:47,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=572160.0, ans=0.1 2023-09-30 02:48:48,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:48:51,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 02:48:54,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:48:58,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:00,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 02:49:00,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:49:04,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 02:49:09,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:11,245 INFO [train.py:1039] (3/4) Epoch 17, batch 850, loss[loss=0.206, simple_loss=0.2889, pruned_loss=0.06153, over 24365.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2581, pruned_loss=0.05523, over 4641158.22 frames. ], batch size: 77, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:49:11,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:49:12,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys.whitening_limit, batch_count=572293.3333333334, ans=6.0 2023-09-30 02:49:12,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 02:49:12,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:49:14,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:15,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 02:49:15,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:16,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:49:18,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:18,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=572293.3333333334, ans=0.0 2023-09-30 02:49:19,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:49:21,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:49:23,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 02:49:24,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 02:49:24,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 02:49:26,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:49:26,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:49:27,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.35 vs. limit=15.0 2023-09-30 02:49:29,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:29,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:49:29,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:49:35,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:35,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:49:37,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 02:49:40,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 02:49:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:49:44,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 02:49:47,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 02:49:48,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=572426.6666666666, ans=0.125 2023-09-30 02:49:49,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 02:49:51,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 02:49:51,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:49:51,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:49:52,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 02:49:54,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:55,007 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=572426.6666666666, ans=0.125 2023-09-30 02:49:56,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:49:56,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 02:49:56,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.85 vs. limit=15.0 2023-09-30 02:49:58,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=572426.6666666666, ans=0.1 2023-09-30 02:49:59,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 02:50:00,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-09-30 02:50:01,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:02,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:50:02,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 02:50:03,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=572493.3333333334, ans=0.125 2023-09-30 02:50:04,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 02:50:04,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=572493.3333333334, ans=0.125 2023-09-30 02:50:06,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 02:50:06,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 02:50:11,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:50:11,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:12,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:50:12,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:14,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:50:15,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:50:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 02:50:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:50:21,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:22,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 02:50:29,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 02:50:31,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:50:32,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 02:50:33,927 INFO [train.py:1039] (3/4) Epoch 17, batch 900, loss[loss=0.1948, simple_loss=0.259, pruned_loss=0.06528, over 23656.00 frames. ], tot_loss[loss=0.1849, simple_loss=0.2588, pruned_loss=0.05553, over 4666105.85 frames. ], batch size: 232, lr: 6.14e-03, grad_scale: 8.0 2023-09-30 02:50:34,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:34,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:50:34,765 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=5.13 vs. limit=12.0 2023-09-30 02:50:37,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 02:50:44,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:50:44,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=572626.6666666666, ans=0.1 2023-09-30 02:50:45,731 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.971e+02 2.244e+02 2.720e+02 3.662e+02, threshold=4.487e+02, percent-clipped=0.0 2023-09-30 02:50:46,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=572626.6666666666, ans=0.125 2023-09-30 02:50:47,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:50:49,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 02:50:52,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:50:52,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 02:50:53,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 02:50:54,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=572693.3333333334, ans=0.2 2023-09-30 02:50:55,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:50:55,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:50:55,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 02:50:55,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:51:05,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:06,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:51:06,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=572760.0, ans=0.0 2023-09-30 02:51:07,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:51:11,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 02:51:18,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:51:20,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:51:21,034 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=572760.0, ans=0.125 2023-09-30 02:51:22,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 02:51:22,396 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 02:51:23,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 02:51:30,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 02:51:30,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:51:32,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 02:51:36,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=572826.6666666666, ans=0.0 2023-09-30 02:51:40,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:40,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:51:42,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 02:51:42,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:51:45,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 02:51:46,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:51:46,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:51:49,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:51:49,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:51:53,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 02:51:55,183 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 02:51:55,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 02:51:55,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 02:51:56,791 INFO [train.py:1039] (3/4) Epoch 17, batch 950, loss[loss=0.2001, simple_loss=0.2777, pruned_loss=0.06122, over 23275.00 frames. ], tot_loss[loss=0.1844, simple_loss=0.2587, pruned_loss=0.05511, over 4685996.91 frames. ], batch size: 93, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:51:58,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:03,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 02:52:08,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:08,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=572960.0, ans=0.125 2023-09-30 02:52:10,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:10,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 02:52:12,546 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 02:52:16,090 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=9.04 vs. limit=15.0 2023-09-30 02:52:18,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:52:18,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:18,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:18,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:52:20,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 02:52:22,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:52:23,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:26,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=573026.6666666666, ans=0.125 2023-09-30 02:52:27,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 02:52:28,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:30,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=573093.3333333334, ans=0.0 2023-09-30 02:52:33,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:33,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:52:34,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:52:34,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 02:52:37,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 02:52:38,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:52:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:52:44,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:52:44,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:52:48,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 02:52:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 02:52:51,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 02:52:51,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:52:51,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:52:51,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 02:52:56,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 02:52:58,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:53:00,095 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=573160.0, ans=0.125 2023-09-30 02:53:01,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:03,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:03,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 02:53:03,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:03,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:53:03,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 02:53:08,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:53:10,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:53:14,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:16,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 02:53:16,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 02:53:20,120 INFO [train.py:1039] (3/4) Epoch 17, batch 1000, loss[loss=0.1904, simple_loss=0.2754, pruned_loss=0.05272, over 24617.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2579, pruned_loss=0.05519, over 4676861.08 frames. ], batch size: 68, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:53:20,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:53:23,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 02:53:23,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:53:27,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=573293.3333333334, ans=0.0 2023-09-30 02:53:28,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:53:30,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 02:53:30,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 02:53:32,098 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.898e+02 2.133e+02 2.497e+02 3.739e+02, threshold=4.265e+02, percent-clipped=0.0 2023-09-30 02:53:34,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=573293.3333333334, ans=0.125 2023-09-30 02:53:36,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:36,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:53:37,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:53:42,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 02:53:45,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 02:53:47,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 02:53:47,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:53:51,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 02:53:51,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=573360.0, ans=0.125 2023-09-30 02:53:52,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 02:53:53,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 02:53:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:53:55,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:03,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:05,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:54:05,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:06,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:06,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 02:54:06,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:08,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 02:54:08,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:54:08,624 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 02:54:12,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 02:54:14,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 02:54:17,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 02:54:18,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:54:27,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 02:54:27,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:27,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:54:28,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 02:54:30,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:54:30,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 02:54:32,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 02:54:33,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:54:33,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:54:34,223 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=573560.0, ans=0.125 2023-09-30 02:54:37,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:54:39,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:54:39,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:54:41,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=573560.0, ans=15.0 2023-09-30 02:54:44,360 INFO [train.py:1039] (3/4) Epoch 17, batch 1050, loss[loss=0.2024, simple_loss=0.2902, pruned_loss=0.05723, over 23910.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2564, pruned_loss=0.05461, over 4677177.08 frames. ], batch size: 80, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:54:44,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:54:44,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:54:46,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=573626.6666666666, ans=0.125 2023-09-30 02:54:48,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 02:54:49,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:54:51,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:54:52,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=573626.6666666666, ans=0.0 2023-09-30 02:54:55,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 02:54:57,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 02:54:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:55:01,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:55:01,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 02:55:02,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 02:55:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 02:55:04,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:05,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 02:55:08,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:55:08,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 02:55:08,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 02:55:15,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:55:17,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:55:17,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:55:21,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 02:55:21,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 02:55:21,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 02:55:24,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 02:55:28,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 02:55:28,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:55:30,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=573760.0, ans=0.1 2023-09-30 02:55:31,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 02:55:33,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 02:55:33,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:55:35,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:55:39,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 02:55:44,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 02:55:45,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 02:55:46,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 02:55:46,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:46,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 02:55:48,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 02:55:51,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:55:55,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 02:55:55,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:55:56,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:55:56,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:00,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 02:56:00,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=573893.3333333334, ans=0.0 2023-09-30 02:56:01,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 02:56:01,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 02:56:01,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 02:56:03,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:56:06,402 INFO [train.py:1039] (3/4) Epoch 17, batch 1100, loss[loss=0.1802, simple_loss=0.2609, pruned_loss=0.04975, over 24412.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2561, pruned_loss=0.05463, over 4700072.27 frames. ], batch size: 77, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:56:07,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:07,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=573960.0, ans=0.1 2023-09-30 02:56:13,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:56:18,396 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.563e+02 1.918e+02 2.180e+02 2.425e+02 3.491e+02, threshold=4.360e+02, percent-clipped=0.0 2023-09-30 02:56:20,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 02:56:20,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 02:56:20,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:20,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=573960.0, ans=0.125 2023-09-30 02:56:21,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 02:56:23,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:56:25,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 02:56:28,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:56:31,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 02:56:31,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 02:56:35,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 02:56:36,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:56:36,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 02:56:38,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:56:40,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-09-30 02:56:40,791 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.80 vs. limit=6.0 2023-09-30 02:56:42,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 02:56:48,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:56:50,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 02:56:51,448 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 02:56:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:55,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:56:55,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:56:57,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:58,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 02:56:58,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:56:58,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 02:56:58,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:56:58,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:56:59,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:56:59,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 02:57:06,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 02:57:06,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 02:57:09,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 02:57:09,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=574160.0, ans=0.125 2023-09-30 02:57:13,186 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.36 vs. limit=10.0 2023-09-30 02:57:14,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 02:57:18,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 02:57:18,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 02:57:18,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:21,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:22,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:22,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 02:57:24,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 02:57:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:57:25,987 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=574226.6666666666, ans=0.125 2023-09-30 02:57:27,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 02:57:27,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 02:57:27,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 02:57:29,005 INFO [train.py:1039] (3/4) Epoch 17, batch 1150, loss[loss=0.2191, simple_loss=0.2717, pruned_loss=0.08323, over 19588.00 frames. ], tot_loss[loss=0.1833, simple_loss=0.2571, pruned_loss=0.05478, over 4699295.41 frames. ], batch size: 388, lr: 6.13e-03, grad_scale: 8.0 2023-09-30 02:57:29,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:57:29,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:57:30,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:57:36,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:39,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 02:57:42,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:57:42,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:57:44,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 02:57:44,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:57:47,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 02:57:48,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:57:48,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:57:50,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=574360.0, ans=0.125 2023-09-30 02:57:50,887 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=574360.0, ans=0.1 2023-09-30 02:57:52,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 02:57:55,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:57:59,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:58:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:01,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 02:58:01,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 02:58:01,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:58:04,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 02:58:04,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:58:05,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:58:07,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=574426.6666666666, ans=0.0 2023-09-30 02:58:12,916 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=574426.6666666666, ans=0.0 2023-09-30 02:58:17,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:22,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 02:58:24,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 02:58:24,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:24,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 02:58:33,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:58:42,130 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 02:58:46,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:58:48,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 02:58:48,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 02:58:48,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 02:58:51,675 INFO [train.py:1039] (3/4) Epoch 17, batch 1200, loss[loss=0.1998, simple_loss=0.2628, pruned_loss=0.06833, over 23372.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.2581, pruned_loss=0.05503, over 4700252.01 frames. ], batch size: 285, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 02:58:51,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:58:57,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 02:58:59,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 02:59:00,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:00,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:01,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 02:59:01,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.26 vs. limit=10.0 2023-09-30 02:59:02,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=574626.6666666666, ans=0.1 2023-09-30 02:59:03,932 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.927e+02 2.114e+02 2.514e+02 4.321e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 02:59:04,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 02:59:05,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 02:59:07,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:07,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:11,707 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 02:59:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 02:59:17,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 02:59:18,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=574693.3333333334, ans=0.0 2023-09-30 02:59:20,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 02:59:21,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:25,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 02:59:25,296 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 02:59:25,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:31,798 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:59:35,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 02:59:35,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 02:59:35,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 02:59:37,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 02:59:40,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 02:59:43,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 02:59:45,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 02:59:45,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 02:59:46,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:46,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 02:59:48,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 02:59:48,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 02:59:50,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 02:59:50,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 02:59:50,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 02:59:51,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 02:59:51,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 02:59:53,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 02:59:53,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 02:59:55,262 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 02:59:58,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 02:59:59,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=574893.3333333334, ans=0.125 2023-09-30 03:00:01,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:00:06,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 03:00:10,310 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 03:00:13,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:14,777 INFO [train.py:1039] (3/4) Epoch 17, batch 1250, loss[loss=0.1925, simple_loss=0.2616, pruned_loss=0.06167, over 23231.00 frames. ], tot_loss[loss=0.1853, simple_loss=0.2595, pruned_loss=0.05548, over 4703405.06 frames. ], batch size: 105, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:00:14,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:00:15,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:00:16,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:00:21,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 03:00:23,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=574960.0, ans=0.125 2023-09-30 03:00:24,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:00:26,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:27,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 03:00:30,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:00:32,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:00:36,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:00:37,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:00:37,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:00:37,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:40,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:00:45,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:00:46,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:00:46,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:00:47,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:00:48,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:00:51,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:00:52,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:00:57,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 03:00:59,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:01:02,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:02,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 03:01:02,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:01:02,501 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 03:01:04,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:04,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:07,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:07,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=575160.0, ans=0.125 2023-09-30 03:01:12,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:01:12,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:01:13,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 03:01:15,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 03:01:15,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 03:01:18,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:21,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 03:01:21,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:23,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:01:23,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:01:24,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 03:01:24,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:01:24,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:01:24,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:01:26,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:01:27,337 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.06 vs. limit=5.0 2023-09-30 03:01:27,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 03:01:30,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:33,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:01:33,887 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.62 vs. limit=15.0 2023-09-30 03:01:34,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:01:37,369 INFO [train.py:1039] (3/4) Epoch 17, batch 1300, loss[loss=0.1785, simple_loss=0.2397, pruned_loss=0.05864, over 22746.00 frames. ], tot_loss[loss=0.1861, simple_loss=0.2602, pruned_loss=0.05596, over 4701452.13 frames. ], batch size: 322, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:01:38,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:01:42,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:01:42,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 03:01:44,843 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.74 vs. limit=15.0 2023-09-30 03:01:48,566 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.803e+02 1.977e+02 2.132e+02 2.913e+02, threshold=3.954e+02, percent-clipped=0.0 2023-09-30 03:01:48,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:01:50,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:01:50,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:01:51,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=575293.3333333334, ans=0.125 2023-09-30 03:01:53,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=575360.0, ans=0.0 2023-09-30 03:01:54,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:01:54,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:01:54,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 03:01:59,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:02:00,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:02:02,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 03:02:05,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:02:09,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:10,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:12,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:02:15,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:02:15,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:02:16,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:02:16,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 03:02:22,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:02:23,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:02:24,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 03:02:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:02:26,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:02:29,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:02:31,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 03:02:31,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:32,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 03:02:34,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:02:37,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:02:37,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:02:42,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 03:02:42,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 03:02:44,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 03:02:46,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=575560.0, ans=0.0 2023-09-30 03:02:48,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=575560.0, ans=0.125 2023-09-30 03:02:49,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:02:52,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 03:02:52,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:01,428 INFO [train.py:1039] (3/4) Epoch 17, batch 1350, loss[loss=0.1906, simple_loss=0.2565, pruned_loss=0.0623, over 23872.00 frames. ], tot_loss[loss=0.1855, simple_loss=0.2596, pruned_loss=0.0557, over 4700089.24 frames. ], batch size: 195, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:03:03,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 03:03:03,520 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=575626.6666666666, ans=0.125 2023-09-30 03:03:07,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:08,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=575626.6666666666, ans=0.125 2023-09-30 03:03:09,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:11,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:03:11,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:03:14,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:03:14,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:03:20,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 03:03:22,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:03:23,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:03:26,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 03:03:27,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:03:27,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:03:27,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 03:03:29,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 03:03:31,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 03:03:34,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:34,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 03:03:38,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=575760.0, ans=0.125 2023-09-30 03:03:46,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:03:56,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:03:56,893 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.60 vs. limit=10.0 2023-09-30 03:03:57,097 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.06 vs. limit=22.5 2023-09-30 03:03:58,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 03:04:01,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:02,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 03:04:02,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:04:04,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:04:06,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:04:10,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 03:04:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:04:17,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 03:04:19,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 03:04:24,863 INFO [train.py:1039] (3/4) Epoch 17, batch 1400, loss[loss=0.1669, simple_loss=0.2326, pruned_loss=0.05054, over 23524.00 frames. ], tot_loss[loss=0.1841, simple_loss=0.258, pruned_loss=0.05509, over 4700375.24 frames. ], batch size: 256, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:04:24,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 03:04:25,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=575960.0, ans=0.0 2023-09-30 03:04:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:04:29,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:04:31,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:04:33,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 03:04:35,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 03:04:36,885 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.561e+02 1.880e+02 2.143e+02 2.482e+02 5.482e+02, threshold=4.285e+02, percent-clipped=2.0 2023-09-30 03:04:43,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=576026.6666666666, ans=0.1 2023-09-30 03:04:48,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:04:50,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:04:52,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:04:52,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:04:55,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:04:57,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 03:05:00,834 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.94 vs. limit=10.0 2023-09-30 03:05:06,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:08,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:11,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 03:05:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:05:13,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:05:13,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:05:15,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:15,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:05:17,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:05:17,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:05:17,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 03:05:17,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:05:22,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:25,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=576160.0, ans=0.1 2023-09-30 03:05:27,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:05:35,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=576226.6666666666, ans=0.0 2023-09-30 03:05:36,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 03:05:37,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:05:37,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:05:41,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 03:05:43,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:44,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.67 vs. limit=15.0 2023-09-30 03:05:44,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:05:46,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:05:48,363 INFO [train.py:1039] (3/4) Epoch 17, batch 1450, loss[loss=0.1831, simple_loss=0.2692, pruned_loss=0.04849, over 24561.00 frames. ], tot_loss[loss=0.1832, simple_loss=0.2575, pruned_loss=0.05449, over 4702450.59 frames. ], batch size: 71, lr: 6.12e-03, grad_scale: 16.0 2023-09-30 03:05:49,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:05:49,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:05:50,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.86 vs. limit=10.0 2023-09-30 03:05:50,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 03:05:55,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:05:56,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:05:58,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:05:58,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 03:05:59,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:06:02,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 03:06:02,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:05,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:05,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 03:06:05,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:06,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:06:06,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 03:06:08,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:08,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:06:08,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=576360.0, ans=0.2 2023-09-30 03:06:11,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:12,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:17,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:06:17,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:06:19,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:06:21,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:22,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:06:22,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:06:24,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:06:24,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:25,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=576426.6666666666, ans=0.125 2023-09-30 03:06:28,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 03:06:31,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:06:31,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=576426.6666666666, ans=0.1 2023-09-30 03:06:35,265 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 03:06:35,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:37,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:06:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:41,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 03:06:44,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:46,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 03:06:46,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 03:06:47,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:06:51,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:06:51,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:06:53,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 03:06:56,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 03:06:56,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 03:06:58,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:06:58,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:07:10,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=576626.6666666666, ans=0.07 2023-09-30 03:07:11,609 INFO [train.py:1039] (3/4) Epoch 17, batch 1500, loss[loss=0.1662, simple_loss=0.2341, pruned_loss=0.04914, over 23576.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2577, pruned_loss=0.05426, over 4715698.09 frames. ], batch size: 256, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:07:11,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 03:07:11,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:07:11,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:07:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:14,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:07:16,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 03:07:16,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:07:16,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:07:18,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:07:19,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:07:21,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:07:21,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:22,947 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.695e+02 1.876e+02 2.186e+02 2.555e+02 3.680e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 03:07:26,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:07:26,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 03:07:27,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:07:27,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:07:29,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:35,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 03:07:39,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 03:07:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:07:42,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 03:07:45,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:07:48,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:07:49,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:07:49,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:07:51,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 03:07:52,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.35 vs. limit=22.5 2023-09-30 03:07:52,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:07:54,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:07:54,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 03:07:54,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:08:01,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:08:01,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 03:08:03,404 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:08:07,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:08:09,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:08:13,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=576826.6666666666, ans=0.125 2023-09-30 03:08:14,808 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 03:08:14,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 03:08:17,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:19,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:08:21,493 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 03:08:21,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:08:24,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 03:08:26,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:26,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=576893.3333333334, ans=0.09899494936611666 2023-09-30 03:08:26,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=576893.3333333334, ans=0.125 2023-09-30 03:08:29,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:30,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:30,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:08:30,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:08:32,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:08:34,395 INFO [train.py:1039] (3/4) Epoch 17, batch 1550, loss[loss=0.1888, simple_loss=0.2715, pruned_loss=0.05307, over 24665.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2584, pruned_loss=0.05448, over 4715272.14 frames. ], batch size: 68, lr: 6.11e-03, grad_scale: 16.0 2023-09-30 03:08:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 03:08:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 03:08:36,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:08:36,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 03:08:37,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 03:08:39,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:40,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:41,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=576960.0, ans=0.0 2023-09-30 03:08:42,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:08:42,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:08:43,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:45,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:08:49,567 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 03:08:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:49,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:08:51,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:08:53,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:08:53,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 03:08:56,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:08:56,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 03:08:58,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 03:08:58,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 03:08:58,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:08:59,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:02,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:09:05,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 03:09:05,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 03:09:16,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:18,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:09:19,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:09:19,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:09:19,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 03:09:25,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:09:27,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:31,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:09:34,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:09:34,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:09:34,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 03:09:34,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:35,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten.whitening_limit, batch_count=577160.0, ans=15.0 2023-09-30 03:09:36,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:09:36,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=577160.0, ans=0.1 2023-09-30 03:09:38,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:38,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:09:38,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 03:09:41,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:09:41,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=577226.6666666666, ans=0.05 2023-09-30 03:09:41,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=577226.6666666666, ans=0.0 2023-09-30 03:09:47,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 03:09:52,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:54,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:09:54,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 03:09:55,698 INFO [train.py:1039] (3/4) Epoch 17, batch 1600, loss[loss=0.1813, simple_loss=0.2478, pruned_loss=0.05747, over 23703.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2587, pruned_loss=0.05456, over 4714498.82 frames. ], batch size: 149, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:09:55,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:09:56,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:09:56,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:09:56,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:09:57,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:10:01,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:01,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=577293.3333333334, ans=0.1 2023-09-30 03:10:02,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 03:10:04,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 03:10:04,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=577293.3333333334, ans=0.0 2023-09-30 03:10:06,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 03:10:06,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:07,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.07 vs. limit=15.0 2023-09-30 03:10:07,919 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.883e+02 2.101e+02 2.421e+02 4.828e+02, threshold=4.202e+02, percent-clipped=4.0 2023-09-30 03:10:08,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 03:10:09,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:10:11,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:10:16,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:10:17,320 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.71 vs. limit=10.0 2023-09-30 03:10:19,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 03:10:20,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=577360.0, ans=0.2 2023-09-30 03:10:22,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:10:22,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 03:10:24,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:24,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 03:10:31,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 03:10:40,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:40,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 03:10:42,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:10:42,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:10:42,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:10:44,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=577493.3333333334, ans=0.0 2023-09-30 03:10:45,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:10:47,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=577493.3333333334, ans=0.0 2023-09-30 03:10:49,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=577493.3333333334, ans=0.1 2023-09-30 03:10:50,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:10:53,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:10:53,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:10:55,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:10:58,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:10:58,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:11:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:11:07,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:09,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:11:11,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 03:11:11,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:11:12,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 03:11:18,147 INFO [train.py:1039] (3/4) Epoch 17, batch 1650, loss[loss=0.1918, simple_loss=0.2681, pruned_loss=0.05772, over 23499.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2588, pruned_loss=0.0546, over 4714916.92 frames. ], batch size: 93, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:11:18,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:18,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=577626.6666666666, ans=0.0 2023-09-30 03:11:21,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:11:21,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:11:21,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 03:11:23,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 03:11:23,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 03:11:23,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 03:11:25,743 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.87 vs. limit=15.0 2023-09-30 03:11:27,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:11:29,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:29,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:11:29,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:11:31,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:11:32,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 03:11:34,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=577693.3333333334, ans=0.0 2023-09-30 03:11:36,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:11:36,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:11:36,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:11:36,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:11:37,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 03:11:37,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 03:11:44,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=577693.3333333334, ans=0.2 2023-09-30 03:11:45,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:11:48,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:11:58,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 03:11:59,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:01,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 03:12:04,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:07,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:12:07,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:12:07,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:08,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:12:08,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:11,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:11,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:13,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:15,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:15,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:12:19,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:12:19,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 03:12:21,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:12:22,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 03:12:22,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 03:12:22,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 03:12:22,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:12:24,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:12:24,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:24,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:12:24,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 03:12:29,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:12:31,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:12:31,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:32,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 03:12:32,850 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:12:38,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:12:38,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:12:39,906 INFO [train.py:1039] (3/4) Epoch 17, batch 1700, loss[loss=0.1945, simple_loss=0.2728, pruned_loss=0.05805, over 23532.00 frames. ], tot_loss[loss=0.1847, simple_loss=0.2585, pruned_loss=0.05541, over 4698270.94 frames. ], batch size: 93, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:12:39,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 03:12:40,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:12:40,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:12:40,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:12:43,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:12:43,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:12:44,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 03:12:44,245 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=577960.0, ans=0.2 2023-09-30 03:12:48,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:12:51,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 1.846e+02 2.021e+02 2.222e+02 3.253e+02, threshold=4.041e+02, percent-clipped=0.0 2023-09-30 03:12:56,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:00,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:13:07,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:13:07,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:08,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:13:08,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:10,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 03:13:10,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=578026.6666666666, ans=0.1 2023-09-30 03:13:11,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:13:13,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:13,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:13:14,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:13:17,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 03:13:17,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 03:13:20,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:21,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=578093.3333333334, ans=0.0 2023-09-30 03:13:23,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 03:13:24,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:13:31,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:35,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:13:37,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:13:37,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 03:13:38,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:13:40,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:40,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 03:13:41,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:13:41,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:41,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:13:41,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:13:43,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:13:43,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:13:45,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:13:45,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:13:45,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=578226.6666666666, ans=0.0 2023-09-30 03:13:46,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:53,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:53,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 03:13:54,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:13:56,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:13:58,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 03:14:02,728 INFO [train.py:1039] (3/4) Epoch 17, batch 1750, loss[loss=0.1708, simple_loss=0.2139, pruned_loss=0.06383, over 19041.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2569, pruned_loss=0.05523, over 4689317.72 frames. ], batch size: 389, lr: 6.11e-03, grad_scale: 32.0 2023-09-30 03:14:04,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:06,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:06,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:14:08,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 03:14:08,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=578293.3333333334, ans=0.0 2023-09-30 03:14:09,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:14:12,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=578293.3333333334, ans=0.0 2023-09-30 03:14:13,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:14:14,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:15,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=578293.3333333334, ans=0.125 2023-09-30 03:14:17,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 03:14:20,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:14:24,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 03:14:24,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:27,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:14:30,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:14:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 03:14:31,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:14:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 03:14:41,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:14:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:14:43,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:48,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:48,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:14:48,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=578426.6666666666, ans=0.0 2023-09-30 03:14:50,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:14:51,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:14:54,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:14:54,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:14:56,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 03:14:59,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:15:00,187 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=578493.3333333334, ans=0.125 2023-09-30 03:15:02,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 03:15:04,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:04,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:05,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:15:08,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=578560.0, ans=0.125 2023-09-30 03:15:09,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:15:09,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 03:15:11,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:12,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:15:12,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=578560.0, ans=0.125 2023-09-30 03:15:17,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:15:20,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:21,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:15:23,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 03:15:23,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:24,643 INFO [train.py:1039] (3/4) Epoch 17, batch 1800, loss[loss=0.187, simple_loss=0.2565, pruned_loss=0.0588, over 23792.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2567, pruned_loss=0.0548, over 4707108.00 frames. ], batch size: 212, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:15:24,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=578626.6666666666, ans=0.125 2023-09-30 03:15:26,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:15:26,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:26,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:15:26,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:15:27,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:15:30,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:15:30,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:15:31,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=578626.6666666666, ans=0.125 2023-09-30 03:15:32,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:15:35,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:15:36,409 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.847e+02 2.029e+02 2.247e+02 3.215e+02, threshold=4.058e+02, percent-clipped=0.0 2023-09-30 03:15:36,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:15:39,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:15:41,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:15:44,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:44,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:15:45,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:15:48,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:15:48,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 03:15:50,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:54,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:15:57,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 03:16:01,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 03:16:01,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 03:16:01,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:02,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:16:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:04,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:16:07,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=578760.0, ans=0.1 2023-09-30 03:16:11,399 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 03:16:12,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:16:14,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:16,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 03:16:17,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 03:16:17,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:16:18,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:16:20,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:16:26,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 03:16:31,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:16:31,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 03:16:32,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:16:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:32,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:16:33,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 03:16:37,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:16:37,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:16:39,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 03:16:39,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:16:41,673 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=578893.3333333334, ans=0.1 2023-09-30 03:16:42,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:42,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:16:42,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:43,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=578893.3333333334, ans=0.2 2023-09-30 03:16:44,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:16:44,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=578893.3333333334, ans=0.0 2023-09-30 03:16:45,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:16:47,386 INFO [train.py:1039] (3/4) Epoch 17, batch 1850, loss[loss=0.1881, simple_loss=0.2725, pruned_loss=0.05185, over 24065.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2569, pruned_loss=0.05497, over 4697599.81 frames. ], batch size: 80, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:16:47,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:16:47,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:16:50,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:16:52,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:17:00,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=578960.0, ans=0.0 2023-09-30 03:17:00,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=578960.0, ans=0.125 2023-09-30 03:17:01,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:17:01,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 03:17:06,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 03:17:08,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 03:17:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:12,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 03:17:12,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 03:17:23,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:17:24,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 03:17:27,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:17:29,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:17:32,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 03:17:34,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:34,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:17:36,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:17:38,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:17:41,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:17:45,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:17:45,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:17:45,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:17:45,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:17:47,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=579160.0, ans=0.125 2023-09-30 03:17:48,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:48,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:17:54,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 03:17:54,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:17:57,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:17:57,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:17:57,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 03:17:57,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 03:18:00,350 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 03:18:00,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 03:18:00,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=579226.6666666666, ans=0.0 2023-09-30 03:18:02,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:18:02,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:18:03,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:03,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:03,547 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 03:18:03,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:18:05,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:07,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:18:09,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:18:10,970 INFO [train.py:1039] (3/4) Epoch 17, batch 1900, loss[loss=0.1957, simple_loss=0.2679, pruned_loss=0.0618, over 23415.00 frames. ], tot_loss[loss=0.1839, simple_loss=0.2579, pruned_loss=0.05502, over 4697837.26 frames. ], batch size: 285, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:18:11,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:18:11,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 03:18:14,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:14,159 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 03:18:14,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:18:15,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:20,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:18:23,279 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.816e+02 1.988e+02 2.233e+02 2.900e+02, threshold=3.976e+02, percent-clipped=0.0 2023-09-30 03:18:23,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:18:24,963 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 03:18:26,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 03:18:27,640 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.53 vs. limit=15.0 2023-09-30 03:18:28,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:18:30,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:18:30,151 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 03:18:30,242 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 03:18:33,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 03:18:35,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:18:40,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 03:18:42,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 03:18:52,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 03:18:55,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 03:18:55,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:18:55,159 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 03:18:56,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 03:18:56,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 03:18:56,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 03:18:56,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:01,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 03:19:02,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.69 vs. limit=10.0 2023-09-30 03:19:05,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:19:08,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:08,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 03:19:11,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:19:11,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.62 vs. limit=15.0 2023-09-30 03:19:14,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 03:19:14,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:23,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:19:23,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:19:23,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:19:24,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:19:26,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:19:26,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:19:27,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:19:28,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=579560.0, ans=0.125 2023-09-30 03:19:29,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:29,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:19:29,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=579560.0, ans=0.1 2023-09-30 03:19:32,267 INFO [train.py:1039] (3/4) Epoch 17, batch 1950, loss[loss=0.1679, simple_loss=0.2575, pruned_loss=0.03918, over 24661.00 frames. ], tot_loss[loss=0.1837, simple_loss=0.2581, pruned_loss=0.05459, over 4709123.79 frames. ], batch size: 68, lr: 6.10e-03, grad_scale: 16.0 2023-09-30 03:19:32,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:19:32,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:19:32,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:19:34,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:19:37,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:40,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:19:40,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:41,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:19:42,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 03:19:42,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:19:44,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:45,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:19:48,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:19:48,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:19:50,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:19:52,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:19:55,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:19:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:19:55,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:19:55,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:00,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:03,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:20:03,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:03,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:20:03,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 03:20:04,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:20:05,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:20:06,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:11,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:14,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:20:19,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:20:20,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:20:22,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:20:22,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 03:20:22,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:20:27,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:20:28,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:20:30,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:38,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:38,903 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=579893.3333333334, ans=0.125 2023-09-30 03:20:40,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:43,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:20:44,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:46,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=579893.3333333334, ans=0.125 2023-09-30 03:20:48,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:20:48,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:20:48,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 03:20:48,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:20:48,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=579893.3333333334, ans=0.1 2023-09-30 03:20:50,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:20:51,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 03:20:51,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=579960.0, ans=0.1 2023-09-30 03:20:52,936 INFO [train.py:1039] (3/4) Epoch 17, batch 2000, loss[loss=0.1615, simple_loss=0.2331, pruned_loss=0.04493, over 24309.00 frames. ], tot_loss[loss=0.1851, simple_loss=0.2596, pruned_loss=0.05532, over 4716342.82 frames. ], batch size: 56, lr: 6.10e-03, grad_scale: 32.0 2023-09-30 03:20:54,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:20:57,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:20:59,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:20:59,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:01,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:21:03,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:07,542 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.864e+02 2.151e+02 2.440e+02 3.319e+02, threshold=4.303e+02, percent-clipped=0.0 2023-09-30 03:21:07,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 03:21:07,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:21:12,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:21:13,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 03:21:13,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:21:13,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:21:14,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.87 vs. limit=22.5 2023-09-30 03:21:18,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:21:19,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 03:21:22,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:25,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:26,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 03:21:26,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:21:28,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 03:21:28,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:33,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:21:33,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:21:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:33,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:33,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=580093.3333333334, ans=0.1 2023-09-30 03:21:36,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:36,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 03:21:40,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 03:21:40,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:21:40,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:21:45,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:47,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:21:47,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:48,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:21:48,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:21:48,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=580160.0, ans=0.0 2023-09-30 03:21:50,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:50,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:21:50,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:21:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:21:55,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:21:57,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 03:22:03,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:22:04,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:08,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:22:12,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:15,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:15,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:17,181 INFO [train.py:1039] (3/4) Epoch 17, batch 2050, loss[loss=0.196, simple_loss=0.2415, pruned_loss=0.07522, over 19466.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2585, pruned_loss=0.05476, over 4721140.76 frames. ], batch size: 388, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:22:17,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:22:17,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:22:18,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:20,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:22,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:22:23,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:25,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=580293.3333333334, ans=0.2 2023-09-30 03:22:28,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:22:30,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=580293.3333333334, ans=0.0 2023-09-30 03:22:31,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:22:31,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:22:33,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:22:33,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 03:22:34,332 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.48 vs. limit=22.5 2023-09-30 03:22:35,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:22:35,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:22:36,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:22:45,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:45,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:47,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=580360.0, ans=0.125 2023-09-30 03:22:49,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 03:22:52,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:22:52,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=580426.6666666666, ans=0.2 2023-09-30 03:22:53,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 03:22:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:22:57,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:22:58,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:00,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:23:00,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:23:01,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:23:03,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:23:05,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:23:06,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:08,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:23:12,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:23:13,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:23:13,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=580493.3333333334, ans=0.1 2023-09-30 03:23:18,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:22,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:23:23,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 03:23:24,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=580560.0, ans=0.125 2023-09-30 03:23:27,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=580560.0, ans=0.0 2023-09-30 03:23:30,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:30,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:23:30,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=580560.0, ans=0.125 2023-09-30 03:23:33,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:23:33,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=580560.0, ans=0.125 2023-09-30 03:23:35,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 03:23:40,064 INFO [train.py:1039] (3/4) Epoch 17, batch 2100, loss[loss=0.195, simple_loss=0.2739, pruned_loss=0.05802, over 24476.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2575, pruned_loss=0.05435, over 4717385.88 frames. ], batch size: 77, lr: 6.09e-03, grad_scale: 32.0 2023-09-30 03:23:40,323 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 03:23:40,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:40,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:23:41,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:23:42,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:23:42,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 03:23:43,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 03:23:44,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:23:48,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:23:48,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:23:51,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:23:51,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:23:51,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 03:23:53,681 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.830e+02 2.027e+02 2.301e+02 3.593e+02, threshold=4.054e+02, percent-clipped=0.0 2023-09-30 03:23:53,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:23:54,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 03:23:54,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 03:23:55,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:23:55,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:23:55,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 03:23:55,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=580693.3333333334, ans=0.0 2023-09-30 03:23:57,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 03:23:57,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=580693.3333333334, ans=0.0 2023-09-30 03:24:04,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 03:24:04,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:24:07,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:08,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:24:11,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:24:11,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 03:24:13,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:13,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 03:24:13,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 03:24:15,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:15,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 03:24:15,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 03:24:15,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 03:24:16,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:24:19,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:24:21,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:23,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:24:24,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:26,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:26,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 03:24:28,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:28,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:29,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:29,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 03:24:32,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 03:24:33,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 03:24:34,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=580826.6666666666, ans=0.2 2023-09-30 03:24:35,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=580826.6666666666, ans=0.0 2023-09-30 03:24:36,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:24:37,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=580826.6666666666, ans=0.2 2023-09-30 03:24:39,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:24:39,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 03:24:40,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=580826.6666666666, ans=0.125 2023-09-30 03:24:40,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=580826.6666666666, ans=0.04949747468305833 2023-09-30 03:24:47,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:48,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:24:50,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:24:50,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:24:50,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 03:24:51,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:24:51,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:24:53,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:24:53,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:24:53,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:24:54,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 03:24:56,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 03:24:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:24:58,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:24:58,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:25:00,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:25:00,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:25:03,747 INFO [train.py:1039] (3/4) Epoch 17, batch 2150, loss[loss=0.1813, simple_loss=0.2567, pruned_loss=0.05296, over 23424.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.256, pruned_loss=0.0541, over 4714630.87 frames. ], batch size: 134, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:25:04,571 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.35 vs. limit=22.5 2023-09-30 03:25:05,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 03:25:07,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:07,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=580960.0, ans=0.125 2023-09-30 03:25:08,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:08,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:25:08,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:10,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:25:13,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:15,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:25:15,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:25:16,446 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.09 vs. limit=12.0 2023-09-30 03:25:20,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:20,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 03:25:20,642 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:25:21,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=15.0 2023-09-30 03:25:23,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:24,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:25:26,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:26,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:26,879 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:28,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:25:28,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:25:28,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:25:29,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:25:31,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 03:25:33,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:25:33,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=581026.6666666666, ans=0.125 2023-09-30 03:25:34,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:34,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:34,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:25:35,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:25:38,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:25:40,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:25:41,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:25:41,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 03:25:43,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 03:25:46,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:46,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:48,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:25:49,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:25:50,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:52,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:25:52,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 03:25:53,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 03:25:53,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:25:55,194 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 03:25:55,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:25:55,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:25:56,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 03:25:56,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:25:56,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 03:25:56,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 03:25:56,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 03:25:56,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 03:25:57,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=581160.0, ans=0.0 2023-09-30 03:25:59,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:01,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:26:01,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:26:02,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:03,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:26:06,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:06,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:16,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:26:16,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 03:26:19,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:26:24,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:25,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:26:26,987 INFO [train.py:1039] (3/4) Epoch 17, batch 2200, loss[loss=0.1675, simple_loss=0.2502, pruned_loss=0.04243, over 24480.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2566, pruned_loss=0.05449, over 4713703.20 frames. ], batch size: 66, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:26:27,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:26:27,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:26:28,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=581293.3333333334, ans=0.0 2023-09-30 03:26:30,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:26:30,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:26:30,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 03:26:35,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=581293.3333333334, ans=0.125 2023-09-30 03:26:36,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 03:26:39,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:26:41,770 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.818e+02 1.974e+02 2.282e+02 3.535e+02, threshold=3.948e+02, percent-clipped=0.0 2023-09-30 03:26:47,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 03:26:50,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:26:51,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:26:51,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:26:55,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:26:56,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 03:27:01,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:27:01,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:03,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 03:27:05,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:27:07,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:08,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:27:10,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:13,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 03:27:14,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:16,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 03:27:19,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:19,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:27:19,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:27:21,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:27:23,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:27:23,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:23,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:27:23,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=581493.3333333334, ans=0.1 2023-09-30 03:27:26,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:27:26,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:27:28,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:27:32,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 03:27:32,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:27:36,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:27:37,385 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.93 vs. limit=15.0 2023-09-30 03:27:37,927 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 03:27:40,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:27:40,277 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 03:27:41,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:27:41,943 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 03:27:44,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:44,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:27:46,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:27:47,840 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 03:27:49,189 INFO [train.py:1039] (3/4) Epoch 17, batch 2250, loss[loss=0.1808, simple_loss=0.263, pruned_loss=0.04935, over 24309.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2568, pruned_loss=0.05451, over 4716545.03 frames. ], batch size: 61, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:27:50,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:27:54,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:27:58,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:28:01,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:28:03,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:04,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:05,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:28:07,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 03:28:07,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:07,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:28:11,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 03:28:11,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:28:11,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten.whitening_limit, batch_count=581693.3333333334, ans=22.5 2023-09-30 03:28:12,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:15,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:28:22,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:23,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:28:23,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:28:25,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 03:28:25,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:28:29,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:28:32,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:34,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:28:34,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=581760.0, ans=0.04949747468305833 2023-09-30 03:28:35,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:28:35,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:28:37,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:28:40,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:28:45,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:28:50,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 03:28:55,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:28:55,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:28:58,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:28:58,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=581893.3333333334, ans=0.125 2023-09-30 03:29:03,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:29:05,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:29:05,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 03:29:06,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:06,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:29:09,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 03:29:11,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:29:11,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:12,837 INFO [train.py:1039] (3/4) Epoch 17, batch 2300, loss[loss=0.2501, simple_loss=0.3039, pruned_loss=0.09818, over 19320.00 frames. ], tot_loss[loss=0.1845, simple_loss=0.2589, pruned_loss=0.05506, over 4710238.48 frames. ], batch size: 388, lr: 6.09e-03, grad_scale: 16.0 2023-09-30 03:29:19,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.47 vs. limit=15.0 2023-09-30 03:29:19,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:29:19,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:29:21,334 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 03:29:22,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:27,888 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.573e+02 1.892e+02 2.152e+02 2.503e+02 3.822e+02, threshold=4.305e+02, percent-clipped=0.0 2023-09-30 03:29:28,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:29:29,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:29:29,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:29:29,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:29,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 03:29:31,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:29:32,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:29:34,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:29:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:29:42,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:29:46,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:29:46,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=582093.3333333334, ans=0.125 2023-09-30 03:29:51,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:29:53,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:29:57,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:30:00,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:04,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:30:04,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:30:06,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:30:06,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 03:30:07,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=582160.0, ans=0.95 2023-09-30 03:30:11,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:30:11,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:12,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:12,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:30:12,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:14,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:30:14,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:30:14,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 03:30:14,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:30:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:15,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 03:30:23,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:30:25,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:30:27,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=582226.6666666666, ans=0.015 2023-09-30 03:30:29,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:30:29,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:30:30,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:30:33,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:30:33,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:30:33,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:30:35,039 INFO [train.py:1039] (3/4) Epoch 17, batch 2350, loss[loss=0.2078, simple_loss=0.2807, pruned_loss=0.0674, over 23422.00 frames. ], tot_loss[loss=0.1857, simple_loss=0.2597, pruned_loss=0.05588, over 4699657.32 frames. ], batch size: 93, lr: 6.08e-03, grad_scale: 16.0 2023-09-30 03:30:35,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 03:30:40,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:30:41,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 03:30:47,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 03:30:50,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:30:54,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:30:54,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:30:54,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:30:56,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 03:30:56,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=582360.0, ans=0.2 2023-09-30 03:30:58,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=582360.0, ans=0.0 2023-09-30 03:31:00,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:31:08,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 03:31:09,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:31:12,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:31:12,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:31:14,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:31:16,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 03:31:18,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:31:20,708 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=8.09 vs. limit=15.0 2023-09-30 03:31:21,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:31:21,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:31:21,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=582426.6666666666, ans=0.125 2023-09-30 03:31:24,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:31:26,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 03:31:26,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:31:29,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:31:29,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:31:31,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 03:31:32,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:31:36,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 03:31:36,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:31:41,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=582560.0, ans=0.0 2023-09-30 03:31:42,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 03:31:47,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 03:31:47,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:31:47,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 03:31:49,232 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 03:31:49,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 03:31:51,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 03:31:53,735 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.76 vs. limit=10.0 2023-09-30 03:31:54,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:31:57,603 INFO [train.py:1039] (3/4) Epoch 17, batch 2400, loss[loss=0.1779, simple_loss=0.2425, pruned_loss=0.05667, over 23215.00 frames. ], tot_loss[loss=0.1852, simple_loss=0.2586, pruned_loss=0.05589, over 4689223.14 frames. ], batch size: 119, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:31:57,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:32:01,793 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=582626.6666666666, ans=0.1 2023-09-30 03:32:02,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:32:03,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=582626.6666666666, ans=0.1 2023-09-30 03:32:04,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:32:06,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 03:32:06,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 03:32:09,966 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.78 vs. limit=22.5 2023-09-30 03:32:13,859 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.898e+02 2.064e+02 2.332e+02 3.496e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 03:32:14,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:32:14,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:32:17,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 03:32:18,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:32:18,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:18,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 03:32:26,081 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=4.89 vs. limit=15.0 2023-09-30 03:32:27,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:27,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.43 vs. limit=15.0 2023-09-30 03:32:28,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 03:32:35,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:32:36,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.87 vs. limit=22.5 2023-09-30 03:32:37,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=582760.0, ans=0.125 2023-09-30 03:32:38,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 03:32:42,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:32:42,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=582760.0, ans=0.125 2023-09-30 03:32:43,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:32:48,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:32:48,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 03:32:50,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:32:53,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=582826.6666666666, ans=0.0 2023-09-30 03:32:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:32:59,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:03,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:03,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:33:03,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 03:33:03,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:33:03,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=582893.3333333334, ans=0.125 2023-09-30 03:33:04,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:04,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:05,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:33:05,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=582893.3333333334, ans=0.0 2023-09-30 03:33:11,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:11,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:33:12,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 03:33:13,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 03:33:16,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:33:16,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:33:16,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 03:33:18,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 03:33:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 03:33:18,538 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 03:33:19,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 03:33:20,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:33:21,461 INFO [train.py:1039] (3/4) Epoch 17, batch 2450, loss[loss=0.1746, simple_loss=0.2384, pruned_loss=0.0554, over 23785.00 frames. ], tot_loss[loss=0.1843, simple_loss=0.2572, pruned_loss=0.05566, over 4682778.41 frames. ], batch size: 212, lr: 6.08e-03, grad_scale: 32.0 2023-09-30 03:33:21,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:23,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:23,731 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 03:33:25,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:33:25,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:33:28,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:33:28,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:33:33,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:33,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:33:35,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 03:33:41,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:33:41,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:42,325 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.12 vs. limit=6.0 2023-09-30 03:33:42,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:33:44,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:33:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:33:44,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 03:33:50,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:33:51,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:33:53,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:33:56,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:33:58,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:33:58,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:00,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:34:01,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 03:34:03,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:34:11,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:12,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:34:13,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:13,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:34:13,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:14,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:34:14,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 03:34:18,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:34:18,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:34:23,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:34:23,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:34:24,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=11.73 vs. limit=15.0 2023-09-30 03:34:28,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:34:28,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 03:34:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:34:29,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:34:29,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 03:34:31,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:34:33,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:34:36,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:34:38,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:34:39,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:34:41,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.74 vs. limit=15.0 2023-09-30 03:34:44,645 INFO [train.py:1039] (3/4) Epoch 17, batch 2500, loss[loss=0.1634, simple_loss=0.2068, pruned_loss=0.06003, over 19056.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2558, pruned_loss=0.05512, over 4679932.25 frames. ], batch size: 388, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:34:44,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 03:34:44,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:34:50,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:34:53,087 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=583293.3333333334, ans=0.125 2023-09-30 03:35:00,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:35:00,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:35:01,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:35:01,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 03:35:03,354 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.839e+02 2.064e+02 2.322e+02 3.484e+02, threshold=4.127e+02, percent-clipped=0.0 2023-09-30 03:35:09,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:35:11,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:11,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:35:11,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:35:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 03:35:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:14,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:14,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 03:35:14,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:16,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 03:35:16,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:21,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:35:23,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:35:25,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:35:27,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 03:35:27,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:35:30,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:35:34,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:38,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=583493.3333333334, ans=0.125 2023-09-30 03:35:38,673 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.23 vs. limit=10.0 2023-09-30 03:35:39,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:35:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:35:47,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:35:52,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 03:35:52,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:35:52,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:35:52,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=583560.0, ans=0.1 2023-09-30 03:35:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:35:54,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:35:55,956 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 03:35:55,957 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 03:35:55,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 03:35:57,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:01,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 03:36:01,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 03:36:02,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:36:03,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 03:36:05,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 03:36:05,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=583560.0, ans=0.0 2023-09-30 03:36:08,047 INFO [train.py:1039] (3/4) Epoch 17, batch 2550, loss[loss=0.1893, simple_loss=0.263, pruned_loss=0.05776, over 23554.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2567, pruned_loss=0.05525, over 4689933.90 frames. ], batch size: 93, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:36:08,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:11,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:36:12,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:36:14,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:36:15,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 03:36:15,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:36:20,351 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.47 vs. limit=22.5 2023-09-30 03:36:21,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 03:36:22,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:36:24,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:27,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:36:27,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 03:36:29,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:36:29,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:29,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:36:31,311 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 03:36:33,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:36:33,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 03:36:33,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 03:36:33,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:33,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 03:36:47,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:36:52,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:36:53,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:36:53,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:36:54,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:36:54,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=583760.0, ans=0.125 2023-09-30 03:37:00,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:37:03,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:37:03,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:37:03,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:37:03,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:37:05,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:37:09,587 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.31 vs. limit=10.0 2023-09-30 03:37:10,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:10,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:15,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:37:15,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 03:37:15,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:37:16,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:37:18,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:37:18,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:37:19,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:23,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=583893.3333333334, ans=0.125 2023-09-30 03:37:24,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:37:28,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:37:31,096 INFO [train.py:1039] (3/4) Epoch 17, batch 2600, loss[loss=0.1784, simple_loss=0.2543, pruned_loss=0.05129, over 23385.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2575, pruned_loss=0.05551, over 4688839.84 frames. ], batch size: 119, lr: 6.08e-03, grad_scale: 8.0 2023-09-30 03:37:31,282 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 03:37:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 03:37:36,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:37:36,413 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 03:37:36,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 03:37:36,588 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 03:37:36,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=583960.0, ans=0.125 2023-09-30 03:37:38,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=583960.0, ans=0.2 2023-09-30 03:37:42,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:37:42,052 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 03:37:43,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 03:37:44,879 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 03:37:46,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:37:48,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 03:37:49,474 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 2.083e+02 2.505e+02 2.924e+02 4.278e+02, threshold=5.011e+02, percent-clipped=1.0 2023-09-30 03:37:51,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 03:37:52,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:37:52,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 03:37:55,971 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 03:37:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 03:38:01,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=584026.6666666666, ans=0.0 2023-09-30 03:38:02,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:02,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:04,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:04,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 03:38:07,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:38:12,662 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 03:38:14,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=584093.3333333334, ans=0.1 2023-09-30 03:38:18,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:18,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:18,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 03:38:18,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:18,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:38:20,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 03:38:22,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:38:23,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:38:25,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 03:38:29,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:38:29,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:38:33,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=584160.0, ans=0.2 2023-09-30 03:38:34,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:38:36,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:38:36,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 03:38:37,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:38:38,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=584226.6666666666, ans=0.125 2023-09-30 03:38:39,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:38:39,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:38:39,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=584226.6666666666, ans=0.125 2023-09-30 03:38:46,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 03:38:47,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:50,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:38:53,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 03:38:53,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:38:55,188 INFO [train.py:1039] (3/4) Epoch 17, batch 2650, loss[loss=0.172, simple_loss=0.2614, pruned_loss=0.04125, over 24328.00 frames. ], tot_loss[loss=0.184, simple_loss=0.2576, pruned_loss=0.05521, over 4689002.40 frames. ], batch size: 74, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:38:55,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:38:55,395 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 03:38:55,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:38:58,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:01,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:39:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:39:04,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=584293.3333333334, ans=0.07 2023-09-30 03:39:05,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:39:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 03:39:07,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:39:08,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:39:11,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 03:39:12,949 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 03:39:15,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:20,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 03:39:20,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:21,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 03:39:24,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:25,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=584360.0, ans=0.125 2023-09-30 03:39:26,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:39:27,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:27,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:27,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.81 vs. limit=15.0 2023-09-30 03:39:30,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 03:39:30,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 03:39:33,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:39:38,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 03:39:38,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:39:41,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:41,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:39:41,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:42,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:39:42,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:39:45,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:47,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:39:49,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:39:51,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:39:52,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:53,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:39:53,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:39:55,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:39:56,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:39:58,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:39:59,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:39:59,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:00,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 03:40:02,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:04,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=584560.0, ans=0.1 2023-09-30 03:40:05,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:06,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:07,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=584560.0, ans=0.125 2023-09-30 03:40:08,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 03:40:09,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:12,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:12,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 03:40:15,281 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=12.28 vs. limit=15.0 2023-09-30 03:40:15,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:40:16,862 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=20.70 vs. limit=22.5 2023-09-30 03:40:17,476 INFO [train.py:1039] (3/4) Epoch 17, batch 2700, loss[loss=0.1594, simple_loss=0.2356, pruned_loss=0.04161, over 24596.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2575, pruned_loss=0.05483, over 4699027.46 frames. ], batch size: 60, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:40:17,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 03:40:19,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:40:19,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.75 vs. limit=15.0 2023-09-30 03:40:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:20,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:22,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:40:22,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:40:22,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:40:22,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 03:40:22,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 03:40:24,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:40:26,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:40:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:40:28,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:40:31,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:40:32,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 03:40:34,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:40:36,042 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.817e+02 2.004e+02 2.215e+02 2.992e+02, threshold=4.008e+02, percent-clipped=0.0 2023-09-30 03:40:40,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:40:40,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:40:45,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:40:45,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:40:45,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=584693.3333333334, ans=0.125 2023-09-30 03:40:47,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:40:47,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:40:50,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:40:53,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:40:53,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:40:53,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:40:58,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:40:58,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:41:02,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=584760.0, ans=0.125 2023-09-30 03:41:04,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=584760.0, ans=0.125 2023-09-30 03:41:07,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:41:09,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:41:12,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:41:12,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:18,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:18,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:18,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:41:18,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=584826.6666666666, ans=0.2 2023-09-30 03:41:19,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:21,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:41:22,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:41:23,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=584893.3333333334, ans=0.0 2023-09-30 03:41:24,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:41:27,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:27,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:41:30,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 03:41:32,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:34,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:41:34,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 03:41:36,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 03:41:36,823 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.08 vs. limit=15.0 2023-09-30 03:41:37,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:39,339 INFO [train.py:1039] (3/4) Epoch 17, batch 2750, loss[loss=0.2005, simple_loss=0.2585, pruned_loss=0.07122, over 23886.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2577, pruned_loss=0.05474, over 4707821.36 frames. ], batch size: 195, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:41:39,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=584960.0, ans=0.0 2023-09-30 03:41:40,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:41:41,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:41:45,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:45,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:41:47,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:50,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:41:50,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:41:52,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:41:52,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:41:52,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 03:41:52,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:41:52,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:41:57,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 03:42:00,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:42:00,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:00,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:01,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:42:01,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:42:03,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:42:03,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:05,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:09,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.76 vs. limit=15.0 2023-09-30 03:42:10,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 03:42:12,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:42:12,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:42:13,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:42:21,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:42:23,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=585093.3333333334, ans=0.2 2023-09-30 03:42:25,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:42:25,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:26,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=585160.0, ans=0.125 2023-09-30 03:42:28,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:42:28,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:42:29,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:42:35,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:42:35,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:42:35,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 03:42:41,920 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=585160.0, ans=0.125 2023-09-30 03:42:42,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:42:44,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 03:42:50,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 03:42:53,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:42:53,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 03:42:54,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:42:56,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:42:56,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 03:42:57,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:42:58,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:43:00,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:00,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:00,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 03:43:00,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:01,848 INFO [train.py:1039] (3/4) Epoch 17, batch 2800, loss[loss=0.1772, simple_loss=0.2625, pruned_loss=0.046, over 24473.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.257, pruned_loss=0.05412, over 4710200.14 frames. ], batch size: 66, lr: 6.07e-03, grad_scale: 16.0 2023-09-30 03:43:02,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:03,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:03,967 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.04 vs. limit=12.0 2023-09-30 03:43:04,928 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 03:43:04,929 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 03:43:08,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:09,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:43:11,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:43:15,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:43:18,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 03:43:20,140 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.888e+02 2.133e+02 2.598e+02 4.037e+02, threshold=4.266e+02, percent-clipped=1.0 2023-09-30 03:43:20,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:43:22,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 03:43:23,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:24,399 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.49 vs. limit=15.0 2023-09-30 03:43:24,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:43:24,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:25,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=585360.0, ans=0.125 2023-09-30 03:43:28,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:28,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=585360.0, ans=0.125 2023-09-30 03:43:30,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:43:30,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:43:31,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:43:39,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:43:42,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:43:45,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:43:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:43:46,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:43:52,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:43:52,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 03:43:52,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:54,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:43:54,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:43:59,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:43:59,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:01,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=585493.3333333334, ans=0.125 2023-09-30 03:44:04,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:44:06,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:44:07,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:07,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:44:07,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 03:44:09,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:44:10,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:44:10,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 03:44:10,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:11,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=585560.0, ans=0.0 2023-09-30 03:44:12,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:44:12,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:15,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 03:44:15,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:16,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:44:16,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:44:18,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 03:44:20,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=585560.0, ans=0.125 2023-09-30 03:44:22,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=585626.6666666666, ans=0.0 2023-09-30 03:44:23,668 INFO [train.py:1039] (3/4) Epoch 17, batch 2850, loss[loss=0.1868, simple_loss=0.2559, pruned_loss=0.05889, over 23635.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.256, pruned_loss=0.05366, over 4712988.03 frames. ], batch size: 134, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:44:23,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:44:23,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:44:25,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:44:28,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:32,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:44:32,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:44:32,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:44:35,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.89 vs. limit=15.0 2023-09-30 03:44:35,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:44:37,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:44:39,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:44:40,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 03:44:45,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 03:44:45,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:44:48,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 03:44:49,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:44:51,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 03:44:51,598 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=585693.3333333334, ans=0.125 2023-09-30 03:44:52,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 03:44:54,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:06,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:08,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:10,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:45:11,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 03:45:11,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 03:45:11,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:45:14,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:45:15,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 03:45:16,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:45:18,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:18,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:45:18,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:21,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:21,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:45:23,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:23,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:45:26,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:45:26,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:27,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:31,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:45:35,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:45:36,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 03:45:37,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=585893.3333333334, ans=0.125 2023-09-30 03:45:38,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 03:45:39,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 03:45:41,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 03:45:41,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:45:41,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:45:41,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:43,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:45:43,484 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 03:45:43,561 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 03:45:43,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:45:43,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:45:46,528 INFO [train.py:1039] (3/4) Epoch 17, batch 2900, loss[loss=0.1928, simple_loss=0.2575, pruned_loss=0.0641, over 23842.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2558, pruned_loss=0.05359, over 4704985.13 frames. ], batch size: 212, lr: 6.07e-03, grad_scale: 8.0 2023-09-30 03:45:50,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:45:50,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:45:50,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:45:53,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 03:45:56,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:45:56,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 03:45:58,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 03:45:59,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 03:45:59,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:46:01,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:03,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:46:06,440 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.447e+02 1.832e+02 2.093e+02 2.427e+02 4.261e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 03:46:06,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 03:46:08,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:46:11,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:46:11,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 03:46:13,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:46:13,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:16,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 03:46:18,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 03:46:21,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:46:21,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 03:46:21,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:46:24,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:46:24,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 03:46:26,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:46:27,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:31,691 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.15 vs. limit=12.0 2023-09-30 03:46:32,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:46:35,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:46:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 03:46:38,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 03:46:38,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:46:42,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:46:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 03:46:46,592 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.57 vs. limit=15.0 2023-09-30 03:46:47,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 03:46:51,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:46:59,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=586226.6666666666, ans=0.125 2023-09-30 03:47:01,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586226.6666666666, ans=0.1 2023-09-30 03:47:02,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:47:02,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 03:47:02,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 03:47:02,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=586226.6666666666, ans=0.1 2023-09-30 03:47:05,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:05,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 03:47:07,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:07,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:47:08,575 INFO [train.py:1039] (3/4) Epoch 17, batch 2950, loss[loss=0.186, simple_loss=0.2638, pruned_loss=0.05409, over 24133.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2571, pruned_loss=0.05419, over 4704829.47 frames. ], batch size: 80, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:47:12,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=586293.3333333334, ans=0.95 2023-09-30 03:47:15,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:47:15,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 03:47:17,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:17,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:19,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:47:19,937 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.87 vs. limit=15.0 2023-09-30 03:47:20,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:47:20,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 03:47:22,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 03:47:22,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:47:22,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:47:30,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:31,105 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=586360.0, ans=0.0 2023-09-30 03:47:33,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:47:35,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:47:36,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:38,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:47:38,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:47:40,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:41,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:47:43,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:47:44,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 03:47:45,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586426.6666666666, ans=0.1 2023-09-30 03:47:47,145 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.23 vs. limit=15.0 2023-09-30 03:47:50,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 03:47:50,131 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 03:47:51,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:47:53,615 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 03:47:55,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 03:47:55,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:47:55,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:47:55,182 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 03:47:55,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:47:55,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=586426.6666666666, ans=0.125 2023-09-30 03:47:58,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 03:47:59,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:48:00,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 03:48:02,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:02,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.93 vs. limit=10.0 2023-09-30 03:48:03,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:48:05,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:05,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 03:48:05,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:48:05,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 03:48:13,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:15,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:48:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 03:48:16,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:48:18,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586560.0, ans=0.1 2023-09-30 03:48:19,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 03:48:19,940 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=586560.0, ans=0.125 2023-09-30 03:48:21,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:23,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:48:24,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:48:24,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=586560.0, ans=0.1 2023-09-30 03:48:26,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:48:26,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:48:27,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:48:27,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:48:30,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:48:32,002 INFO [train.py:1039] (3/4) Epoch 17, batch 3000, loss[loss=0.1939, simple_loss=0.2603, pruned_loss=0.06371, over 23920.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2582, pruned_loss=0.05385, over 4723410.00 frames. ], batch size: 165, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:48:32,002 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 03:48:41,557 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.2.encoder.layers.2.self_attn_weights, attn_weights_entropy = tensor([4.0262, 3.7080, 3.0876, 3.2969], device='cuda:3') 2023-09-30 03:48:47,119 INFO [train.py:1071] (3/4) Epoch 17, validation: loss=0.2916, simple_loss=0.2691, pruned_loss=0.1571, over 1125622.00 frames. 2023-09-30 03:48:47,120 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 03:48:47,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:48:47,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=586626.6666666666, ans=0.0 2023-09-30 03:48:48,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:48:50,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:50,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 03:48:51,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:48:53,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:48:54,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:49:01,546 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 03:49:01,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 03:49:05,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:49:05,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:49:07,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 03:49:07,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:10,632 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.539e+02 1.835e+02 1.995e+02 2.224e+02 3.286e+02, threshold=3.989e+02, percent-clipped=0.0 2023-09-30 03:49:15,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:49:25,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:49:31,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 03:49:33,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:49:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:49:36,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:49:38,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:49:38,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:38,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 03:49:42,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 03:49:42,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:49:43,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:49:47,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:49:47,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:47,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:49:47,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:49:50,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 03:49:50,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:49:50,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:49:52,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=586826.6666666666, ans=0.125 2023-09-30 03:49:53,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:49:57,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 03:49:57,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 03:49:58,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:49:59,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:50:02,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:02,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:03,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 03:50:04,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-09-30 03:50:05,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 03:50:05,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:06,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 03:50:06,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:50:08,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 03:50:13,257 INFO [train.py:1039] (3/4) Epoch 17, batch 3050, loss[loss=0.1828, simple_loss=0.2516, pruned_loss=0.05695, over 23580.00 frames. ], tot_loss[loss=0.1838, simple_loss=0.2587, pruned_loss=0.05447, over 4715711.36 frames. ], batch size: 134, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:50:13,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:13,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:50:13,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 03:50:15,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 03:50:15,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 03:50:15,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=586960.0, ans=0.2 2023-09-30 03:50:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:50:16,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:50:18,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 03:50:18,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:18,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:50:21,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 03:50:23,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:50:26,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:26,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:50:31,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:34,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 03:50:36,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=587026.6666666666, ans=0.2 2023-09-30 03:50:39,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 03:50:39,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 03:50:40,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:50:43,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 03:50:45,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-09-30 03:50:49,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:49,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:51,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:52,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:50:54,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 03:50:54,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:50:54,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:50:54,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:50:55,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:50:58,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:00,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:00,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 03:51:00,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:51:02,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 03:51:05,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:51:06,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 03:51:07,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:07,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:09,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=587160.0, ans=0.125 2023-09-30 03:51:12,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:51:13,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:20,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:21,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:51:21,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:51:21,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:23,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 03:51:23,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 03:51:25,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 03:51:27,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:51:27,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:27,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 03:51:27,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=587226.6666666666, ans=0.5 2023-09-30 03:51:30,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:33,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:51:35,819 INFO [train.py:1039] (3/4) Epoch 17, batch 3100, loss[loss=0.1703, simple_loss=0.2433, pruned_loss=0.04862, over 22744.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2579, pruned_loss=0.05447, over 4708736.44 frames. ], batch size: 50, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:51:36,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 03:51:37,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 03:51:40,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 03:51:43,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 03:51:44,598 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.12 vs. limit=22.5 2023-09-30 03:51:45,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 03:51:45,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 03:51:48,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:51:48,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:51:53,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:51:55,421 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.922e+02 2.326e+02 2.796e+02 3.777e+02, threshold=4.651e+02, percent-clipped=0.0 2023-09-30 03:51:55,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:02,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 03:52:05,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:52:07,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:08,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:08,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:52:10,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:52:11,319 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.51 vs. limit=15.0 2023-09-30 03:52:12,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:52:12,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 03:52:12,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:52:15,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:15,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 03:52:18,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:52:20,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:52:21,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 03:52:23,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 03:52:25,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:25,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:52:28,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:28,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:30,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:52:31,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=587493.3333333334, ans=0.1 2023-09-30 03:52:32,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:52:32,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:52:33,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:52:33,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:52:33,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:33,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 03:52:37,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:52:38,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 03:52:41,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:52:43,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 03:52:44,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:52:45,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:52:45,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 03:52:56,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 03:52:58,043 INFO [train.py:1039] (3/4) Epoch 17, batch 3150, loss[loss=0.1617, simple_loss=0.2347, pruned_loss=0.04437, over 24468.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2564, pruned_loss=0.05391, over 4707557.63 frames. ], batch size: 58, lr: 6.06e-03, grad_scale: 8.0 2023-09-30 03:52:58,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:52:59,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:03,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:53:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:53:04,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 03:53:04,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:06,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 03:53:07,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 03:53:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:11,753 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 03:53:14,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 03:53:14,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:53:16,386 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 03:53:17,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 03:53:18,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 03:53:18,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 03:53:18,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 03:53:18,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:18,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:20,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:53:20,853 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:22,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 03:53:23,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:24,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:53:25,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:25,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=587693.3333333334, ans=0.125 2023-09-30 03:53:28,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 03:53:31,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 03:53:31,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:53:36,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 03:53:37,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:53:37,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=587760.0, ans=0.1 2023-09-30 03:53:38,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 03:53:42,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 03:53:42,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:53:42,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 03:53:42,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=587760.0, ans=0.1 2023-09-30 03:53:43,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 03:53:43,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:53:43,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 03:53:45,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 03:53:45,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 03:53:45,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 03:53:46,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 03:53:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:50,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:53:50,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:53:50,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 03:53:52,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:55,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 03:53:55,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:53:55,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 03:53:56,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 03:53:58,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:53:59,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:53:59,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 03:54:01,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 03:54:02,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:54:06,450 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.08 vs. limit=15.0 2023-09-30 03:54:07,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:54:07,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:08,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:54:14,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:54:16,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:16,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=587893.3333333334, ans=0.125 2023-09-30 03:54:18,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 03:54:20,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=587960.0, ans=0.1 2023-09-30 03:54:21,673 INFO [train.py:1039] (3/4) Epoch 17, batch 3200, loss[loss=0.1767, simple_loss=0.241, pruned_loss=0.05614, over 23844.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.256, pruned_loss=0.05365, over 4713984.35 frames. ], batch size: 164, lr: 6.06e-03, grad_scale: 16.0 2023-09-30 03:54:23,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:54:23,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 03:54:26,160 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.84 vs. limit=10.0 2023-09-30 03:54:26,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:28,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:54:28,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 03:54:31,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:54:33,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=587960.0, ans=0.0 2023-09-30 03:54:35,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=587960.0, ans=0.0 2023-09-30 03:54:37,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:54:39,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=588026.6666666666, ans=0.125 2023-09-30 03:54:39,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=588026.6666666666, ans=0.1 2023-09-30 03:54:40,745 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.992e+02 2.258e+02 2.770e+02 4.284e+02, threshold=4.516e+02, percent-clipped=0.0 2023-09-30 03:54:40,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:54:49,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 03:54:56,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=588093.3333333334, ans=0.125 2023-09-30 03:55:00,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 03:55:00,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:55:04,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 03:55:05,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 03:55:08,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:55:08,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 03:55:10,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:55:14,195 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.97 vs. limit=15.0 2023-09-30 03:55:15,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 03:55:17,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 03:55:20,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 03:55:23,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 03:55:25,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 03:55:30,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 03:55:31,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:55:31,906 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 03:55:31,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 03:55:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:55:37,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 03:55:37,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 03:55:38,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 03:55:40,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 03:55:41,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 03:55:43,577 INFO [train.py:1039] (3/4) Epoch 17, batch 3250, loss[loss=0.178, simple_loss=0.2549, pruned_loss=0.05053, over 18372.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2554, pruned_loss=0.05344, over 4712663.32 frames. ], batch size: 39, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:55:43,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:55:43,857 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 03:55:43,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:55:45,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:55:46,748 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 03:55:50,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 03:55:53,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:55:54,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=588293.3333333334, ans=0.125 2023-09-30 03:56:01,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:01,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 03:56:03,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:04,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:04,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:04,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 03:56:08,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:08,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 03:56:08,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:10,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:10,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:13,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:14,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 03:56:17,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:17,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:56:20,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:56:20,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:56:20,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:25,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 03:56:26,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:56:26,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 03:56:26,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=588426.6666666666, ans=0.2 2023-09-30 03:56:28,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:28,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 03:56:32,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=588493.3333333334, ans=0.125 2023-09-30 03:56:33,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=588493.3333333334, ans=0.2 2023-09-30 03:56:36,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:56:42,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:56:42,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:42,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 03:56:42,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 03:56:42,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 03:56:43,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:56:45,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=588493.3333333334, ans=0.125 2023-09-30 03:56:47,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 03:56:47,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 03:56:49,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:56:50,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:56:52,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:52,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 03:56:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:56:57,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:56:57,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:56:59,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 03:56:59,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:02,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 03:57:02,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 03:57:05,637 INFO [train.py:1039] (3/4) Epoch 17, batch 3300, loss[loss=0.2042, simple_loss=0.2704, pruned_loss=0.06897, over 23807.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2563, pruned_loss=0.05355, over 4724129.79 frames. ], batch size: 179, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:57:05,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:57:05,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 03:57:09,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 03:57:11,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 03:57:11,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:15,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:57:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:57:17,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:17,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 03:57:18,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 03:57:22,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:23,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:57:25,151 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.863e+02 2.079e+02 2.237e+02 3.389e+02, threshold=4.158e+02, percent-clipped=0.0 2023-09-30 03:57:26,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 03:57:27,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=588693.3333333334, ans=0.025 2023-09-30 03:57:28,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:28,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:29,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:30,535 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 03:57:30,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:57:32,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 03:57:33,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 03:57:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:57:33,666 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 03:57:34,729 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.29 vs. limit=15.0 2023-09-30 03:57:38,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:38,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 03:57:40,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:40,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 03:57:42,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 03:57:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:57:44,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 03:57:45,742 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 03:57:45,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 03:57:47,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:57:50,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 03:57:52,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:57:54,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 03:57:54,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:57:57,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:57:58,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:57:58,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:57:58,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 03:57:59,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=588826.6666666666, ans=15.0 2023-09-30 03:57:59,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.77 vs. limit=22.5 2023-09-30 03:58:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 03:58:02,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:02,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 03:58:05,420 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 03:58:06,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 03:58:08,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 03:58:09,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:09,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:58:12,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:13,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:58:15,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:15,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 03:58:15,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:58:18,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 03:58:21,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 03:58:21,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:23,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:25,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 03:58:25,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:58:26,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:26,838 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=588960.0, ans=0.2 2023-09-30 03:58:28,041 INFO [train.py:1039] (3/4) Epoch 17, batch 3350, loss[loss=0.1735, simple_loss=0.2554, pruned_loss=0.04581, over 24482.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2569, pruned_loss=0.05356, over 4731289.02 frames. ], batch size: 63, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:58:30,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 03:58:30,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:34,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:58:34,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:36,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 03:58:39,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:41,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 03:58:43,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:43,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 03:58:44,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 03:58:46,313 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 03:58:46,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 03:58:48,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=589026.6666666666, ans=0.1 2023-09-30 03:58:51,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 03:58:51,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 03:58:53,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 03:58:53,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 03:58:54,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:58:54,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 03:58:54,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:54,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 03:58:56,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:58:59,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:58:59,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:01,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 03:59:04,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:05,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.93 vs. limit=22.5 2023-09-30 03:59:06,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:07,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:12,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 03:59:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 03:59:14,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:14,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:15,406 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-09-30 03:59:17,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:20,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 03:59:20,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 03:59:20,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 03:59:20,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 03:59:20,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=589160.0, ans=0.125 2023-09-30 03:59:24,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 03:59:25,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:27,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 03:59:35,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:35,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 03:59:36,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 03:59:37,051 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 03:59:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 03:59:42,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 03:59:46,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 03:59:46,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 03:59:46,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 03:59:49,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 03:59:49,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 03:59:50,546 INFO [train.py:1039] (3/4) Epoch 17, batch 3400, loss[loss=0.2098, simple_loss=0.2723, pruned_loss=0.07361, over 23745.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2579, pruned_loss=0.0537, over 4726951.44 frames. ], batch size: 179, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 03:59:50,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 03:59:50,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 03:59:52,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:52,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 03:59:53,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 03:59:54,379 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.91 vs. limit=15.0 2023-09-30 03:59:55,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 03:59:55,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 03:59:58,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.88 vs. limit=15.0 2023-09-30 03:59:58,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 03:59:58,804 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 03:59:58,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:04,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:00:04,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:00:05,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:00:10,131 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.623e+02 1.964e+02 2.182e+02 2.544e+02 4.408e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 04:00:14,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:17,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 04:00:22,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:00:25,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:25,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:26,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:00:27,612 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.59 vs. limit=15.0 2023-09-30 04:00:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:00:36,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 04:00:40,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=589493.3333333334, ans=0.125 2023-09-30 04:00:43,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:43,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=589493.3333333334, ans=0.125 2023-09-30 04:00:43,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=589493.3333333334, ans=0.05 2023-09-30 04:00:44,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:00:44,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 04:00:46,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:00:46,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:00:46,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:00:47,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:00:51,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:00:55,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:00:55,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:01:00,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:02,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 04:01:07,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:01:12,420 INFO [train.py:1039] (3/4) Epoch 17, batch 3450, loss[loss=0.1972, simple_loss=0.2764, pruned_loss=0.05898, over 23971.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2589, pruned_loss=0.05402, over 4721797.72 frames. ], batch size: 80, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:01:14,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 04:01:19,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 04:01:19,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:01:19,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=589626.6666666666, ans=0.125 2023-09-30 04:01:19,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=589626.6666666666, ans=0.1 2023-09-30 04:01:20,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:01:20,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 04:01:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:01:22,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=589626.6666666666, ans=0.125 2023-09-30 04:01:27,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:01:32,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:01:32,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=589693.3333333334, ans=0.125 2023-09-30 04:01:33,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:33,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:01:33,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:35,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:01:37,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=589693.3333333334, ans=0.0 2023-09-30 04:01:39,237 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=14.00 vs. limit=22.5 2023-09-30 04:01:41,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 04:01:48,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 04:01:48,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:01:48,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:01:50,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:01:56,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 04:01:58,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:02:00,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:00,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:02:01,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:02:02,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=589826.6666666666, ans=0.125 2023-09-30 04:02:04,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:02:06,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 04:02:06,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:06,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=589826.6666666666, ans=0.2 2023-09-30 04:02:07,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:02:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:02:14,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 04:02:17,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:02:22,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:02:23,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:26,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:30,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=589893.3333333334, ans=0.2 2023-09-30 04:02:30,701 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=16.13 vs. limit=15.0 2023-09-30 04:02:31,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:02:31,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:02:33,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:02:33,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:02:34,849 INFO [train.py:1039] (3/4) Epoch 17, batch 3500, loss[loss=0.1751, simple_loss=0.2656, pruned_loss=0.04228, over 24699.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2578, pruned_loss=0.05358, over 4720189.67 frames. ], batch size: 73, lr: 6.05e-03, grad_scale: 16.0 2023-09-30 04:02:36,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=589960.0, ans=0.125 2023-09-30 04:02:38,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:41,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:02:42,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 04:02:45,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:02:47,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:02:52,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:02:52,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 04:02:53,688 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.589e+02 1.944e+02 2.191e+02 2.533e+02 4.328e+02, threshold=4.383e+02, percent-clipped=0.0 2023-09-30 04:02:57,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:02:59,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:02:59,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.73 vs. limit=12.0 2023-09-30 04:03:00,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:03:00,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:00,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:03:00,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:00,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:02,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 04:03:02,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=590026.6666666666, ans=0.0 2023-09-30 04:03:05,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:05,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:03:08,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:10,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:11,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 04:03:12,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:03:13,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=590093.3333333334, ans=10.0 2023-09-30 04:03:14,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:03:17,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:03:17,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:18,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=590093.3333333334, ans=0.125 2023-09-30 04:03:19,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:03:19,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:21,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 04:03:21,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 04:03:21,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 04:03:22,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:03:24,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:24,655 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=590160.0, ans=0.125 2023-09-30 04:03:26,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:26,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:03:30,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:03:31,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:03:35,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=590160.0, ans=0.125 2023-09-30 04:03:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:03:38,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 04:03:38,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 04:03:38,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:03:39,991 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.71 vs. limit=15.0 2023-09-30 04:03:42,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:42,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:44,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:44,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=590226.6666666666, ans=0.0 2023-09-30 04:03:47,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 04:03:48,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:03:48,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:03:50,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 04:03:53,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 04:03:54,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:03:56,841 INFO [train.py:1039] (3/4) Epoch 17, batch 3550, loss[loss=0.1696, simple_loss=0.2524, pruned_loss=0.04342, over 24672.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2562, pruned_loss=0.053, over 4717528.84 frames. ], batch size: 65, lr: 6.04e-03, grad_scale: 8.0 2023-09-30 04:03:56,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:03:56,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:03:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:02,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:04:02,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-09-30 04:04:09,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:04:16,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:18,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:04:21,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:21,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:04:21,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:04:23,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=590360.0, ans=0.125 2023-09-30 04:04:24,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=590360.0, ans=0.2 2023-09-30 04:04:26,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:26,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:04:26,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:26,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:04:26,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=590360.0, ans=0.04949747468305833 2023-09-30 04:04:27,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:04:33,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:04:33,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:04:34,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:34,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:04:34,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:04:34,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 04:04:34,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:04:37,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 04:04:40,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=590426.6666666666, ans=0.0 2023-09-30 04:04:43,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:45,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:04:47,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:04:48,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 04:04:50,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:04:51,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 04:04:52,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:04:53,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=590493.3333333334, ans=0.2 2023-09-30 04:04:54,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.57 vs. limit=22.5 2023-09-30 04:04:54,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:04:56,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:04:58,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 04:04:58,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:04,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 04:05:06,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:11,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:05:13,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 04:05:18,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 04:05:19,789 INFO [train.py:1039] (3/4) Epoch 17, batch 3600, loss[loss=0.2025, simple_loss=0.2635, pruned_loss=0.07077, over 22857.00 frames. ], tot_loss[loss=0.1819, simple_loss=0.2562, pruned_loss=0.05376, over 4714807.62 frames. ], batch size: 322, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:05:19,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:05:21,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:05:23,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:24,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:05:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:05:30,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:31,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:05:33,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:05:33,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:33,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 04:05:38,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:05:39,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:41,049 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.895e+02 2.111e+02 2.493e+02 3.633e+02, threshold=4.223e+02, percent-clipped=0.0 2023-09-30 04:05:42,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:44,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:05:46,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:05:47,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:05:47,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 04:05:49,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:05:51,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=590760.0, ans=0.125 2023-09-30 04:05:52,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:05:54,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:05:57,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:05:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:06:01,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:02,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 04:06:10,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:10,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:06:11,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 04:06:15,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:06:18,839 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=590826.6666666666, ans=0.125 2023-09-30 04:06:20,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:23,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:29,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:06:29,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:06:29,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 04:06:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 04:06:35,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 04:06:36,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:06:38,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:06:39,142 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=15.0 2023-09-30 04:06:39,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 04:06:39,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:06:40,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.44 vs. limit=15.0 2023-09-30 04:06:41,293 INFO [train.py:1039] (3/4) Epoch 17, batch 3650, loss[loss=0.1634, simple_loss=0.2497, pruned_loss=0.03852, over 24669.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2567, pruned_loss=0.05336, over 4716796.14 frames. ], batch size: 73, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:06:41,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:06:41,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:06:41,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 04:06:42,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 04:06:46,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:06:47,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 04:06:52,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 04:06:55,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:06:59,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 04:07:00,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 04:07:05,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:05,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:07:06,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:07:10,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 04:07:10,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:07:12,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 04:07:13,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:07:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:13,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 04:07:15,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:07:16,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:18,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:07:19,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 04:07:21,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 04:07:21,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:07:24,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 04:07:24,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:24,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:07:31,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:07:33,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:33,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:07:34,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:07:34,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:07:35,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=591160.0, ans=0.125 2023-09-30 04:07:38,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:07:41,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:07:43,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:43,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:07:44,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:07:44,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:07:46,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:51,378 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 04:07:54,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:07:54,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:07:56,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:07:56,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:07:58,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:07:58,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:07:58,965 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.34 vs. limit=15.0 2023-09-30 04:07:59,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 04:07:59,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:00,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=591226.6666666666, ans=0.1 2023-09-30 04:08:03,224 INFO [train.py:1039] (3/4) Epoch 17, batch 3700, loss[loss=0.164, simple_loss=0.2376, pruned_loss=0.04519, over 24509.00 frames. ], tot_loss[loss=0.1824, simple_loss=0.2577, pruned_loss=0.05355, over 4713468.61 frames. ], batch size: 58, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:08:03,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:08:04,859 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:08:05,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=591293.3333333334, ans=0.0 2023-09-30 04:08:06,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:08:10,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 04:08:10,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:08:10,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:08:10,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:08:10,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=591293.3333333334, ans=0.125 2023-09-30 04:08:16,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:08:16,700 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=591293.3333333334, ans=0.0 2023-09-30 04:08:20,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:08:20,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:22,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:08:23,657 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 2.066e+02 2.372e+02 2.822e+02 4.453e+02, threshold=4.744e+02, percent-clipped=1.0 2023-09-30 04:08:23,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:08:23,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:08:25,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:29,064 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 04:08:29,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=591360.0, ans=0.125 2023-09-30 04:08:35,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:08:35,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:08:35,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:08:35,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 04:08:37,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:40,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:42,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 04:08:42,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:45,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:08:47,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:08:47,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:08:49,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:08:52,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:08:53,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 04:08:54,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:08:55,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-09-30 04:08:55,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 04:09:02,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:09:02,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:09:05,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:06,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 04:09:07,823 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=8.81 vs. limit=15.0 2023-09-30 04:09:08,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:09:08,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:09:08,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:08,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:13,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:09:15,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 04:09:16,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 04:09:16,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:09:16,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:19,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:09:19,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:09:22,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:09:23,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:09:24,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:09:25,789 INFO [train.py:1039] (3/4) Epoch 17, batch 3750, loss[loss=0.1805, simple_loss=0.2522, pruned_loss=0.05438, over 23686.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.258, pruned_loss=0.05353, over 4719790.38 frames. ], batch size: 149, lr: 6.04e-03, grad_scale: 16.0 2023-09-30 04:09:26,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 04:09:28,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:09:31,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:09:31,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 04:09:33,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:09:34,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:35,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:09:38,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:09:38,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=591626.6666666666, ans=0.2 2023-09-30 04:09:41,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:09:46,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:09:46,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:09:49,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:09:52,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:09:53,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 04:09:55,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:09:56,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:09:56,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:10:00,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 04:10:06,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 04:10:06,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:10:06,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:10:09,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:14,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:15,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:10:16,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=591826.6666666666, ans=0.125 2023-09-30 04:10:20,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=591826.6666666666, ans=0.1 2023-09-30 04:10:21,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 04:10:25,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:10:26,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:10:28,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:10:31,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:10:33,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=591893.3333333334, ans=0.125 2023-09-30 04:10:36,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 04:10:38,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:10:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:10:41,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:10:44,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:10:44,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=591960.0, ans=0.1 2023-09-30 04:10:46,643 INFO [train.py:1039] (3/4) Epoch 17, batch 3800, loss[loss=0.17, simple_loss=0.2598, pruned_loss=0.04015, over 24444.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2577, pruned_loss=0.05334, over 4727771.73 frames. ], batch size: 69, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:10:52,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:10:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:10:58,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:11:00,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 04:11:01,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:03,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:03,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=592026.6666666666, ans=0.035 2023-09-30 04:11:05,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:11:06,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:11:06,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:06,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:11:08,671 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.516e+02 1.935e+02 2.193e+02 2.576e+02 3.771e+02, threshold=4.386e+02, percent-clipped=0.0 2023-09-30 04:11:10,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:11:11,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:11:11,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:11,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 04:11:13,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=592026.6666666666, ans=0.125 2023-09-30 04:11:16,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:11:16,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:11:18,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:20,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:11:20,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:11:23,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:11:23,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:26,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:28,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:11:31,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=592093.3333333334, ans=0.0 2023-09-30 04:11:34,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:11:34,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 04:11:36,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:43,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:11:48,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:11:50,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 04:11:52,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 04:11:52,238 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=592226.6666666666, ans=0.125 2023-09-30 04:11:53,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:11:53,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=592226.6666666666, ans=0.125 2023-09-30 04:11:55,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:11:55,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:11:58,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 04:12:00,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=592226.6666666666, ans=0.015 2023-09-30 04:12:01,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 04:12:01,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 04:12:01,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:03,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:12:08,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:12:10,440 INFO [train.py:1039] (3/4) Epoch 17, batch 3850, loss[loss=0.1675, simple_loss=0.2469, pruned_loss=0.04401, over 24298.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2571, pruned_loss=0.05362, over 4711980.28 frames. ], batch size: 61, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:12:10,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:12:15,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:12:16,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=592293.3333333334, ans=0.0 2023-09-30 04:12:18,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 04:12:18,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:12:20,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:23,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:12:27,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:30,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:12:30,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 04:12:35,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:36,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:12:40,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:40,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:12:42,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:43,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:12:44,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:12:44,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:12:45,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:49,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:12:50,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:50,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:12:52,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 04:12:52,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 04:12:53,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:12:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:57,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:12:58,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:12:59,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 04:13:02,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 04:13:03,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:05,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 04:13:08,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 04:13:11,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:13,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:13:18,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:18,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 04:13:20,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 04:13:23,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:23,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:26,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:13:26,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:13:27,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:27,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:13:27,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 04:13:29,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:13:29,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 04:13:29,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:29,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:31,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:13:31,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:31,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:13:33,359 INFO [train.py:1039] (3/4) Epoch 17, batch 3900, loss[loss=0.1545, simple_loss=0.2327, pruned_loss=0.03813, over 24611.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2555, pruned_loss=0.05309, over 4713007.92 frames. ], batch size: 60, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:13:33,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:13:33,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:13:35,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:13:35,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 04:13:35,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:37,567 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.88 vs. limit=15.0 2023-09-30 04:13:38,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:39,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:39,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:13:41,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:13:44,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:13:44,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:48,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:13:49,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 04:13:49,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:13:51,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 04:13:51,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:13:53,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 04:13:54,483 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.932e+02 2.197e+02 2.548e+02 3.814e+02, threshold=4.393e+02, percent-clipped=0.0 2023-09-30 04:13:55,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 04:13:59,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:03,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:14:03,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:14:04,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:08,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:14:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:14:11,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:14:11,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:12,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:14:19,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:19,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:14:24,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=592826.6666666666, ans=0.2 2023-09-30 04:14:27,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=592826.6666666666, ans=0.0 2023-09-30 04:14:28,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:14:30,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:14:40,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:14:43,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:45,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 04:14:45,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 04:14:45,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:14:46,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 04:14:48,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:14:49,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 04:14:54,806 INFO [train.py:1039] (3/4) Epoch 17, batch 3950, loss[loss=0.1918, simple_loss=0.2524, pruned_loss=0.06565, over 23764.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2555, pruned_loss=0.05317, over 4714121.40 frames. ], batch size: 212, lr: 6.03e-03, grad_scale: 16.0 2023-09-30 04:14:55,258 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=592960.0, ans=0.2 2023-09-30 04:14:56,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=592960.0, ans=0.125 2023-09-30 04:14:58,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:14:58,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 04:14:58,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=592960.0, ans=0.1 2023-09-30 04:15:00,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:15:03,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:15:03,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=592960.0, ans=0.0 2023-09-30 04:15:03,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=592960.0, ans=0.125 2023-09-30 04:15:04,133 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.13 vs. limit=22.5 2023-09-30 04:15:04,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:15:06,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=592960.0, ans=0.0 2023-09-30 04:15:11,392 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 04:15:11,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:13,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 04:15:13,655 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 04:15:15,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:18,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:18,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:15:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:15:19,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=593026.6666666666, ans=0.2 2023-09-30 04:15:22,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 04:15:24,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:15:24,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:15:24,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:15:25,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:15:25,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:15:36,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:15:36,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:15:43,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 04:15:43,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=593160.0, ans=0.125 2023-09-30 04:15:48,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 04:15:48,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 04:15:48,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:15:48,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:15:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:15:56,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:15:56,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:15:57,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:15:57,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 04:16:05,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:16:06,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:16:08,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=593226.6666666666, ans=0.125 2023-09-30 04:16:11,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 04:16:17,998 INFO [train.py:1039] (3/4) Epoch 17, batch 4000, loss[loss=0.1949, simple_loss=0.2669, pruned_loss=0.06146, over 23105.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2562, pruned_loss=0.05351, over 4712422.33 frames. ], batch size: 93, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:16:19,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:23,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593293.3333333334, ans=0.1 2023-09-30 04:16:28,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:32,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:34,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:16:34,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:16:34,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 04:16:36,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:16:38,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 04:16:38,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:16:38,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 04:16:40,234 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.594e+02 1.856e+02 2.093e+02 2.279e+02 3.915e+02, threshold=4.185e+02, percent-clipped=0.0 2023-09-30 04:16:40,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:16:43,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:16:43,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:16:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:16:43,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:16:43,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:16:46,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:16:47,850 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 04:16:49,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:16:51,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:16:51,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=593426.6666666666, ans=0.125 2023-09-30 04:16:54,386 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 04:16:54,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=593426.6666666666, ans=0.125 2023-09-30 04:16:55,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:16:55,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:04,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 04:17:05,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:17:08,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:17:10,184 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 04:17:10,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:17:10,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 04:17:12,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:17:12,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:13,376 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.41 vs. limit=15.0 2023-09-30 04:17:14,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=593493.3333333334, ans=0.1 2023-09-30 04:17:15,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:17:16,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:17:16,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:17:16,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:17:18,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 04:17:18,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:17:22,314 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 04:17:25,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=593560.0, ans=0.125 2023-09-30 04:17:26,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:17:30,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 04:17:32,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:17:34,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:35,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:17:35,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:17:39,951 INFO [train.py:1039] (3/4) Epoch 17, batch 4050, loss[loss=0.188, simple_loss=0.259, pruned_loss=0.05854, over 18606.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2562, pruned_loss=0.05348, over 4705197.67 frames. ], batch size: 40, lr: 6.03e-03, grad_scale: 32.0 2023-09-30 04:17:41,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:17:44,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:17:45,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 04:17:47,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:17:47,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:17:49,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:17:51,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:17:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:56,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:17:59,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:18:00,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 04:18:01,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=593693.3333333334, ans=0.125 2023-09-30 04:18:02,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:18:02,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=593693.3333333334, ans=0.125 2023-09-30 04:18:03,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:18:07,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:09,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:18:12,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 04:18:13,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.80 vs. limit=22.5 2023-09-30 04:18:14,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 04:18:14,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 04:18:14,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=593760.0, ans=0.0 2023-09-30 04:18:17,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:18:22,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 04:18:25,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:27,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:29,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=593826.6666666666, ans=0.0 2023-09-30 04:18:32,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:18:32,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:18:32,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:18:35,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:18:38,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 04:18:38,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:18:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:43,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 04:18:46,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=593893.3333333334, ans=0.07 2023-09-30 04:18:47,004 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.62 vs. limit=15.0 2023-09-30 04:18:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:18:50,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=593893.3333333334, ans=0.1 2023-09-30 04:18:51,328 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=8.02 vs. limit=15.0 2023-09-30 04:18:55,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 04:18:56,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:18:56,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:19:00,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 04:19:00,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 04:19:00,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:01,729 INFO [train.py:1039] (3/4) Epoch 17, batch 4100, loss[loss=0.1785, simple_loss=0.2472, pruned_loss=0.0549, over 23640.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2565, pruned_loss=0.05321, over 4720492.04 frames. ], batch size: 149, lr: 6.02e-03, grad_scale: 32.0 2023-09-30 04:19:01,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:03,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:03,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:19:09,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 04:19:09,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 04:19:11,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 04:19:12,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 04:19:12,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:13,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:13,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:19:15,181 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 04:19:15,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=593960.0, ans=0.125 2023-09-30 04:19:20,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:20,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:19:20,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:19:21,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:19:24,780 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.800e+02 1.954e+02 2.140e+02 3.094e+02, threshold=3.909e+02, percent-clipped=0.0 2023-09-30 04:19:25,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:19:26,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:19:26,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:19:28,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 04:19:30,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:30,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:19:30,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:32,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:19:32,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 04:19:35,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:19:35,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 04:19:38,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:19:40,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:19:40,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 04:19:41,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:19:43,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:19:43,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:19:46,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 04:19:46,560 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=594093.3333333334, ans=0.1 2023-09-30 04:19:47,340 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.32 vs. limit=15.0 2023-09-30 04:19:49,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:19:51,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:19:53,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 04:19:55,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:19:55,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:19:58,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:01,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:01,973 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=594160.0, ans=0.1 2023-09-30 04:20:05,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:07,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:20:13,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:13,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:20:14,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.93 vs. limit=22.5 2023-09-30 04:20:14,476 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=22.43 vs. limit=22.5 2023-09-30 04:20:16,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:20:20,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:20:23,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:20:25,116 INFO [train.py:1039] (3/4) Epoch 17, batch 4150, loss[loss=0.1822, simple_loss=0.2672, pruned_loss=0.04858, over 24287.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2573, pruned_loss=0.05356, over 4716245.65 frames. ], batch size: 74, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:20:25,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:20:25,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:20:25,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:29,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 04:20:29,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:29,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=594293.3333333334, ans=0.2 2023-09-30 04:20:30,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 04:20:30,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 04:20:30,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 04:20:33,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:20:37,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:20:37,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:41,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:20:41,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:20:42,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:20:43,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:20:44,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:20:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:20:49,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=594360.0, ans=10.0 2023-09-30 04:20:50,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:20:55,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:20:57,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 04:20:59,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 04:20:59,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:21:01,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 04:21:01,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:21:01,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:04,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:05,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:10,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 04:21:14,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:15,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:21:15,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=594493.3333333334, ans=0.125 2023-09-30 04:21:17,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 04:21:17,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:21:20,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 04:21:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:21:22,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:21:25,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:26,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 04:21:26,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:21:26,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 04:21:28,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:21:29,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=594493.3333333334, ans=0.0 2023-09-30 04:21:30,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 04:21:30,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:30,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:21:32,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:21:34,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 04:21:34,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:21:34,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 04:21:35,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:21:36,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.54 vs. limit=12.0 2023-09-30 04:21:37,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:21:37,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 04:21:38,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 04:21:45,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:21:45,878 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:21:47,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 04:21:48,365 INFO [train.py:1039] (3/4) Epoch 17, batch 4200, loss[loss=0.1801, simple_loss=0.2616, pruned_loss=0.04934, over 24470.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.257, pruned_loss=0.05356, over 4724294.72 frames. ], batch size: 66, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:21:48,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:21:52,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:21:53,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:21:55,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:55,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:21:56,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 04:22:00,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 04:22:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:03,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:08,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:22:12,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.960e+02 2.323e+02 2.795e+02 4.279e+02, threshold=4.646e+02, percent-clipped=2.0 2023-09-30 04:22:12,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:22:15,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:15,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:16,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 04:22:16,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:22:18,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:18,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:22:18,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:22:20,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:22:23,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 04:22:23,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:22:27,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:22:27,968 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=594760.0, ans=0.1 2023-09-30 04:22:29,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:22:29,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:22:32,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:22:34,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:22:34,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 04:22:35,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:22:35,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:22:41,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:22:41,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=594826.6666666666, ans=0.125 2023-09-30 04:22:42,726 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=594826.6666666666, ans=0.0 2023-09-30 04:22:44,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:22:49,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:22:52,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 04:22:54,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:00,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:23:00,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:00,733 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594893.3333333334, ans=0.1 2023-09-30 04:23:03,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 04:23:10,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:23:12,279 INFO [train.py:1039] (3/4) Epoch 17, batch 4250, loss[loss=0.169, simple_loss=0.2341, pruned_loss=0.05189, over 23326.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2548, pruned_loss=0.05333, over 4708145.92 frames. ], batch size: 285, lr: 6.02e-03, grad_scale: 16.0 2023-09-30 04:23:15,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:23:15,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:23:17,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:19,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=594960.0, ans=0.1 2023-09-30 04:23:23,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:23:25,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 04:23:25,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:23:28,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:32,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:36,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:36,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:40,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:23:40,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:23:40,732 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=595026.6666666666, ans=0.125 2023-09-30 04:23:41,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:41,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:44,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:46,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:23:48,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:23:50,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 04:23:52,079 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:23:53,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 04:23:53,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:53,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:23:53,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:23:55,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:23:55,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:23:55,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:23:58,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:24:00,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:24:04,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=595160.0, ans=0.125 2023-09-30 04:24:05,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:07,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:07,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 04:24:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:24:09,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 04:24:09,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=595160.0, ans=0.125 2023-09-30 04:24:11,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:24:12,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:24:14,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:15,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:24:18,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 04:24:19,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:24:19,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:24:25,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:24:28,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:30,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:24:30,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:24:31,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:33,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:24:34,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:24:34,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 04:24:35,510 INFO [train.py:1039] (3/4) Epoch 17, batch 4300, loss[loss=0.1975, simple_loss=0.2685, pruned_loss=0.0632, over 23551.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2549, pruned_loss=0.05296, over 4712910.12 frames. ], batch size: 106, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:24:35,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:39,761 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=595293.3333333334, ans=0.125 2023-09-30 04:24:40,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:24:40,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:24:41,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=595293.3333333334, ans=0.125 2023-09-30 04:24:46,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:24:52,947 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=595360.0, ans=0.125 2023-09-30 04:24:55,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:24:55,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 04:24:57,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:24:57,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=595360.0, ans=0.0 2023-09-30 04:24:59,971 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.465e+02 1.855e+02 2.115e+02 2.472e+02 4.142e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 04:25:00,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:25:00,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:25:00,163 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 04:25:01,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.min_abs, batch_count=595360.0, ans=0.5 2023-09-30 04:25:03,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:25:04,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:08,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 04:25:08,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:25:09,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 04:25:11,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:25:15,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:25:15,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=595426.6666666666, ans=0.0 2023-09-30 04:25:18,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:25:18,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:25:18,805 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=595426.6666666666, ans=0.125 2023-09-30 04:25:19,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:25:21,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:23,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:25:23,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 04:25:24,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 04:25:26,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:25:27,480 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=14.83 vs. limit=15.0 2023-09-30 04:25:28,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=595493.3333333334, ans=0.125 2023-09-30 04:25:29,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:29,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:25:30,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:30,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:25:31,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 04:25:31,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 04:25:33,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 04:25:34,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:25:34,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 04:25:36,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 04:25:38,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=595493.3333333334, ans=0.2 2023-09-30 04:25:39,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:41,063 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 04:25:41,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:25:44,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:44,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:25:46,418 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 04:25:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:25:47,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:49,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:25:49,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:51,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:25:53,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:25:53,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:25:54,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:25:54,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:25:57,828 INFO [train.py:1039] (3/4) Epoch 17, batch 4350, loss[loss=0.1836, simple_loss=0.2503, pruned_loss=0.05842, over 23415.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2557, pruned_loss=0.05305, over 4722394.37 frames. ], batch size: 119, lr: 6.02e-03, grad_scale: 8.0 2023-09-30 04:26:00,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 04:26:01,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:26:06,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:08,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=595626.6666666666, ans=0.125 2023-09-30 04:26:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:11,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:26:11,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:26:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:26:20,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:26:24,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:26:24,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:26:27,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:26:30,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:26:32,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:26:38,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 04:26:39,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:26:40,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=595760.0, ans=0.04949747468305833 2023-09-30 04:26:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:44,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:26:46,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=595826.6666666666, ans=0.125 2023-09-30 04:26:47,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 04:26:51,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:26:52,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:26:59,272 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 04:27:00,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:00,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:27:02,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.05 vs. limit=15.0 2023-09-30 04:27:03,330 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=5.26 vs. limit=15.0 2023-09-30 04:27:03,811 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 04:27:03,943 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 04:27:03,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:03,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:05,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:27:05,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:07,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:27:07,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:10,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 04:27:10,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:10,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:10,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:10,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=595893.3333333334, ans=0.07 2023-09-30 04:27:11,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 04:27:13,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 04:27:13,947 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 04:27:13,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 04:27:18,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:27:19,871 INFO [train.py:1039] (3/4) Epoch 17, batch 4400, loss[loss=0.1695, simple_loss=0.2441, pruned_loss=0.04742, over 19932.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2567, pruned_loss=0.05379, over 4711722.28 frames. ], batch size: 43, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:27:19,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:27:19,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:21,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:27:23,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 04:27:24,632 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 04:27:24,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:28,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:28,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:30,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:27:31,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 04:27:33,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 04:27:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 04:27:33,902 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 04:27:35,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:27:35,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:27:38,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 04:27:40,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:41,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:41,843 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 04:27:44,837 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.901e+02 2.144e+02 2.567e+02 3.604e+02, threshold=4.289e+02, percent-clipped=0.0 2023-09-30 04:27:44,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:45,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 04:27:46,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 04:27:48,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 04:27:48,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 04:27:50,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 04:27:50,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:27:52,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:27:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:27:54,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 04:27:54,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 04:27:55,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:27:57,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:27:57,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:27:59,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:01,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:28:01,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 04:28:01,159 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 04:28:01,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=596093.3333333334, ans=0.2 2023-09-30 04:28:01,419 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=596093.3333333334, ans=0.125 2023-09-30 04:28:05,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:13,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:28:17,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 04:28:20,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:28:23,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:26,466 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.34 vs. limit=22.5 2023-09-30 04:28:27,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:28:27,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 04:28:27,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:28:27,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:28:27,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:28:27,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=596226.6666666666, ans=0.1 2023-09-30 04:28:27,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=596226.6666666666, ans=0.0 2023-09-30 04:28:29,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:28:32,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 04:28:35,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 04:28:36,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 04:28:36,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:28:37,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 04:28:38,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:28:40,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:28:44,020 INFO [train.py:1039] (3/4) Epoch 17, batch 4450, loss[loss=0.1799, simple_loss=0.2633, pruned_loss=0.04832, over 24457.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2583, pruned_loss=0.05386, over 4720270.98 frames. ], batch size: 69, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:28:44,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 04:28:47,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:28:48,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:28:48,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:28:55,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:28:57,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:29:00,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:03,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:29:03,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:29:03,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:07,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 04:29:07,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:07,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:07,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:07,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:29:10,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:29:11,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=596360.0, ans=0.0 2023-09-30 04:29:16,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:16,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:29:17,243 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=11.87 vs. limit=15.0 2023-09-30 04:29:18,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:29:18,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:29:21,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=596426.6666666666, ans=0.5 2023-09-30 04:29:21,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=596426.6666666666, ans=0.0 2023-09-30 04:29:21,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=596426.6666666666, ans=0.125 2023-09-30 04:29:22,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:29:23,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=596426.6666666666, ans=0.04949747468305833 2023-09-30 04:29:24,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 04:29:24,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 04:29:24,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:29:29,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:30,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 04:29:31,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=596426.6666666666, ans=0.0 2023-09-30 04:29:34,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:29:38,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=596493.3333333334, ans=0.035 2023-09-30 04:29:38,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=596493.3333333334, ans=0.125 2023-09-30 04:29:40,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:41,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 04:29:41,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:41,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:41,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:29:41,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:29:43,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:29:47,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:29:47,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 04:29:50,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:29:51,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:29:53,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:29:54,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:29:56,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:29:59,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:30:01,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 04:30:01,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=596560.0, ans=0.0 2023-09-30 04:30:02,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:30:03,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=596560.0, ans=0.0 2023-09-30 04:30:07,635 INFO [train.py:1039] (3/4) Epoch 17, batch 4500, loss[loss=0.1849, simple_loss=0.2555, pruned_loss=0.05714, over 23864.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2579, pruned_loss=0.05376, over 4722474.72 frames. ], batch size: 179, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:30:07,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:09,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 04:30:09,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 04:30:11,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:19,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:30:20,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:30:21,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:30:23,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:30:23,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:23,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:30:32,274 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.445e+02 1.892e+02 2.101e+02 2.322e+02 3.249e+02, threshold=4.203e+02, percent-clipped=0.0 2023-09-30 04:30:35,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:30:35,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:30:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:30:40,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:30:41,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:30:49,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:30:52,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:30:56,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:31:00,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:31:01,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 04:31:01,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:03,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:03,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=596826.6666666666, ans=0.1 2023-09-30 04:31:04,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:06,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:31:07,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:31:07,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 04:31:07,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:31:07,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:14,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:31:14,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:31:19,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:22,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:31:22,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:31:25,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 04:31:25,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 04:31:25,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 04:31:28,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=596893.3333333334, ans=0.125 2023-09-30 04:31:30,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 04:31:31,540 INFO [train.py:1039] (3/4) Epoch 17, batch 4550, loss[loss=0.1579, simple_loss=0.2406, pruned_loss=0.03765, over 24469.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2571, pruned_loss=0.05342, over 4719711.38 frames. ], batch size: 66, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:31:34,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 04:31:35,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:38,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:39,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:31:43,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:43,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=596960.0, ans=0.125 2023-09-30 04:31:48,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:31:49,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:31:51,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:31:52,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:31:52,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:31:53,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:31:53,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:31:57,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:31:57,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=597026.6666666666, ans=0.1 2023-09-30 04:32:00,427 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 04:32:01,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 04:32:03,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:32:04,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 04:32:07,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 04:32:08,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:11,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 04:32:12,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=597093.3333333334, ans=0.125 2023-09-30 04:32:13,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:32:13,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=597093.3333333334, ans=0.2 2023-09-30 04:32:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:16,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:32:16,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=597093.3333333334, ans=0.125 2023-09-30 04:32:18,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 04:32:21,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:25,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:25,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:32:26,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:26,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 04:32:28,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 04:32:28,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:32:30,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 04:32:31,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 04:32:31,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:32:35,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:35,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:32:37,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:37,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:32:39,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:32:40,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 04:32:42,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:32:43,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:32:43,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 04:32:43,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:32:43,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 04:32:46,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:32:47,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:32:50,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:32:50,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:32:50,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:32:52,980 INFO [train.py:1039] (3/4) Epoch 17, batch 4600, loss[loss=0.1874, simple_loss=0.2616, pruned_loss=0.05659, over 23743.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2557, pruned_loss=0.05317, over 4713227.48 frames. ], batch size: 179, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:32:53,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:32:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:32:57,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:32:59,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:33:03,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:33:03,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:33:05,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:05,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 04:33:08,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:33:13,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:33:13,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:17,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:18,330 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.818e+02 2.016e+02 2.179e+02 3.781e+02, threshold=4.032e+02, percent-clipped=0.0 2023-09-30 04:33:18,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=597360.0, ans=0.125 2023-09-30 04:33:20,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=597360.0, ans=0.0 2023-09-30 04:33:22,065 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=597360.0, ans=0.125 2023-09-30 04:33:23,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 04:33:23,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:25,844 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.49 vs. limit=22.5 2023-09-30 04:33:26,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:29,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:33:30,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:33:36,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 04:33:36,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:33:37,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:33:41,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=597493.3333333334, ans=0.125 2023-09-30 04:33:42,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:33:44,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:33:46,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:33:48,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=597493.3333333334, ans=0.125 2023-09-30 04:33:50,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 04:33:51,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:33:56,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:33:57,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:00,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 04:34:00,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:00,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 04:34:00,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:00,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:03,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:03,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:34:04,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:05,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 04:34:06,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 04:34:07,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 04:34:07,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:09,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:09,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:10,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:34:15,489 INFO [train.py:1039] (3/4) Epoch 17, batch 4650, loss[loss=0.1624, simple_loss=0.2456, pruned_loss=0.03963, over 24468.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2558, pruned_loss=0.05312, over 4726614.09 frames. ], batch size: 66, lr: 6.01e-03, grad_scale: 16.0 2023-09-30 04:34:20,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:34:26,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:26,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:27,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:34:27,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:34:27,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:34:27,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:34:32,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 04:34:32,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=597693.3333333334, ans=0.2 2023-09-30 04:34:35,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:34:37,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=597693.3333333334, ans=0.125 2023-09-30 04:34:38,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 04:34:38,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:34:40,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 04:34:40,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:34:41,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 04:34:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 04:34:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:43,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:34:44,336 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=597693.3333333334, ans=0.1 2023-09-30 04:34:45,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:34:47,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:47,312 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 04:34:50,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:34:52,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 04:34:52,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=597760.0, ans=0.0 2023-09-30 04:34:56,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:34:56,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:34:56,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 04:34:59,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:00,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:35:05,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:07,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=597826.6666666666, ans=0.0 2023-09-30 04:35:10,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:13,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:15,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:35:15,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:35:18,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 04:35:18,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 04:35:20,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 04:35:20,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 04:35:20,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=597893.3333333334, ans=0.0 2023-09-30 04:35:22,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:30,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:35:30,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:31,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 04:35:31,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:32,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:32,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:35:32,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=597893.3333333334, ans=0.125 2023-09-30 04:35:34,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:35:35,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:35:35,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:35:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:35:39,574 INFO [train.py:1039] (3/4) Epoch 17, batch 4700, loss[loss=0.1779, simple_loss=0.2491, pruned_loss=0.0533, over 23656.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2562, pruned_loss=0.05322, over 4724387.38 frames. ], batch size: 149, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:35:39,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:39,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:35:41,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:35:42,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 04:35:44,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:35:44,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 04:35:53,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:35:55,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:35:55,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:35:56,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:35:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 04:36:03,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 04:36:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 04:36:04,679 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.407e+02 1.853e+02 2.008e+02 2.274e+02 4.210e+02, threshold=4.016e+02, percent-clipped=1.0 2023-09-30 04:36:06,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:08,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:36:08,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:36:12,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:17,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:36:19,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 04:36:22,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:36:22,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=598093.3333333334, ans=0.0 2023-09-30 04:36:24,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=598093.3333333334, ans=0.0 2023-09-30 04:36:29,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.79 vs. limit=22.5 2023-09-30 04:36:32,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 04:36:33,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:36:35,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:38,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 04:36:38,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:36:39,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=598160.0, ans=0.1 2023-09-30 04:36:43,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:36:43,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 04:36:46,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:46,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:50,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:36:50,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:36:51,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 04:36:52,034 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 04:36:54,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:36:55,172 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 04:36:57,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:36:57,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 04:36:58,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:37:00,905 INFO [train.py:1039] (3/4) Epoch 17, batch 4750, loss[loss=0.1866, simple_loss=0.2665, pruned_loss=0.05335, over 23340.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2562, pruned_loss=0.05317, over 4719676.11 frames. ], batch size: 93, lr: 6.00e-03, grad_scale: 16.0 2023-09-30 04:37:03,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 04:37:05,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=598293.3333333334, ans=0.125 2023-09-30 04:37:06,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:37:06,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:12,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:37:15,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 04:37:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:19,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 04:37:20,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=598360.0, ans=0.2 2023-09-30 04:37:21,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:37:21,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:37:22,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:26,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 04:37:32,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:37:34,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 04:37:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:37:39,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:37:39,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:37:39,890 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 04:37:39,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 04:37:41,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.55 vs. limit=15.0 2023-09-30 04:37:45,780 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.80 vs. limit=15.0 2023-09-30 04:37:46,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 04:37:48,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:37:51,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:37:54,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:37:54,654 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 04:37:54,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:37:56,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=598493.3333333334, ans=0.125 2023-09-30 04:37:59,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:38:01,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:38:03,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 04:38:03,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 04:38:03,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:04,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:38:04,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:06,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:38:06,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 04:38:06,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=598560.0, ans=0.07 2023-09-30 04:38:09,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 04:38:11,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:15,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:38:15,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 04:38:16,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:21,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:38:21,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 04:38:24,081 INFO [train.py:1039] (3/4) Epoch 17, batch 4800, loss[loss=0.1889, simple_loss=0.2613, pruned_loss=0.05822, over 23545.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2575, pruned_loss=0.05411, over 4717689.48 frames. ], batch size: 134, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:38:26,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:26,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 04:38:26,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 04:38:28,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 04:38:31,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:38:32,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:38:34,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 04:38:39,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:39,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:44,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:38:47,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:38:47,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:38:47,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 04:38:49,111 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.366e+02 1.827e+02 2.030e+02 2.375e+02 4.462e+02, threshold=4.061e+02, percent-clipped=1.0 2023-09-30 04:38:49,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:38:49,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:38:50,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:38:53,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=598693.3333333334, ans=0.2 2023-09-30 04:38:53,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=598693.3333333334, ans=0.125 2023-09-30 04:38:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:38:56,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:56,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:38:57,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:38:57,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 04:38:57,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:38:59,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:02,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:05,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:39:07,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:39:07,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 04:39:09,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:11,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 04:39:11,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 04:39:12,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:12,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:39:12,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:39:12,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:12,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:39:15,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.15 vs. limit=15.0 2023-09-30 04:39:16,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:39:16,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:39:19,802 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:23,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:23,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:28,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 04:39:29,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:29,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:29,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:39:30,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:33,156 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.89 vs. limit=12.0 2023-09-30 04:39:33,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:39:35,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:39:35,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:36,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:39:37,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:39:37,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:39:42,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:39:43,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:43,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:39:44,652 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.57 vs. limit=15.0 2023-09-30 04:39:45,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 04:39:46,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.69 vs. limit=22.5 2023-09-30 04:39:48,820 INFO [train.py:1039] (3/4) Epoch 17, batch 4850, loss[loss=0.1707, simple_loss=0.2377, pruned_loss=0.0519, over 23584.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2578, pruned_loss=0.05463, over 4710595.74 frames. ], batch size: 135, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:39:48,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 04:39:48,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:48,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:39:49,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:39:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:39:52,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:39:57,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=598960.0, ans=0.05 2023-09-30 04:40:01,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 04:40:03,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:09,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:11,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 04:40:11,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:14,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:40:16,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:40:17,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:40:17,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 04:40:23,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:40:25,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:40:26,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 04:40:26,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:40:26,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 04:40:29,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:40:29,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 04:40:33,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 04:40:35,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:40:36,071 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=19.98 vs. limit=22.5 2023-09-30 04:40:42,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:40:43,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 04:40:45,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:40:45,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:40:48,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:40:48,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 04:40:48,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:40:48,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=599160.0, ans=0.125 2023-09-30 04:40:49,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 04:40:49,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:40:51,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:40:51,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=599160.0, ans=0.125 2023-09-30 04:40:53,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 04:41:03,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:08,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:41:08,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:11,687 INFO [train.py:1039] (3/4) Epoch 17, batch 4900, loss[loss=0.1619, simple_loss=0.2499, pruned_loss=0.03699, over 24481.00 frames. ], tot_loss[loss=0.182, simple_loss=0.256, pruned_loss=0.05403, over 4701363.46 frames. ], batch size: 66, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:41:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 04:41:15,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:41:19,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:20,197 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=599293.3333333334, ans=0.125 2023-09-30 04:41:21,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:21,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:41:24,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 04:41:27,013 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.47 vs. limit=15.0 2023-09-30 04:41:30,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 04:41:35,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 04:41:36,396 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.673e+02 1.955e+02 2.270e+02 2.616e+02 3.974e+02, threshold=4.540e+02, percent-clipped=0.0 2023-09-30 04:41:36,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 04:41:36,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:36,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:41:36,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:41:36,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:36,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:41:36,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=599360.0, ans=0.125 2023-09-30 04:41:38,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 04:41:43,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 04:41:43,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:41:44,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:41:45,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:41:46,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:41:48,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:41:48,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:41:48,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 04:41:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:41:52,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:41:52,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 04:41:52,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 04:41:56,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 04:41:57,548 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.71 vs. limit=15.0 2023-09-30 04:41:58,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:41:59,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:41:59,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:42:01,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:01,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 04:42:01,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:42:03,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 04:42:05,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:05,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=599493.3333333334, ans=0.125 2023-09-30 04:42:07,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:42:09,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:42:15,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 04:42:15,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:42:15,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 04:42:15,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=599493.3333333334, ans=0.2 2023-09-30 04:42:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 04:42:18,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=599560.0, ans=0.0 2023-09-30 04:42:21,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:23,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:42:25,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 04:42:25,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:26,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:42:26,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=599560.0, ans=0.2 2023-09-30 04:42:28,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:31,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:31,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:42:31,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:42:32,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 04:42:34,752 INFO [train.py:1039] (3/4) Epoch 17, batch 4950, loss[loss=0.1764, simple_loss=0.2385, pruned_loss=0.05712, over 23564.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2546, pruned_loss=0.05366, over 4698886.47 frames. ], batch size: 285, lr: 6.00e-03, grad_scale: 32.0 2023-09-30 04:42:34,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:42:39,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:39,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 04:42:40,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=599626.6666666666, ans=0.125 2023-09-30 04:42:42,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 04:42:44,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 04:42:44,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:42:45,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 04:42:45,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:45,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:42:47,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:42:47,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:42:49,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:42:49,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:42:51,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:42:52,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:42:55,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:42:55,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:42:59,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 04:43:04,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=599693.3333333334, ans=0.2 2023-09-30 04:43:05,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:06,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:43:08,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:08,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:11,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:43:12,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 04:43:12,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 04:43:15,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:18,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:43:18,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:43:20,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:43:21,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:43:21,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 04:43:21,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=599826.6666666666, ans=0.125 2023-09-30 04:43:24,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:26,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:43:28,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:43:31,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:43:31,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:31,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 04:43:33,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:43:35,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:43:37,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=10.39 vs. limit=15.0 2023-09-30 04:43:38,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:43:39,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:43:39,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:43:40,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:43:41,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:43:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:43:44,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:43:44,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:43:45,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:43:46,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 04:43:51,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:43:55,623 INFO [train.py:1039] (3/4) Epoch 17, batch 5000, loss[loss=0.1877, simple_loss=0.2745, pruned_loss=0.05049, over 24014.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2551, pruned_loss=0.05308, over 4709009.19 frames. ], batch size: 80, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:43:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 04:43:57,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 04:44:05,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:05,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:07,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 04:44:07,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 04:44:09,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:44:11,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 04:44:11,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:44:11,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:44:12,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 04:44:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:16,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:16,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 04:44:17,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:17,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:19,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=600026.6666666666, ans=0.125 2023-09-30 04:44:20,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 04:44:20,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 04:44:20,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:44:22,122 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.867e+02 2.091e+02 2.376e+02 3.821e+02, threshold=4.182e+02, percent-clipped=0.0 2023-09-30 04:44:22,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 04:44:22,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:44:23,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:23,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:44:23,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 04:44:24,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 04:44:26,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 04:44:27,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:27,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:29,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 04:44:29,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:44:32,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:32,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=600093.3333333334, ans=0.125 2023-09-30 04:44:34,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:44:34,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:44:36,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 04:44:36,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:44:39,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:44:42,436 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 04:44:44,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:44:46,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:44:46,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:44:49,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 04:44:49,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:44:51,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:44:51,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:44:52,621 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.67 vs. limit=12.0 2023-09-30 04:44:53,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 04:44:53,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:56,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:44:56,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=600160.0, ans=0.07 2023-09-30 04:44:58,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:03,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 04:45:06,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:18,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:45:19,648 INFO [train.py:1039] (3/4) Epoch 17, batch 5050, loss[loss=0.175, simple_loss=0.2457, pruned_loss=0.05211, over 23457.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2558, pruned_loss=0.05289, over 4710463.53 frames. ], batch size: 134, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:45:19,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:19,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:45:19,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:19,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:45:19,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:45:19,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:45:23,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 04:45:25,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:45:28,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:45:29,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:45:30,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 04:45:31,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:45:33,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=600293.3333333334, ans=0.0 2023-09-30 04:45:34,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 04:45:36,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:45:36,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:45:46,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 04:45:46,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 04:45:48,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:45:48,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 04:45:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:45:50,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:50,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:45:51,369 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.98 vs. limit=22.5 2023-09-30 04:45:51,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:45:51,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 04:45:51,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 04:45:53,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:45:55,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:45:59,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.38 vs. limit=15.0 2023-09-30 04:46:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:46:00,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 04:46:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:05,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 04:46:07,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:46:07,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:46:07,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:08,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:46:11,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:13,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:46:15,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:15,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:46:15,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:46:15,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 04:46:15,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:46:17,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:46:20,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:46:20,278 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 04:46:20,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:46:23,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:25,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:25,374 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 04:46:29,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:29,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 04:46:29,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:46:35,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:46:35,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 04:46:37,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 04:46:38,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:40,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:46:40,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 04:46:42,039 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 04:46:42,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=600626.6666666666, ans=0.125 2023-09-30 04:46:43,303 INFO [train.py:1039] (3/4) Epoch 17, batch 5100, loss[loss=0.1892, simple_loss=0.2534, pruned_loss=0.06253, over 23672.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.256, pruned_loss=0.05317, over 4699354.68 frames. ], batch size: 232, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:46:44,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:46:48,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 04:46:49,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 04:46:51,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:46:52,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:46:53,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=600626.6666666666, ans=10.0 2023-09-30 04:46:55,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:46:56,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 04:46:56,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 04:47:02,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:47:04,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:47:08,741 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.763e+02 1.981e+02 2.119e+02 3.450e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 04:47:08,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:47:09,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=600693.3333333334, ans=0.1 2023-09-30 04:47:12,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 04:47:14,287 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=11.11 vs. limit=15.0 2023-09-30 04:47:14,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:16,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:47:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 04:47:19,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:19,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 04:47:22,861 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 04:47:22,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:24,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 04:47:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 04:47:26,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:47:28,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=600760.0, ans=0.2 2023-09-30 04:47:28,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=600760.0, ans=0.0 2023-09-30 04:47:31,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=600826.6666666666, ans=0.2 2023-09-30 04:47:37,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:47:39,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 04:47:39,520 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 04:47:39,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 04:47:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 04:47:41,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:47:41,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=600826.6666666666, ans=0.125 2023-09-30 04:47:44,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 04:47:50,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 04:47:53,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 04:47:54,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 04:47:56,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=600893.3333333334, ans=0.125 2023-09-30 04:47:57,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 04:47:59,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:47:59,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 04:48:05,690 INFO [train.py:1039] (3/4) Epoch 17, batch 5150, loss[loss=0.2471, simple_loss=0.3074, pruned_loss=0.09338, over 19587.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.257, pruned_loss=0.05354, over 4705991.47 frames. ], batch size: 389, lr: 5.99e-03, grad_scale: 16.0 2023-09-30 04:48:05,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:48:05,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:48:05,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:48:05,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:48:06,864 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=7.45 vs. limit=15.0 2023-09-30 04:48:07,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 04:48:07,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:48:08,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 04:48:08,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 04:48:09,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 04:48:09,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:48:09,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 04:48:11,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:12,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 04:48:12,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:14,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:48:20,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:48:20,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 04:48:21,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=601026.6666666666, ans=0.1 2023-09-30 04:48:22,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:48:22,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 04:48:26,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 04:48:26,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:26,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:26,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:48:26,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:48:27,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 04:48:28,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:48:29,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:48:30,500 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=17.44 vs. limit=15.0 2023-09-30 04:48:31,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 04:48:32,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 04:48:34,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:48:40,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:48:42,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 04:48:46,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:48:50,794 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=601093.3333333334, ans=0.125 2023-09-30 04:48:52,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:48:53,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:00,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:03,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 04:49:03,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=601160.0, ans=0.2 2023-09-30 04:49:06,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:49:06,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 04:49:08,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 04:49:11,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.73 vs. limit=12.0 2023-09-30 04:49:11,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:11,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:49:13,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 04:49:18,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:49:20,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:49:23,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:49:23,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:49:23,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:49:24,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:49:24,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 04:49:25,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:49:27,927 INFO [train.py:1039] (3/4) Epoch 17, batch 5200, loss[loss=0.1919, simple_loss=0.2792, pruned_loss=0.05226, over 24391.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2581, pruned_loss=0.05439, over 4698179.46 frames. ], batch size: 77, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:49:29,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:49:31,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:49:35,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:40,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 04:49:42,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:49:44,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:46,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:49:47,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 04:49:47,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:49:50,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 04:49:50,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.50 vs. limit=15.0 2023-09-30 04:49:53,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 04:49:53,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:49:55,064 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.818e+02 1.970e+02 2.155e+02 3.146e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 04:49:55,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 04:49:57,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 04:49:59,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 04:49:59,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 04:50:00,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 04:50:03,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 04:50:04,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:04,524 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 04:50:05,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:50:06,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:07,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:50:08,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 04:50:09,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:10,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:14,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 04:50:14,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 04:50:14,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 04:50:20,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 04:50:21,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:50:27,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:50:27,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:28,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 04:50:30,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:50:30,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 04:50:30,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:30,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:50:35,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:36,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:50:40,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:50:41,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:50:41,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:46,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:50:48,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 04:50:49,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=601560.0, ans=0.0 2023-09-30 04:50:50,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 04:50:50,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:50:50,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:50:51,692 INFO [train.py:1039] (3/4) Epoch 17, batch 5250, loss[loss=0.1845, simple_loss=0.2713, pruned_loss=0.04882, over 24306.00 frames. ], tot_loss[loss=0.1831, simple_loss=0.2572, pruned_loss=0.05452, over 4691121.72 frames. ], batch size: 74, lr: 5.99e-03, grad_scale: 32.0 2023-09-30 04:50:51,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 04:50:53,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:50:56,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:50:57,643 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.03 vs. limit=15.0 2023-09-30 04:51:00,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:00,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:51:01,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:51:07,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:51:07,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:51:11,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:51:13,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:51:15,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 04:51:15,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:51:18,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:51:33,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=601760.0, ans=0.1 2023-09-30 04:51:52,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=601893.3333333334, ans=0.0 2023-09-30 04:51:58,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=601893.3333333334, ans=0.125 2023-09-30 04:51:58,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=601893.3333333334, ans=0.1 2023-09-30 04:52:06,463 INFO [train.py:1039] (3/4) Epoch 17, batch 5300, loss[loss=0.1963, simple_loss=0.2553, pruned_loss=0.0687, over 23768.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2559, pruned_loss=0.05385, over 4705438.76 frames. ], batch size: 164, lr: 5.98e-03, grad_scale: 16.0 2023-09-30 04:52:21,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:52:21,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 04:52:21,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 04:52:21,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:22,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:22,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:22,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:52:22,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:22,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 04:52:23,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:52:23,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 04:52:23,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 04:52:23,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 04:52:23,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 04:52:23,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 04:52:23,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 04:52:24,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:24,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:24,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:25,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:25,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:52:25,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:25,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:52:25,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:25,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:52:25,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:52:25,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:52:25,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:25,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:52:26,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 04:52:27,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:52:27,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:52:27,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 04:52:27,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 04:52:27,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 04:52:27,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:52:27,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 04:52:28,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 04:52:28,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:29,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:52:29,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:52:29,680 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 04:52:29,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 04:52:29,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:52:29,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:52:30,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 04:52:30,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 04:52:30,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 04:52:30,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 04:52:39,887 INFO [train.py:1039] (3/4) Epoch 18, batch 0, loss[loss=0.18, simple_loss=0.2516, pruned_loss=0.05423, over 23646.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2516, pruned_loss=0.05423, over 23646.00 frames. ], batch size: 149, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:52:39,888 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 04:52:53,323 INFO [train.py:1071] (3/4) Epoch 18, validation: loss=0.3168, simple_loss=0.2865, pruned_loss=0.1735, over 1125622.00 frames. 2023-09-30 04:52:53,324 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 04:52:57,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 04:52:58,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:52:58,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602040.0, ans=0.1 2023-09-30 04:53:00,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:53:01,502 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.655e+02 1.872e+02 2.065e+02 2.362e+02 3.138e+02, threshold=4.130e+02, percent-clipped=0.0 2023-09-30 04:53:03,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=602040.0, ans=0.015 2023-09-30 04:53:06,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:06,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:53:06,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:08,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 04:53:09,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 04:53:11,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:11,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=602106.6666666666, ans=0.0 2023-09-30 04:53:13,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:53:16,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:16,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 04:53:16,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:19,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 04:53:19,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:53:29,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 04:53:29,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:53:31,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 04:53:34,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 04:53:34,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:53:36,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:37,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=602173.3333333334, ans=0.125 2023-09-30 04:53:41,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:53:44,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:53:51,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 04:53:52,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=602240.0, ans=0.0 2023-09-30 04:53:54,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=18.68 vs. limit=22.5 2023-09-30 04:53:54,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 04:53:55,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:53:55,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:53:57,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:53:59,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:01,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 04:54:03,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:05,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:54:07,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=602306.6666666666, ans=0.125 2023-09-30 04:54:10,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:13,555 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 04:54:13,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=602373.3333333334, ans=0.125 2023-09-30 04:54:15,520 INFO [train.py:1039] (3/4) Epoch 18, batch 50, loss[loss=0.1788, simple_loss=0.2615, pruned_loss=0.04806, over 24440.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2576, pruned_loss=0.05348, over 1071256.33 frames. ], batch size: 69, lr: 5.81e-03, grad_scale: 32.0 2023-09-30 04:54:16,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 04:54:20,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:20,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:54:21,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 04:54:21,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 04:54:21,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:54:23,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:25,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:54:26,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:54:29,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 04:54:29,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:37,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:54:37,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 04:54:41,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 04:54:43,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:54:44,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:54:44,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:46,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:54:48,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 04:54:48,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 04:54:48,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:54:53,718 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.59 vs. limit=15.0 2023-09-30 04:54:57,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:54:58,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:54:58,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 04:54:59,106 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=602506.6666666666, ans=0.0 2023-09-30 04:55:00,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 04:55:03,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 04:55:03,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:55:03,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 04:55:03,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:06,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 04:55:12,341 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=602573.3333333334, ans=0.125 2023-09-30 04:55:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:15,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:55:15,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:16,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:18,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:21,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 04:55:21,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 04:55:21,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 04:55:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:55:24,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:55:26,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 04:55:28,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 04:55:28,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 04:55:29,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:31,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:55:31,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 04:55:31,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 04:55:32,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:34,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:35,733 INFO [train.py:1039] (3/4) Epoch 18, batch 100, loss[loss=0.1913, simple_loss=0.2565, pruned_loss=0.06307, over 23828.00 frames. ], tot_loss[loss=0.1823, simple_loss=0.2579, pruned_loss=0.05339, over 1893006.63 frames. ], batch size: 179, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:55:35,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 04:55:35,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:55:38,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:55:42,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:55:44,981 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.549e+02 1.863e+02 2.072e+02 2.465e+02 3.411e+02, threshold=4.144e+02, percent-clipped=0.0 2023-09-30 04:55:45,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:48,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 04:55:48,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:55:51,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 04:55:51,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:51,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 04:55:51,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:55:51,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 04:55:52,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 04:55:54,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 04:55:55,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:55:55,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:55:55,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:55:56,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=602773.3333333334, ans=0.1 2023-09-30 04:56:00,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 04:56:01,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:02,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 04:56:05,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 04:56:06,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=602773.3333333334, ans=0.0 2023-09-30 04:56:10,202 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 04:56:10,229 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 04:56:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:56:15,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 04:56:16,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:56:18,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:24,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:25,657 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 04:56:27,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 04:56:27,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=602906.6666666666, ans=0.0 2023-09-30 04:56:30,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:56:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:56:35,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:38,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:40,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:56:43,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:56:46,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:47,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:49,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:49,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:56:51,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:56:51,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 04:56:51,466 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 04:56:51,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:56:53,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:56:53,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:53,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:53,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 04:56:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:56:55,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 04:56:55,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:56:56,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:56:56,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:56:57,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:56:58,448 INFO [train.py:1039] (3/4) Epoch 18, batch 150, loss[loss=0.1874, simple_loss=0.2632, pruned_loss=0.05581, over 23693.00 frames. ], tot_loss[loss=0.1834, simple_loss=0.2587, pruned_loss=0.0541, over 2514463.89 frames. ], batch size: 149, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:56:58,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:57:02,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:05,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 04:57:05,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:05,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:08,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:08,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:11,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 04:57:11,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:16,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 04:57:16,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 04:57:16,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 04:57:19,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:57:19,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 04:57:21,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:57:23,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:57:23,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:23,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=603106.6666666666, ans=0.0 2023-09-30 04:57:24,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:24,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:57:26,169 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 04:57:27,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:57:30,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=603173.3333333334, ans=0.04949747468305833 2023-09-30 04:57:32,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten.whitening_limit, batch_count=603173.3333333334, ans=15.0 2023-09-30 04:57:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:35,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=13.82 vs. limit=15.0 2023-09-30 04:57:39,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 04:57:39,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 04:57:42,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 04:57:43,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:57:43,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:57:44,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 04:57:48,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 04:57:48,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=603240.0, ans=0.125 2023-09-30 04:57:49,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:57:51,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:51,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=603240.0, ans=0.0 2023-09-30 04:57:51,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=603240.0, ans=0.125 2023-09-30 04:57:52,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 04:57:58,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:57:58,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:57:58,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 04:57:59,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 04:58:01,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:02,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 04:58:03,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=603306.6666666666, ans=0.5 2023-09-30 04:58:06,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 04:58:07,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 04:58:09,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:12,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 04:58:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 04:58:12,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 04:58:12,789 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 04:58:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:19,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=603373.3333333334, ans=0.2 2023-09-30 04:58:20,322 INFO [train.py:1039] (3/4) Epoch 18, batch 200, loss[loss=0.1806, simple_loss=0.2666, pruned_loss=0.04726, over 24658.00 frames. ], tot_loss[loss=0.1842, simple_loss=0.2597, pruned_loss=0.05438, over 3011326.75 frames. ], batch size: 73, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:58:20,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 04:58:20,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:58:23,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 04:58:23,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:58:23,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:27,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 04:58:30,021 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.857e+02 2.048e+02 2.282e+02 3.617e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 04:58:30,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 04:58:32,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:32,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:58:35,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 04:58:35,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 04:58:35,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:58:49,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=603440.0, ans=0.2 2023-09-30 04:58:56,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 04:58:57,344 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.15 vs. limit=15.0 2023-09-30 04:58:57,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 04:58:58,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 04:58:59,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:58:59,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 04:58:59,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 04:59:01,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:03,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 04:59:03,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:03,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:06,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 04:59:06,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 04:59:06,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:11,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 04:59:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 04:59:27,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:29,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 04:59:32,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=603640.0, ans=0.1 2023-09-30 04:59:35,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:38,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 04:59:38,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:39,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 04:59:39,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 04:59:40,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 04:59:41,935 INFO [train.py:1039] (3/4) Epoch 18, batch 250, loss[loss=0.1648, simple_loss=0.2454, pruned_loss=0.04208, over 24677.00 frames. ], tot_loss[loss=0.1836, simple_loss=0.2587, pruned_loss=0.05423, over 3380165.66 frames. ], batch size: 65, lr: 5.81e-03, grad_scale: 16.0 2023-09-30 04:59:42,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 04:59:44,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 04:59:44,141 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 04:59:45,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 04:59:47,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:47,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 04:59:52,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 04:59:52,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 04:59:53,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 04:59:56,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:05,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:09,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:00:09,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:00:11,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=603773.3333333334, ans=0.125 2023-09-30 05:00:15,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=603840.0, ans=0.0 2023-09-30 05:00:18,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:00:18,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:00:18,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=603840.0, ans=0.125 2023-09-30 05:00:20,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:00:20,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:22,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:00:22,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:00:22,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:00:26,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:00:29,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 05:00:29,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:00:31,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:00:32,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:00:32,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:00:32,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:00:35,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:00:35,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:00:37,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:38,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:00:38,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:42,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:00:45,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=603906.6666666666, ans=0.05 2023-09-30 05:00:47,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:00:50,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:00:56,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:00:58,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:01:00,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=603973.3333333334, ans=0.125 2023-09-30 05:01:03,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 05:01:04,800 INFO [train.py:1039] (3/4) Epoch 18, batch 300, loss[loss=0.1803, simple_loss=0.2422, pruned_loss=0.05925, over 23799.00 frames. ], tot_loss[loss=0.1822, simple_loss=0.2562, pruned_loss=0.05414, over 3677043.65 frames. ], batch size: 164, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:01:04,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:04,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:01:07,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 05:01:07,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:01:09,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:01:09,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 05:01:14,087 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.412e+02 1.790e+02 1.961e+02 2.231e+02 3.675e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 05:01:14,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:01:15,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:01:16,212 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=604040.0, ans=0.125 2023-09-30 05:01:18,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:01:20,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 05:01:20,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:01:22,303 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:01:23,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:01:23,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 05:01:23,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:27,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:01:34,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:01:34,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 05:01:38,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 05:01:38,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:39,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:01:42,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:42,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 05:01:42,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:01:44,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=604173.3333333334, ans=0.125 2023-09-30 05:01:44,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=604173.3333333334, ans=0.0 2023-09-30 05:01:45,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:01:46,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:01:47,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:01:50,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:01:50,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 05:01:52,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:01:55,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:01:58,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 05:01:59,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:03,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:06,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:02:06,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 05:02:10,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:11,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:02:13,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:14,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:02:16,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 05:02:16,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:02:16,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:16,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 05:02:19,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:02:20,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:21,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=604306.6666666666, ans=0.125 2023-09-30 05:02:22,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:22,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:23,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:25,350 INFO [train.py:1039] (3/4) Epoch 18, batch 350, loss[loss=0.1883, simple_loss=0.2679, pruned_loss=0.05435, over 24078.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2539, pruned_loss=0.05363, over 3905053.84 frames. ], batch size: 86, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:02:28,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:28,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:02:30,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:35,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:02:38,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=604373.3333333334, ans=0.1 2023-09-30 05:02:39,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 05:02:44,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:02:46,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 05:02:47,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:02:47,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 05:02:49,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:51,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 05:02:52,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:02:54,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:02:55,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:02:57,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:02:57,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:02:57,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:02:57,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:03:00,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:00,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:09,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:09,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:03:10,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:03:10,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:16,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 05:03:16,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:03:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:03:22,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:22,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:03:24,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 05:03:27,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:27,229 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 05:03:30,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 05:03:30,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:32,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=604640.0, ans=0.2 2023-09-30 05:03:33,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:03:33,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 05:03:36,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:39,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:03:41,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:03:43,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:43,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:43,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=604640.0, ans=0.1 2023-09-30 05:03:46,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:03:46,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=604706.6666666666, ans=0.0 2023-09-30 05:03:47,938 INFO [train.py:1039] (3/4) Epoch 18, batch 400, loss[loss=0.1932, simple_loss=0.262, pruned_loss=0.06227, over 23819.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2526, pruned_loss=0.0534, over 4068370.01 frames. ], batch size: 195, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:03:49,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:03:53,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:03:53,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 05:03:53,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:03:55,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:03:55,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:03:57,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:03:58,490 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.749e+02 1.897e+02 2.088e+02 3.470e+02, threshold=3.794e+02, percent-clipped=0.0 2023-09-30 05:04:00,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:00,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=604706.6666666666, ans=0.1 2023-09-30 05:04:01,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:03,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 05:04:04,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 05:04:04,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:06,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 05:04:06,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:11,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:04:11,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:12,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 05:04:12,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:04:12,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:04:12,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:15,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:04:17,313 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 05:04:18,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 05:04:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:04:23,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:25,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 05:04:27,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 05:04:30,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:04:32,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 05:04:40,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:04:41,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 05:04:43,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:04:46,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:04:46,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 05:04:50,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:04:53,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:04:55,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:04:57,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:04:59,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 05:05:02,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:05:02,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 05:05:05,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:05:05,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:05:07,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 05:05:10,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:05:10,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:05:10,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:05:11,915 INFO [train.py:1039] (3/4) Epoch 18, batch 450, loss[loss=0.1775, simple_loss=0.2445, pruned_loss=0.05531, over 23828.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.254, pruned_loss=0.05346, over 4201331.13 frames. ], batch size: 164, lr: 5.80e-03, grad_scale: 32.0 2023-09-30 05:05:12,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 05:05:13,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:05:13,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:05:13,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:05:13,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 05:05:15,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:05:16,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:05:18,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:05:28,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:29,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:05:31,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 05:05:33,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 05:05:35,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=605106.6666666666, ans=0.0 2023-09-30 05:05:37,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:05:38,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=605106.6666666666, ans=0.2 2023-09-30 05:05:40,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:40,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:43,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:44,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:05:45,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.67 vs. limit=15.0 2023-09-30 05:05:46,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 05:05:48,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 05:05:48,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 05:05:49,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:05:49,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:05:49,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:05:51,549 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 05:05:51,563 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 05:05:53,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:05:55,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:05:56,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:06:01,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:06:01,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:06:02,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:06:03,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 05:06:05,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:08,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:06:08,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:06:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 05:06:16,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:06:16,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 05:06:18,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 05:06:18,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:06:24,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:06:26,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:28,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:06:29,591 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 05:06:29,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=605306.6666666666, ans=0.0 2023-09-30 05:06:33,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:34,638 INFO [train.py:1039] (3/4) Epoch 18, batch 500, loss[loss=0.2014, simple_loss=0.2702, pruned_loss=0.06636, over 23821.00 frames. ], tot_loss[loss=0.1817, simple_loss=0.2555, pruned_loss=0.054, over 4305982.63 frames. ], batch size: 195, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:06:34,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:06:34,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=605373.3333333334, ans=0.0 2023-09-30 05:06:36,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:36,380 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 05:06:38,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 05:06:38,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:06:44,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:06:47,412 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.876e+02 2.083e+02 2.292e+02 4.355e+02, threshold=4.166e+02, percent-clipped=1.0 2023-09-30 05:06:48,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:06:50,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:06:53,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:06:53,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:06:55,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:03,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:07:03,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:07:03,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:03,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 05:07:04,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:07:08,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:07:08,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:07:09,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:07:09,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:07:11,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 05:07:14,965 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 05:07:16,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=605506.6666666666, ans=0.1 2023-09-30 05:07:18,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:20,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:21,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:07:24,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 05:07:25,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=605573.3333333334, ans=0.125 2023-09-30 05:07:27,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:07:28,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:32,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:07:33,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=605573.3333333334, ans=0.2 2023-09-30 05:07:35,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:07:42,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:44,249 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:07:46,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 05:07:46,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:46,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:07:50,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 05:07:52,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:07:55,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:07:58,196 INFO [train.py:1039] (3/4) Epoch 18, batch 550, loss[loss=0.1688, simple_loss=0.2446, pruned_loss=0.04651, over 24427.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2563, pruned_loss=0.05395, over 4409195.20 frames. ], batch size: 58, lr: 5.80e-03, grad_scale: 16.0 2023-09-30 05:08:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 05:08:02,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 05:08:02,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:02,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 05:08:02,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:08:02,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:04,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:04,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:08:06,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:08:09,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:08:09,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 05:08:09,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:08:13,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:13,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:17,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:18,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:21,247 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.70 vs. limit=15.0 2023-09-30 05:08:22,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=605773.3333333334, ans=0.1 2023-09-30 05:08:27,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 05:08:28,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 05:08:30,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:08:36,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:08:37,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:38,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:08:41,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:41,568 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 05:08:41,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:08:43,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:08:46,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:08:46,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:08:46,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:08:49,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:08:50,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 05:08:52,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 05:08:52,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:08:52,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:08:54,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:08:54,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:08:57,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:08:59,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:09:02,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:09:04,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:05,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=605973.3333333334, ans=0.2 2023-09-30 05:09:06,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 05:09:06,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:09:08,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:08,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:09:10,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:11,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:09:13,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:09:13,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=605973.3333333334, ans=0.0 2023-09-30 05:09:18,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 05:09:19,456 INFO [train.py:1039] (3/4) Epoch 18, batch 600, loss[loss=0.2518, simple_loss=0.3042, pruned_loss=0.09973, over 19595.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2572, pruned_loss=0.05418, over 4484095.05 frames. ], batch size: 388, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:09:22,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 05:09:24,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:09:24,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:09:24,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:09:30,825 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.862e+02 2.137e+02 2.512e+02 3.782e+02, threshold=4.274e+02, percent-clipped=0.0 2023-09-30 05:09:31,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:09:34,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:09:34,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 05:09:36,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:09:39,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:09:41,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:42,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=606106.6666666666, ans=0.125 2023-09-30 05:09:43,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 05:09:43,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:09:47,375 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.22 vs. limit=10.0 2023-09-30 05:09:49,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 05:09:54,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:09:54,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:09:54,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:09:58,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=606173.3333333334, ans=0.125 2023-09-30 05:10:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:10:02,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:10:02,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:07,345 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.65 vs. limit=10.0 2023-09-30 05:10:10,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:10:12,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:10:12,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:10:12,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:10:21,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 05:10:25,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:10:25,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:10:30,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 05:10:32,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:10:35,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 05:10:35,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:10:35,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:10:42,014 INFO [train.py:1039] (3/4) Epoch 18, batch 650, loss[loss=0.1846, simple_loss=0.2696, pruned_loss=0.04982, over 24413.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2567, pruned_loss=0.05426, over 4529705.17 frames. ], batch size: 77, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:10:43,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:10:45,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:10:47,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:10:48,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:10:51,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:10:52,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=606373.3333333334, ans=0.2 2023-09-30 05:10:54,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 05:10:55,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:11:00,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:11:00,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:00,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=606440.0, ans=0.5 2023-09-30 05:11:02,680 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.64 vs. limit=22.5 2023-09-30 05:11:03,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:10,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 05:11:12,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:12,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:15,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:15,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:11:17,681 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.47 vs. limit=15.0 2023-09-30 05:11:18,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:18,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:18,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:11:20,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:22,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:11:24,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:11:24,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=606506.6666666666, ans=0.125 2023-09-30 05:11:25,644 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 05:11:25,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:25,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:28,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:28,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:29,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:30,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:11:30,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 05:11:32,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:11:32,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:11:33,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:11:33,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:11:34,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=606573.3333333334, ans=0.2 2023-09-30 05:11:36,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:11:38,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 05:11:40,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 05:11:40,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:40,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:11:40,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:11:41,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:11:43,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:11:49,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:11:50,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:11:50,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:11:54,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:11:54,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:11:54,317 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:12:04,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:12:04,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:04,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:04,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:04,903 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-09-30 05:12:05,553 INFO [train.py:1039] (3/4) Epoch 18, batch 700, loss[loss=0.184, simple_loss=0.2609, pruned_loss=0.05358, over 24383.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2557, pruned_loss=0.05345, over 4579062.56 frames. ], batch size: 77, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:12:07,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=606706.6666666666, ans=0.1 2023-09-30 05:12:10,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 05:12:11,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 05:12:15,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 05:12:16,483 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.793e+02 1.992e+02 2.237e+02 3.434e+02, threshold=3.985e+02, percent-clipped=0.0 2023-09-30 05:12:16,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:16,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:12:20,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 05:12:25,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:12:26,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:12:27,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.scale_min, batch_count=606773.3333333334, ans=0.2 2023-09-30 05:12:30,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:30,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:12:32,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:12:34,008 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=606773.3333333334, ans=0.0 2023-09-30 05:12:35,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:12:37,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:12:37,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:12:38,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 05:12:41,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 05:12:42,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.56 vs. limit=6.0 2023-09-30 05:12:46,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:12:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:12:48,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:12:53,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:12:53,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 05:12:55,834 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.96 vs. limit=6.0 2023-09-30 05:12:59,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:12:59,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:12:59,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=606906.6666666666, ans=0.1 2023-09-30 05:13:01,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 05:13:02,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:13:04,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:08,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:14,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:13:14,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 05:13:15,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 05:13:15,978 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 05:13:18,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=606973.3333333334, ans=15.0 2023-09-30 05:13:21,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:22,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:23,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:13:26,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:26,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 05:13:27,520 INFO [train.py:1039] (3/4) Epoch 18, batch 750, loss[loss=0.1739, simple_loss=0.2478, pruned_loss=0.05007, over 23814.00 frames. ], tot_loss[loss=0.1808, simple_loss=0.2552, pruned_loss=0.0532, over 4611094.35 frames. ], batch size: 212, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:13:30,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 05:13:30,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 05:13:32,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 05:13:32,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 05:13:33,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 05:13:34,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:13:35,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 05:13:36,295 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.58 vs. limit=15.0 2023-09-30 05:13:36,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:13:36,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:13:39,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:42,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:42,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:13:42,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:13:46,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:13:48,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:13:49,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:13:52,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:13:52,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:13:54,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 05:13:56,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:13:56,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:13:57,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:14:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:14:01,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 05:14:02,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:04,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 05:14:04,722 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 05:14:04,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 05:14:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:14:06,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:14:07,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:14:11,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=607173.3333333334, ans=0.02 2023-09-30 05:14:13,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:14:13,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:15,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:14:18,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:14:19,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn1.whiten.whitening_limit, batch_count=607240.0, ans=22.5 2023-09-30 05:14:19,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:19,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 05:14:19,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:14:21,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 05:14:22,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:14:24,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=607240.0, ans=0.0 2023-09-30 05:14:25,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:14:27,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 05:14:27,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:30,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=607240.0, ans=0.2 2023-09-30 05:14:30,645 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.64 vs. limit=15.0 2023-09-30 05:14:34,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:14:34,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:14:35,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:14:38,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:14:42,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 05:14:42,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:42,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:14:46,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:49,879 INFO [train.py:1039] (3/4) Epoch 18, batch 800, loss[loss=0.1686, simple_loss=0.2388, pruned_loss=0.04921, over 23684.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2555, pruned_loss=0.05314, over 4638490.49 frames. ], batch size: 149, lr: 5.79e-03, grad_scale: 32.0 2023-09-30 05:14:50,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:50,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:14:57,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:14:57,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:14:59,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:14:59,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:14:59,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:00,863 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.976e+02 2.273e+02 2.818e+02 4.949e+02, threshold=4.546e+02, percent-clipped=4.0 2023-09-30 05:15:01,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:01,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=607373.3333333334, ans=0.125 2023-09-30 05:15:06,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:06,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:15:08,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=607440.0, ans=0.0 2023-09-30 05:15:09,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.23 vs. limit=15.0 2023-09-30 05:15:10,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 05:15:11,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:13,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:15:13,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:15:13,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:14,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 05:15:14,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:14,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 05:15:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:22,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:15:24,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:15:24,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:15:27,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:28,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:31,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:15:31,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:15:33,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 05:15:34,984 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 05:15:35,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 05:15:35,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:15:35,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:15:37,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:15:38,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:15:39,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.31 vs. limit=10.0 2023-09-30 05:15:42,432 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.57 vs. limit=22.5 2023-09-30 05:15:43,201 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 05:15:43,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 05:15:46,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:15:48,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:15:52,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:15:55,219 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=607640.0, ans=0.125 2023-09-30 05:15:56,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:15:58,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 05:15:58,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:16:00,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=607640.0, ans=0.09899494936611666 2023-09-30 05:16:03,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 05:16:10,360 INFO [train.py:1039] (3/4) Epoch 18, batch 850, loss[loss=0.1523, simple_loss=0.2292, pruned_loss=0.03775, over 24260.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2559, pruned_loss=0.05315, over 4662968.39 frames. ], batch size: 56, lr: 5.79e-03, grad_scale: 16.0 2023-09-30 05:16:10,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:16:13,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 05:16:14,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:16:15,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:17,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 05:16:18,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:18,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:16:20,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:21,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:16:22,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:16:24,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 05:16:24,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 05:16:24,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 05:16:27,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:16:27,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:16:29,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:29,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:16:30,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:16:36,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:36,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:16:36,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 05:16:40,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 05:16:43,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:16:45,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 05:16:50,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 05:16:50,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 05:16:54,247 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 05:16:54,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:16:54,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:16:54,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:16:54,686 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=607840.0, ans=0.125 2023-09-30 05:16:55,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:57,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:16:58,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 05:17:00,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:17:01,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:04,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:17:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:17:05,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:17:07,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 05:17:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 05:17:13,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:17:13,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:15,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:17:15,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:15,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:17:15,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=607973.3333333334, ans=0.0 2023-09-30 05:17:18,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:17:21,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:17:23,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:17:23,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:23,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:17:33,412 INFO [train.py:1039] (3/4) Epoch 18, batch 900, loss[loss=0.2041, simple_loss=0.2653, pruned_loss=0.07143, over 23811.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2574, pruned_loss=0.05408, over 4672981.61 frames. ], batch size: 212, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:17:33,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:17:34,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:17:36,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 05:17:36,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:36,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:17:38,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 05:17:38,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=608040.0, ans=0.1 2023-09-30 05:17:46,202 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.416e+02 1.851e+02 2.083e+02 2.531e+02 4.017e+02, threshold=4.166e+02, percent-clipped=0.0 2023-09-30 05:17:46,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:17:47,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:17:49,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 05:17:50,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:17:52,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:17:52,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 05:17:53,089 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 05:17:54,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:17:54,701 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:17:56,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:17:56,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:17:58,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=608106.6666666666, ans=0.0 2023-09-30 05:18:01,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=608106.6666666666, ans=0.125 2023-09-30 05:18:07,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:07,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:18:08,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:18:11,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:13,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=608173.3333333334, ans=0.0 2023-09-30 05:18:16,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 05:18:16,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:18:20,746 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.84 vs. limit=8.0 2023-09-30 05:18:21,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:18:22,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:18:24,609 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 05:18:26,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 05:18:30,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:18:30,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:18:33,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:18:33,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.59 vs. limit=6.0 2023-09-30 05:18:39,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:39,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:18:41,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 05:18:41,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:18:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 05:18:46,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:18:46,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:18:48,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:18:49,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:18:54,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 05:18:54,389 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 05:18:56,278 INFO [train.py:1039] (3/4) Epoch 18, batch 950, loss[loss=0.1882, simple_loss=0.263, pruned_loss=0.05667, over 24558.00 frames. ], tot_loss[loss=0.183, simple_loss=0.2575, pruned_loss=0.05425, over 4670251.94 frames. ], batch size: 60, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:18:58,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:18:58,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 05:18:59,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:02,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 05:19:08,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:10,037 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:19:11,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:11,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:11,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=608440.0, ans=0.0 2023-09-30 05:19:12,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:19:15,055 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 05:19:15,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.65 vs. limit=6.0 2023-09-30 05:19:20,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:19:20,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:21,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:21,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:19:21,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 05:19:23,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:19:25,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:26,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 05:19:26,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:26,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=608440.0, ans=0.125 2023-09-30 05:19:33,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:33,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:19:33,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:19:34,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 05:19:36,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 05:19:38,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:19:38,394 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=608506.6666666666, ans=0.125 2023-09-30 05:19:39,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:19:42,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=608506.6666666666, ans=0.1 2023-09-30 05:19:42,599 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.85 vs. limit=15.0 2023-09-30 05:19:44,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:19:44,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:19:48,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 05:19:51,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:19:51,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:19:51,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:19:51,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:19:51,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:19:51,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=608573.3333333334, ans=0.2 2023-09-30 05:19:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 05:19:58,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:20:01,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:01,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:01,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 05:20:01,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:01,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:20:03,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 05:20:05,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=608640.0, ans=0.1 2023-09-30 05:20:08,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:20:08,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:20:15,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 05:20:15,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 05:20:19,773 INFO [train.py:1039] (3/4) Epoch 18, batch 1000, loss[loss=0.1641, simple_loss=0.2451, pruned_loss=0.04154, over 24490.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2574, pruned_loss=0.05406, over 4681187.67 frames. ], batch size: 63, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:20:19,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:20:25,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 05:20:25,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:20:30,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:20:33,033 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.610e+02 2.036e+02 2.242e+02 3.072e+02 5.531e+02, threshold=4.484e+02, percent-clipped=11.0 2023-09-30 05:20:33,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 05:20:33,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 05:20:36,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:36,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:20:38,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:20:43,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 05:20:45,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=608773.3333333334, ans=0.0 2023-09-30 05:20:46,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 05:20:47,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 05:20:48,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:20:50,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 05:20:51,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 05:20:51,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 05:20:53,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:20:53,406 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=608840.0, ans=0.125 2023-09-30 05:20:55,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:03,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:21:05,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:06,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:06,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 05:21:06,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:06,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:21:08,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:21:08,416 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 05:21:10,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=608906.6666666666, ans=0.125 2023-09-30 05:21:13,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 05:21:13,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 05:21:16,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 05:21:17,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:21:20,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=608906.6666666666, ans=0.0 2023-09-30 05:21:25,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:26,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:21:26,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:28,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:21:29,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 05:21:31,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:21:33,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 05:21:33,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 05:21:36,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:21:36,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:21:38,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:21:38,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=608973.3333333334, ans=0.125 2023-09-30 05:21:41,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:21:42,868 INFO [train.py:1039] (3/4) Epoch 18, batch 1050, loss[loss=0.172, simple_loss=0.2444, pruned_loss=0.04981, over 23796.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2557, pruned_loss=0.05355, over 4691355.57 frames. ], batch size: 179, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:21:43,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:21:43,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=609040.0, ans=0.125 2023-09-30 05:21:47,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:21:47,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:21:48,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=609040.0, ans=0.125 2023-09-30 05:21:49,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:21:51,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:21:53,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:21:56,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:21:58,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:21:58,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=609106.6666666666, ans=0.125 2023-09-30 05:22:01,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:22:01,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:22:01,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:22:02,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:22:02,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 05:22:04,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:05,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=609106.6666666666, ans=0.125 2023-09-30 05:22:06,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 05:22:09,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:22:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 05:22:09,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:22:10,591 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.76 vs. limit=12.0 2023-09-30 05:22:17,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:22:17,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:22:17,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:22:21,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 05:22:21,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 05:22:21,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:22:24,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=609173.3333333334, ans=0.0 2023-09-30 05:22:26,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 05:22:29,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 05:22:31,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:22:32,555 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.55 vs. limit=15.0 2023-09-30 05:22:33,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=609240.0, ans=0.0 2023-09-30 05:22:33,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=609240.0, ans=0.125 2023-09-30 05:22:34,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:22:36,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:22:36,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:22:37,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:22:40,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:22:45,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 05:22:47,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 05:22:47,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 05:22:47,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:48,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:22:50,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 05:22:53,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:22:54,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:22:55,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:22:56,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:22:56,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:01,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 05:23:02,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:23:02,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 05:23:04,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 05:23:05,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:23:06,948 INFO [train.py:1039] (3/4) Epoch 18, batch 1100, loss[loss=0.1836, simple_loss=0.2618, pruned_loss=0.05268, over 23262.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2558, pruned_loss=0.05302, over 4708522.74 frames. ], batch size: 105, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:23:08,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:14,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:23:20,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:23:22,011 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.922e+02 2.115e+02 2.641e+02 4.048e+02, threshold=4.230e+02, percent-clipped=0.0 2023-09-30 05:23:22,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:23:22,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 05:23:25,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:23:26,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:23:28,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:23:31,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:23:31,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 05:23:33,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:23:33,809 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=609440.0, ans=0.0 2023-09-30 05:23:35,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:23:35,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:23:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:23:40,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:23:44,814 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.62 vs. limit=6.0 2023-09-30 05:23:45,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:23:47,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 05:23:49,228 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 05:23:49,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:52,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:23:54,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:23:54,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 05:23:56,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:23:56,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:23:56,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:23:57,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:23:57,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 05:24:05,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:24:05,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 05:24:06,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:24:12,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:24:15,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 05:24:15,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 05:24:17,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:17,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=609640.0, ans=0.0 2023-09-30 05:24:20,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:22,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:22,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 05:24:23,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:24:23,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:24:25,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 05:24:25,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:24:25,775 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:24:26,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 05:24:27,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:24:27,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:24:28,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:24:30,509 INFO [train.py:1039] (3/4) Epoch 18, batch 1150, loss[loss=0.1824, simple_loss=0.2613, pruned_loss=0.05177, over 23197.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2567, pruned_loss=0.05376, over 4709009.85 frames. ], batch size: 105, lr: 5.78e-03, grad_scale: 8.0 2023-09-30 05:24:33,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:35,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:24:38,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:24:38,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:24:38,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 05:24:38,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:24:42,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 05:24:44,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:44,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:24:49,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 05:24:52,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:24:56,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:24:56,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:24:56,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 05:24:57,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:24:57,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:24:58,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=609773.3333333334, ans=0.125 2023-09-30 05:25:02,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 05:25:02,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:04,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:25:09,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=609840.0, ans=0.0 2023-09-30 05:25:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:25:23,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 05:25:23,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:24,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:30,995 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 05:25:33,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:25:40,786 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 05:25:45,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:25:46,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:25:46,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:25:48,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:25:49,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:25:53,407 INFO [train.py:1039] (3/4) Epoch 18, batch 1200, loss[loss=0.1919, simple_loss=0.2778, pruned_loss=0.05296, over 24666.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.257, pruned_loss=0.05358, over 4720721.07 frames. ], batch size: 73, lr: 5.78e-03, grad_scale: 16.0 2023-09-30 05:25:57,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:25:57,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:25:58,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:25:58,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:25:59,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=610040.0, ans=0.035 2023-09-30 05:26:00,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:26:00,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:26:03,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:26:06,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:06,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:08,276 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.757e+02 1.911e+02 2.190e+02 3.350e+02, threshold=3.822e+02, percent-clipped=0.0 2023-09-30 05:26:10,022 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 05:26:11,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 05:26:18,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:26:19,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:26:19,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=610106.6666666666, ans=0.125 2023-09-30 05:26:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:24,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:26:24,437 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 05:26:26,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:34,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:26:34,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:26:34,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 05:26:36,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:26:39,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 05:26:44,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 05:26:44,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:26:46,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:26:47,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:26:47,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:26:48,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:26:49,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:26:51,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:26:51,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 05:26:51,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:26:52,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:26:52,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:26:53,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.97 vs. limit=15.0 2023-09-30 05:26:54,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:26:55,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:26:56,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=610240.0, ans=0.0 2023-09-30 05:27:00,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:27:02,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:27:06,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 05:27:11,331 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 05:27:14,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:15,615 INFO [train.py:1039] (3/4) Epoch 18, batch 1250, loss[loss=0.1838, simple_loss=0.2631, pruned_loss=0.05229, over 23788.00 frames. ], tot_loss[loss=0.1818, simple_loss=0.2571, pruned_loss=0.05319, over 4732036.39 frames. ], batch size: 85, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:27:17,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:27:17,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=610373.3333333334, ans=0.0 2023-09-30 05:27:18,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:27:21,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:27:24,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 05:27:25,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=610373.3333333334, ans=0.125 2023-09-30 05:27:27,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:27:29,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:29,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 05:27:31,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:27:31,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=610440.0, ans=0.125 2023-09-30 05:27:34,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:27:37,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 05:27:37,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:27:40,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:27:40,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:42,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:27:46,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=610440.0, ans=0.0 2023-09-30 05:27:47,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:27:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:27:47,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:27:49,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:27:49,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:27:50,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:27:54,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:27:59,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 05:28:00,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:28:02,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:03,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 05:28:04,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=610573.3333333334, ans=0.125 2023-09-30 05:28:05,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:28:05,563 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 05:28:06,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:07,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:10,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:10,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=610573.3333333334, ans=0.1 2023-09-30 05:28:13,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:28:13,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:28:15,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 05:28:15,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 05:28:15,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 05:28:18,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:21,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 05:28:21,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:23,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:28:25,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:28:26,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 05:28:27,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 05:28:28,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:28:28,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:28:28,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:28:30,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 05:28:30,964 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.63 vs. limit=22.5 2023-09-30 05:28:34,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:35,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:28:35,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:28:38,668 INFO [train.py:1039] (3/4) Epoch 18, batch 1300, loss[loss=0.1705, simple_loss=0.2395, pruned_loss=0.05077, over 23606.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2565, pruned_loss=0.05293, over 4737016.06 frames. ], batch size: 135, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:28:38,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:28:41,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:28:41,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 05:28:45,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:28:45,645 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.91 vs. limit=10.0 2023-09-30 05:28:48,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 05:28:48,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:28:52,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:28:52,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:28:53,670 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.896e+02 2.139e+02 2.447e+02 3.795e+02, threshold=4.278e+02, percent-clipped=0.0 2023-09-30 05:28:53,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 05:28:59,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:28:59,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:29:00,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 05:29:05,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:29:10,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:10,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:10,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=610840.0, ans=0.2 2023-09-30 05:29:12,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:29:13,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:29:15,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:29:15,821 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=21.34 vs. limit=22.5 2023-09-30 05:29:16,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 05:29:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 05:29:23,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.30 vs. limit=6.0 2023-09-30 05:29:24,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:29:24,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:29:25,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=610840.0, ans=0.0 2023-09-30 05:29:26,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 05:29:26,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 05:29:27,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:29:30,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:29:31,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 05:29:33,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:33,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 05:29:35,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:29:40,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:29:40,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:29:42,917 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=14.46 vs. limit=22.5 2023-09-30 05:29:43,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 05:29:45,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 05:29:45,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 05:29:47,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=610973.3333333334, ans=0.035 2023-09-30 05:29:49,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:29:53,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 05:29:54,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:01,239 INFO [train.py:1039] (3/4) Epoch 18, batch 1350, loss[loss=0.1631, simple_loss=0.2468, pruned_loss=0.03968, over 24459.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2558, pruned_loss=0.05249, over 4743346.59 frames. ], batch size: 63, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:30:01,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 05:30:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:08,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:14,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:30:14,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:30:16,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:30:18,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:21,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:30:23,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 05:30:24,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:30:24,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:30:27,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 05:30:29,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:30:31,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:30:31,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 05:30:32,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 05:30:34,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 05:30:35,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:35,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 05:30:49,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:56,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=611240.0, ans=0.125 2023-09-30 05:30:59,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:30:59,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:30:59,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 05:31:02,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:03,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 05:31:03,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:31:03,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:31:04,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=611240.0, ans=0.125 2023-09-30 05:31:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:31:09,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 05:31:09,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=611306.6666666666, ans=0.2 2023-09-30 05:31:11,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:31:15,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.65 vs. limit=15.0 2023-09-30 05:31:17,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 05:31:21,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 05:31:24,495 INFO [train.py:1039] (3/4) Epoch 18, batch 1400, loss[loss=0.1636, simple_loss=0.2469, pruned_loss=0.04011, over 24662.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2543, pruned_loss=0.05219, over 4724094.88 frames. ], batch size: 65, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:31:25,671 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.39 vs. limit=5.0 2023-09-30 05:31:26,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 05:31:26,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:31:29,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:31:31,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:31:35,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 05:31:37,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 05:31:38,680 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.909e+02 2.025e+02 2.334e+02 3.516e+02, threshold=4.051e+02, percent-clipped=0.0 2023-09-30 05:31:48,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:31:48,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=611440.0, ans=0.2 2023-09-30 05:31:51,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:31:54,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:31:54,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:31:59,186 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.04 vs. limit=15.0 2023-09-30 05:31:59,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:32:00,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 05:32:04,039 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.98 vs. limit=15.0 2023-09-30 05:32:08,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:09,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:12,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 05:32:14,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:32:14,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:32:16,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:32:16,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:20,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:32:20,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:32:21,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:32:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 05:32:21,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:32:27,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:31,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:32:39,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 05:32:40,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 05:32:40,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:32:43,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 05:32:44,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:45,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:32:46,987 INFO [train.py:1039] (3/4) Epoch 18, batch 1450, loss[loss=0.1865, simple_loss=0.2573, pruned_loss=0.05786, over 23277.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2544, pruned_loss=0.05216, over 4724519.33 frames. ], batch size: 105, lr: 5.77e-03, grad_scale: 16.0 2023-09-30 05:32:48,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:32:48,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=611706.6666666666, ans=0.1 2023-09-30 05:32:52,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:32:52,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:32:52,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 05:32:53,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=611706.6666666666, ans=0.2 2023-09-30 05:32:53,544 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.84 vs. limit=15.0 2023-09-30 05:32:58,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:32:58,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:32:59,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:32:59,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 05:33:01,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:33:03,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 05:33:04,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:05,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:05,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 05:33:06,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:07,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:33:08,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 05:33:08,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:08,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=611773.3333333334, ans=0.1 2023-09-30 05:33:10,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:33:12,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:14,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:18,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:33:18,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:33:21,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:33:21,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:33:24,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:33:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:33:24,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:30,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 05:33:31,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:33:33,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=611840.0, ans=0.09899494936611666 2023-09-30 05:33:36,460 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 05:33:38,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:40,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:33:40,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:41,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 05:33:46,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:33:47,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 05:33:49,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 05:33:50,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:33:53,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:33:53,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:33:54,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=611973.3333333334, ans=0.125 2023-09-30 05:33:54,629 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.94 vs. limit=10.0 2023-09-30 05:33:54,640 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=4.17 vs. limit=12.0 2023-09-30 05:33:55,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 05:33:56,027 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.75 vs. limit=15.0 2023-09-30 05:33:59,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 05:33:59,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 05:34:02,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:05,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:34:09,829 INFO [train.py:1039] (3/4) Epoch 18, batch 1500, loss[loss=0.1781, simple_loss=0.2638, pruned_loss=0.04617, over 24322.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2547, pruned_loss=0.05223, over 4722441.34 frames. ], batch size: 74, lr: 5.77e-03, grad_scale: 8.0 2023-09-30 05:34:10,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=612040.0, ans=0.1 2023-09-30 05:34:15,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 05:34:15,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:34:15,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:34:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:16,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:18,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:34:18,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 05:34:20,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:34:21,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:34:21,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:34:22,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:34:24,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:34:24,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:25,934 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.869e+02 2.047e+02 2.380e+02 3.622e+02, threshold=4.094e+02, percent-clipped=0.0 2023-09-30 05:34:32,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:34:32,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 05:34:34,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:34:34,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:34:34,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.12 vs. limit=15.0 2023-09-30 05:34:35,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:41,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 05:34:44,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=612173.3333333334, ans=0.2 2023-09-30 05:34:46,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 05:34:46,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:34:46,850 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.94 vs. limit=6.0 2023-09-30 05:34:47,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 05:34:49,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:34:51,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:34:53,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:34:53,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:34:54,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 05:34:54,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:34:55,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:55,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 05:34:56,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:34:59,493 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.05 vs. limit=15.0 2023-09-30 05:35:03,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:35:03,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 05:35:03,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=612240.0, ans=0.125 2023-09-30 05:35:08,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 05:35:11,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:35:16,369 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 05:35:16,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:16,454 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 05:35:18,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:18,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:35:19,698 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 05:35:21,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:35:24,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 05:35:26,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=612306.6666666666, ans=22.5 2023-09-30 05:35:30,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:30,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:30,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:35:31,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:35:31,239 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=612373.3333333334, ans=0.125 2023-09-30 05:35:32,330 INFO [train.py:1039] (3/4) Epoch 18, batch 1550, loss[loss=0.1794, simple_loss=0.2512, pruned_loss=0.05385, over 23921.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2555, pruned_loss=0.05276, over 4719788.85 frames. ], batch size: 195, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:35:32,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:35:32,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 05:35:34,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 05:35:34,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:35:35,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 05:35:35,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 05:35:38,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:40,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:40,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=612373.3333333334, ans=0.0 2023-09-30 05:35:41,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:35:41,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:35:41,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=612373.3333333334, ans=0.125 2023-09-30 05:35:43,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:45,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:35:47,519 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 05:35:47,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:47,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:35:48,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=612440.0, ans=15.0 2023-09-30 05:35:49,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:35:52,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:35:52,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 05:35:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:35:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 05:35:53,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 05:35:53,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 05:35:55,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:35:57,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:00,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:36:02,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=612440.0, ans=0.95 2023-09-30 05:36:03,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 05:36:04,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 05:36:12,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:15,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:36:15,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:36:15,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:36:16,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.08 vs. limit=15.0 2023-09-30 05:36:17,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 05:36:22,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 05:36:24,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:27,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:36:30,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:36:30,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:36:30,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 05:36:30,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:33,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:36:33,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:35,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:36:35,137 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 05:36:36,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:36:44,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 05:36:46,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=612640.0, ans=0.0 2023-09-30 05:36:47,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:49,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:36:51,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 05:36:52,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:36:54,069 INFO [train.py:1039] (3/4) Epoch 18, batch 1600, loss[loss=0.1774, simple_loss=0.2621, pruned_loss=0.04629, over 24569.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2561, pruned_loss=0.05285, over 4733721.27 frames. ], batch size: 71, lr: 5.76e-03, grad_scale: 16.0 2023-09-30 05:36:54,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:36:56,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:36:56,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:36:57,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:37:00,597 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.87 vs. limit=15.0 2023-09-30 05:37:01,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:01,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 05:37:03,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 05:37:05,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 05:37:08,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:09,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 05:37:11,284 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.776e+02 1.986e+02 2.218e+02 2.701e+02, threshold=3.972e+02, percent-clipped=0.0 2023-09-30 05:37:11,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:37:13,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:37:16,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:37:20,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 05:37:22,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:37:22,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 05:37:22,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:22,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=612773.3333333334, ans=0.125 2023-09-30 05:37:24,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 05:37:25,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=612773.3333333334, ans=0.1 2023-09-30 05:37:26,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=612840.0, ans=0.125 2023-09-30 05:37:28,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=612840.0, ans=0.04949747468305833 2023-09-30 05:37:29,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 05:37:36,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:36,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 05:37:38,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:37:38,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:37:38,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:37:41,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 05:37:46,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 05:37:46,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=612906.6666666666, ans=0.2 2023-09-30 05:37:49,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:37:50,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:50,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:37:52,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:37:53,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:37:55,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:37:55,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=612906.6666666666, ans=0.125 2023-09-30 05:37:57,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:38:06,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:06,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:38:10,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 05:38:10,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:38:12,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 05:38:16,814 INFO [train.py:1039] (3/4) Epoch 18, batch 1650, loss[loss=0.1848, simple_loss=0.2539, pruned_loss=0.05788, over 23651.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2557, pruned_loss=0.05275, over 4734732.53 frames. ], batch size: 149, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:38:17,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:19,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:38:21,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:38:21,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 05:38:21,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 05:38:21,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 05:38:21,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 05:38:26,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:38:26,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:27,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:38:27,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:38:29,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:38:29,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=613040.0, ans=0.125 2023-09-30 05:38:32,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 05:38:33,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=613106.6666666666, ans=22.5 2023-09-30 05:38:35,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:38:35,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:38:35,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:38:35,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:38:37,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 05:38:38,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 05:38:39,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=613106.6666666666, ans=0.1 2023-09-30 05:38:44,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:38:46,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:38:48,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=613173.3333333334, ans=0.125 2023-09-30 05:38:54,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 05:38:55,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.18 vs. limit=15.0 2023-09-30 05:38:56,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:38:57,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 05:39:02,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:03,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:39:05,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:39:06,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:06,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:39:06,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:07,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:09,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:09,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:11,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:13,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:39:15,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=613240.0, ans=0.0 2023-09-30 05:39:17,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:39:17,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=613240.0, ans=0.1 2023-09-30 05:39:18,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 05:39:19,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:39:21,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 05:39:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 05:39:21,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 05:39:21,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:39:23,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:39:23,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:24,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:39:24,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 05:39:28,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:39:28,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=613306.6666666666, ans=0.125 2023-09-30 05:39:33,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:39:33,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:37,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 05:39:41,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:39:41,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:39:42,811 INFO [train.py:1039] (3/4) Epoch 18, batch 1700, loss[loss=0.1523, simple_loss=0.2253, pruned_loss=0.03967, over 24323.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2548, pruned_loss=0.05282, over 4736126.63 frames. ], batch size: 56, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:39:42,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 05:39:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:39:42,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:39:42,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:39:44,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:39:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:39:46,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 05:39:50,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten.whitening_limit, batch_count=613373.3333333334, ans=22.5 2023-09-30 05:39:50,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=14.52 vs. limit=22.5 2023-09-30 05:39:50,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:40:00,300 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.464e+02 1.875e+02 2.086e+02 2.365e+02 3.763e+02, threshold=4.171e+02, percent-clipped=0.0 2023-09-30 05:40:00,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:02,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:40:09,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:40:10,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:10,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:40:10,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=613440.0, ans=0.0 2023-09-30 05:40:11,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:14,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 05:40:15,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:40:16,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:16,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:40:18,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:40:20,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 05:40:20,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 05:40:23,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:25,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 05:40:26,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:40:37,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:38,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:40,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:40:41,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:40:41,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 05:40:41,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:40:43,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:43,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 05:40:43,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=613573.3333333334, ans=0.125 2023-09-30 05:40:44,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:40:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:44,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:40:44,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:40:45,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=613573.3333333334, ans=0.125 2023-09-30 05:40:46,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:40:46,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:40:48,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:40:49,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:40:49,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:54,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:40:56,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 05:40:58,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:40:58,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:01,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 05:41:05,583 INFO [train.py:1039] (3/4) Epoch 18, batch 1750, loss[loss=0.1794, simple_loss=0.246, pruned_loss=0.05638, over 23658.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2547, pruned_loss=0.05261, over 4737279.80 frames. ], batch size: 149, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:41:08,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:11,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:11,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:41:11,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 05:41:13,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:41:16,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:41:16,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:21,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 05:41:21,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=613773.3333333334, ans=0.0 2023-09-30 05:41:23,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:41:24,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 05:41:24,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:26,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:41:28,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:41:31,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 05:41:32,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:41:34,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 05:41:41,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:41:45,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:41:45,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:49,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:49,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:41:52,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:41:54,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:41:56,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:41:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:41:59,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 05:41:59,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:42:01,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=613906.6666666666, ans=0.0 2023-09-30 05:42:04,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 05:42:04,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:05,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:07,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:42:11,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:42:11,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 05:42:13,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:13,769 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=613973.3333333334, ans=0.125 2023-09-30 05:42:15,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:42:18,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:42:20,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=613973.3333333334, ans=0.0 2023-09-30 05:42:21,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:23,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:42:23,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 05:42:23,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:23,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=613973.3333333334, ans=0.125 2023-09-30 05:42:24,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:42:24,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:24,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:42:24,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:42:27,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:42:28,509 INFO [train.py:1039] (3/4) Epoch 18, batch 1800, loss[loss=0.1695, simple_loss=0.2433, pruned_loss=0.04779, over 23370.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2538, pruned_loss=0.0527, over 4722810.61 frames. ], batch size: 105, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:42:30,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:42:31,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:42:33,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:42:36,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:42:39,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:42:41,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:42:43,134 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.27 vs. limit=15.0 2023-09-30 05:42:45,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:42:45,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=614106.6666666666, ans=0.125 2023-09-30 05:42:46,628 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.378e+02 1.840e+02 1.957e+02 2.222e+02 3.420e+02, threshold=3.915e+02, percent-clipped=0.0 2023-09-30 05:42:48,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:48,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:42:50,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:42:52,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:42:52,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 05:42:54,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:42:58,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:00,531 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.29 vs. limit=15.0 2023-09-30 05:43:01,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 05:43:03,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 05:43:03,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 05:43:05,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:06,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:43:06,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:08,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:43:10,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=614173.3333333334, ans=0.125 2023-09-30 05:43:12,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 05:43:14,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:43:16,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:18,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 05:43:20,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 05:43:21,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:43:22,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:43:23,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:43:23,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=614240.0, ans=0.125 2023-09-30 05:43:27,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=614240.0, ans=0.2 2023-09-30 05:43:30,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 05:43:31,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=614240.0, ans=0.0 2023-09-30 05:43:36,476 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=614306.6666666666, ans=0.0 2023-09-30 05:43:37,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:43:37,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 05:43:39,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:43:39,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:39,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:43:41,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 05:43:44,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:43:44,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:43:46,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 05:43:46,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:43:49,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:49,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:43:49,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:50,775 INFO [train.py:1039] (3/4) Epoch 18, batch 1850, loss[loss=0.1617, simple_loss=0.2341, pruned_loss=0.04468, over 24351.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.254, pruned_loss=0.05242, over 4728069.25 frames. ], batch size: 56, lr: 5.76e-03, grad_scale: 8.0 2023-09-30 05:43:52,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:43:52,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:43:54,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:43:55,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:43:58,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:43:58,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:02,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=614373.3333333334, ans=0.125 2023-09-30 05:44:06,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:44:07,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 05:44:09,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 05:44:14,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 05:44:17,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:17,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 05:44:17,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 05:44:23,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.81 vs. limit=22.5 2023-09-30 05:44:27,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:44:30,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 05:44:33,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:44:33,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:44:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 05:44:38,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:39,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:44:39,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:44:39,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=614573.3333333334, ans=0.125 2023-09-30 05:44:41,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:44:44,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:44:46,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:44:48,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:44:48,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:44:48,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:44:49,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.80 vs. limit=15.0 2023-09-30 05:44:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:52,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:44:55,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 05:44:55,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:44:55,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=614640.0, ans=0.125 2023-09-30 05:44:58,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:44:58,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:44:58,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 05:44:58,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 05:45:01,328 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=9.11 vs. limit=15.0 2023-09-30 05:45:02,498 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 05:45:04,265 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 05:45:05,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:45:05,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:45:06,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:06,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:06,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=614640.0, ans=0.125 2023-09-30 05:45:08,006 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 05:45:08,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:45:08,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:09,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:45:09,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:45:11,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:45:11,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 05:45:14,128 INFO [train.py:1039] (3/4) Epoch 18, batch 1900, loss[loss=0.1978, simple_loss=0.2631, pruned_loss=0.06621, over 23840.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2549, pruned_loss=0.05306, over 4706831.20 frames. ], batch size: 212, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:45:14,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:14,343 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 05:45:14,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:45:15,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:16,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=614706.6666666666, ans=0.1 2023-09-30 05:45:21,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:45:24,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:45:24,240 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 05:45:24,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 05:45:25,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:45:27,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:45:27,251 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 05:45:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 05:45:31,883 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.898e+02 2.159e+02 2.520e+02 3.634e+02, threshold=4.318e+02, percent-clipped=0.0 2023-09-30 05:45:33,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 05:45:35,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:45:39,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 05:45:40,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 05:45:43,411 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.58 vs. limit=10.0 2023-09-30 05:45:48,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 05:45:52,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 05:45:54,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:45:54,159 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 05:45:54,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 05:45:54,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 05:45:54,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 05:45:54,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:45:58,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 05:46:01,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:46:07,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:07,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 05:46:07,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:46:11,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 05:46:11,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:16,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=614906.6666666666, ans=0.2 2023-09-30 05:46:19,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:46:19,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:46:19,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:46:19,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:46:21,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 05:46:22,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:46:22,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:46:26,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:26,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:46:29,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:46:29,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:46:29,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 05:46:30,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:46:33,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:36,670 INFO [train.py:1039] (3/4) Epoch 18, batch 1950, loss[loss=0.1922, simple_loss=0.2684, pruned_loss=0.05801, over 23275.00 frames. ], tot_loss[loss=0.1825, simple_loss=0.2565, pruned_loss=0.05422, over 4704723.88 frames. ], batch size: 93, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:46:36,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:46:38,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:38,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:46:40,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 05:46:40,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:46:40,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:40,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=615040.0, ans=0.125 2023-09-30 05:46:42,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:46:45,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:46:45,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:46:45,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:45,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=615040.0, ans=0.125 2023-09-30 05:46:47,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=615040.0, ans=0.0 2023-09-30 05:46:49,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:46:50,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:46:50,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:46:50,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 05:46:51,602 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.85 vs. limit=15.0 2023-09-30 05:46:52,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:46:55,766 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-09-30 05:46:56,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:00,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:47:00,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:00,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 05:47:00,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 05:47:01,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:47:01,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:47:03,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:05,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=12.63 vs. limit=15.0 2023-09-30 05:47:08,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:09,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:47:16,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:47:16,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=615173.3333333334, ans=0.0 2023-09-30 05:47:19,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:47:20,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:47:21,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 05:47:22,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:47:26,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:47:28,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:47:29,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:39,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:41,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:47:44,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:46,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:47:46,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=615306.6666666666, ans=0.2 2023-09-30 05:47:47,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:47:47,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 05:47:47,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 05:47:49,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:47:51,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 05:47:52,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:47:53,276 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=615306.6666666666, ans=0.1 2023-09-30 05:47:57,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:47:58,908 INFO [train.py:1039] (3/4) Epoch 18, batch 2000, loss[loss=0.1722, simple_loss=0.2343, pruned_loss=0.05506, over 23671.00 frames. ], tot_loss[loss=0.1827, simple_loss=0.2571, pruned_loss=0.05416, over 4710033.22 frames. ], batch size: 232, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:47:59,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:48:00,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:02,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:48:04,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:09,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 05:48:09,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 05:48:12,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:48:16,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 05:48:16,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 05:48:17,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:48:19,241 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.532e+02 1.900e+02 2.106e+02 2.415e+02 3.499e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 05:48:20,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:48:22,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 05:48:24,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:24,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=615440.0, ans=0.125 2023-09-30 05:48:27,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:27,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:29,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 05:48:29,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 05:48:30,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 05:48:30,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:34,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:48:36,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 05:48:36,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:48:37,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:39,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:48:39,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 05:48:43,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 05:48:43,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:48:43,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:48:49,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:51,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:48:51,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:51,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:48:54,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:48:55,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:48:55,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:48:55,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:00,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:49:00,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 05:49:04,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=615640.0, ans=0.0 2023-09-30 05:49:07,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:49:08,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:12,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:49:15,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:16,130 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.82 vs. limit=15.0 2023-09-30 05:49:16,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:16,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:19,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:49:19,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:49:20,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:22,119 INFO [train.py:1039] (3/4) Epoch 18, batch 2050, loss[loss=0.1738, simple_loss=0.2278, pruned_loss=0.05992, over 22658.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.256, pruned_loss=0.05324, over 4710940.73 frames. ], batch size: 322, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:49:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:25,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:49:27,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:31,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:49:35,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:49:37,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:49:38,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:49:40,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 05:49:40,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:49:40,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:49:41,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:49:51,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:49:51,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=615773.3333333334, ans=0.125 2023-09-30 05:49:52,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:53,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 05:49:55,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:49:56,285 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.22 vs. limit=12.0 2023-09-30 05:49:58,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 05:49:58,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:50:03,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:04,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:05,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:50:06,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:50:06,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:50:08,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:50:10,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:50:14,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:16,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 05:50:18,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:50:18,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:50:23,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:28,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:50:30,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 05:50:35,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:37,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:50:40,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:50:41,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=615973.3333333334, ans=0.5 2023-09-30 05:50:43,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 05:50:44,547 INFO [train.py:1039] (3/4) Epoch 18, batch 2100, loss[loss=0.1686, simple_loss=0.2542, pruned_loss=0.0415, over 24418.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2544, pruned_loss=0.05265, over 4719541.91 frames. ], batch size: 69, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:50:46,762 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 05:50:46,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:48,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:50:48,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:50:49,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:50:49,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 05:50:51,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 05:50:53,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 05:50:56,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:50:57,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:50:58,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=616040.0, ans=0.125 2023-09-30 05:50:59,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:50:59,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:00,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 05:51:01,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 05:51:01,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 05:51:01,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 05:51:03,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:03,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:03,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 05:51:04,690 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.913e+02 2.117e+02 2.530e+02 3.526e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 05:51:04,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 05:51:10,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 05:51:10,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 05:51:13,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.62 vs. limit=15.0 2023-09-30 05:51:14,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:51:14,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:51:18,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:51:18,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=616173.3333333334, ans=0.125 2023-09-30 05:51:19,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 05:51:21,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:21,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 05:51:22,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 05:51:22,869 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:22,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 05:51:24,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 05:51:24,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 05:51:28,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:51:29,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:51:31,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:34,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 05:51:35,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:38,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.05 vs. limit=15.0 2023-09-30 05:51:39,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 05:51:39,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:39,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:51:39,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:51:41,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 05:51:42,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 05:51:42,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 05:51:47,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 05:51:50,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:51:50,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 05:51:55,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:51:57,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:51:59,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:51:59,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:51:59,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 05:51:59,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:01,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:02,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:52:02,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:52:02,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:04,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 05:52:05,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 05:52:05,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:07,385 INFO [train.py:1039] (3/4) Epoch 18, batch 2150, loss[loss=0.192, simple_loss=0.2795, pruned_loss=0.0522, over 24676.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2535, pruned_loss=0.0521, over 4715910.32 frames. ], batch size: 73, lr: 5.75e-03, grad_scale: 8.0 2023-09-30 05:52:08,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:52:08,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:52:09,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:52:09,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:52:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 05:52:17,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:19,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:21,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:52:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:21,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:52:25,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:27,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:52:27,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 05:52:30,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:30,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 05:52:36,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:37,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:52:39,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:39,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:39,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:52:40,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:52:40,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:52:40,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:52:42,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 05:52:43,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 05:52:45,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:46,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:47,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:52:47,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:52:49,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=616506.6666666666, ans=0.125 2023-09-30 05:52:49,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=616506.6666666666, ans=0.125 2023-09-30 05:52:50,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:52:50,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:52:53,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:52:53,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 05:52:53,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 05:52:57,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:52:58,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:52:58,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:53:00,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 05:53:00,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:00,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=616573.3333333334, ans=0.0 2023-09-30 05:53:01,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:01,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 05:53:03,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=616573.3333333334, ans=0.125 2023-09-30 05:53:05,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 05:53:05,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 05:53:05,861 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 05:53:05,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:05,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:53:07,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 05:53:07,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:53:07,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 05:53:07,415 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 05:53:07,416 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 05:53:07,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 05:53:07,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=616573.3333333334, ans=0.1 2023-09-30 05:53:10,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:10,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:53:10,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:53:12,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:13,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 05:53:15,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:15,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:19,794 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 05:53:24,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:53:24,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 05:53:29,815 INFO [train.py:1039] (3/4) Epoch 18, batch 2200, loss[loss=0.1845, simple_loss=0.2555, pruned_loss=0.0568, over 23734.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2538, pruned_loss=0.052, over 4726102.86 frames. ], batch size: 232, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:53:29,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:53:33,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:53:33,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:53:35,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:53:36,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 05:53:40,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:53:40,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:53:40,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 05:53:46,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 05:53:47,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 05:53:49,927 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.966e+02 2.292e+02 2.651e+02 4.144e+02, threshold=4.583e+02, percent-clipped=0.0 2023-09-30 05:53:55,363 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten.whitening_limit, batch_count=616773.3333333334, ans=15.0 2023-09-30 05:53:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 05:53:56,395 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=616773.3333333334, ans=0.125 2023-09-30 05:53:59,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:00,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:02,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 05:54:02,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=616840.0, ans=0.0 2023-09-30 05:54:03,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 05:54:05,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 05:54:09,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 05:54:11,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:11,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 05:54:15,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:54:18,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:21,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:54:22,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:24,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 05:54:25,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:28,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 05:54:30,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 05:54:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:54:30,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=616906.6666666666, ans=0.125 2023-09-30 05:54:32,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 05:54:32,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=616906.6666666666, ans=0.0 2023-09-30 05:54:33,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:54:33,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:33,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:54:35,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 05:54:35,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:54:36,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 05:54:39,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 05:54:40,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:54:42,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:54:44,429 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 05:54:44,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:54:44,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=616973.3333333334, ans=0.0 2023-09-30 05:54:46,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 05:54:46,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 05:54:47,018 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 05:54:50,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:50,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 05:54:50,939 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=617040.0, ans=0.07 2023-09-30 05:54:50,975 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=617040.0, ans=0.05 2023-09-30 05:54:52,059 INFO [train.py:1039] (3/4) Epoch 18, batch 2250, loss[loss=0.1817, simple_loss=0.258, pruned_loss=0.05274, over 23980.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.255, pruned_loss=0.05263, over 4724521.75 frames. ], batch size: 86, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:54:52,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:54:53,742 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 05:54:55,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:54:57,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:04,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 05:55:06,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 05:55:07,573 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.81 vs. limit=15.0 2023-09-30 05:55:10,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:10,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:11,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 05:55:14,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 05:55:14,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:14,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:55:17,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 05:55:19,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:55:19,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:20,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.80 vs. limit=22.5 2023-09-30 05:55:20,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 05:55:26,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:26,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=617173.3333333334, ans=0.125 2023-09-30 05:55:27,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 05:55:29,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 05:55:30,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 05:55:30,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:55:34,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:55:36,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=617173.3333333334, ans=0.125 2023-09-30 05:55:38,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.min_positive, batch_count=617173.3333333334, ans=0.025 2023-09-30 05:55:39,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:41,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:55:42,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:55:42,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:55:44,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:55:45,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:55:50,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:55:53,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 05:55:57,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=617306.6666666666, ans=0.125 2023-09-30 05:55:59,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 05:55:59,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 05:55:59,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 05:56:04,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:56:07,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 05:56:07,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 05:56:07,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:09,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 05:56:12,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 05:56:14,355 INFO [train.py:1039] (3/4) Epoch 18, batch 2300, loss[loss=0.174, simple_loss=0.2582, pruned_loss=0.0449, over 24541.00 frames. ], tot_loss[loss=0.1811, simple_loss=0.2557, pruned_loss=0.05325, over 4721955.37 frames. ], batch size: 71, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:56:14,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:56:14,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:56:20,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:56:23,524 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 05:56:26,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:27,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=617373.3333333334, ans=0.125 2023-09-30 05:56:33,608 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.613e+02 1.928e+02 2.258e+02 2.853e+02 4.796e+02, threshold=4.517e+02, percent-clipped=2.0 2023-09-30 05:56:33,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:56:33,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 05:56:33,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:56:33,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:33,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 05:56:35,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 05:56:38,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:56:38,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:56:39,364 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=12.07 vs. limit=15.0 2023-09-30 05:56:40,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 05:56:42,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 05:56:47,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:56:52,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 05:56:53,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:56:56,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 05:57:00,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:03,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:57:05,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 05:57:05,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:57:05,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 05:57:06,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=617573.3333333334, ans=0.0 2023-09-30 05:57:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 05:57:10,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:10,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:10,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:57:12,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:12,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 05:57:12,247 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 05:57:13,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 05:57:13,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 05:57:13,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:13,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 05:57:22,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:57:23,252 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.89 vs. limit=15.0 2023-09-30 05:57:27,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:57:30,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:57:30,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:57:31,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 05:57:35,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 05:57:35,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:57:35,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 05:57:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 05:57:37,206 INFO [train.py:1039] (3/4) Epoch 18, batch 2350, loss[loss=0.193, simple_loss=0.2523, pruned_loss=0.06687, over 23412.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2571, pruned_loss=0.05428, over 4713665.13 frames. ], batch size: 285, lr: 5.74e-03, grad_scale: 8.0 2023-09-30 05:57:42,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:57:42,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 05:57:47,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 05:57:49,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 05:57:52,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:57:52,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:57:54,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:57:56,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 05:57:57,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:58:01,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 05:58:01,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=617773.3333333334, ans=0.1 2023-09-30 05:58:02,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 05:58:06,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:58:06,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 05:58:11,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 05:58:11,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=617840.0, ans=0.0 2023-09-30 05:58:13,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 05:58:14,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 05:58:14,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 05:58:14,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:16,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 05:58:19,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 05:58:22,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 05:58:22,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 05:58:25,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:58:25,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 05:58:26,831 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-30 05:58:30,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 05:58:31,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 05:58:34,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 05:58:34,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 05:58:39,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=617906.6666666666, ans=0.07 2023-09-30 05:58:40,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 05:58:43,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 05:58:44,662 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.08 vs. limit=15.0 2023-09-30 05:58:45,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 05:58:45,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 05:58:45,409 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 05:58:45,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 05:58:48,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 05:58:50,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 05:58:56,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 05:58:59,657 INFO [train.py:1039] (3/4) Epoch 18, batch 2400, loss[loss=0.1948, simple_loss=0.2848, pruned_loss=0.05241, over 24317.00 frames. ], tot_loss[loss=0.1829, simple_loss=0.2566, pruned_loss=0.05456, over 4708014.79 frames. ], batch size: 74, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 05:58:59,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 05:59:00,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=618040.0, ans=0.0 2023-09-30 05:59:02,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 05:59:03,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 05:59:04,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 05:59:13,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 05:59:13,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 05:59:15,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 05:59:17,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 05:59:17,792 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=14.16 vs. limit=15.0 2023-09-30 05:59:18,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:18,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 05:59:20,009 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.950e+02 2.202e+02 2.533e+02 3.814e+02, threshold=4.404e+02, percent-clipped=0.0 2023-09-30 05:59:25,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:26,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 05:59:31,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 05:59:34,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 05:59:36,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=618173.3333333334, ans=0.125 2023-09-30 05:59:38,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 05:59:42,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 05:59:46,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 05:59:46,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 05:59:46,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 05:59:58,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:01,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:03,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:06,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:00:06,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:00:06,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:00:06,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:06,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:06,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:00:11,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:11,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:00:11,522 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=618306.6666666666, ans=0.1 2023-09-30 06:00:12,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 06:00:12,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 06:00:14,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:00:16,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:00:16,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 06:00:16,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 06:00:16,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 06:00:16,653 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 06:00:18,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 06:00:18,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=618306.6666666666, ans=0.0 2023-09-30 06:00:20,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:00:20,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:21,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:23,291 INFO [train.py:1039] (3/4) Epoch 18, batch 2450, loss[loss=0.1952, simple_loss=0.2751, pruned_loss=0.05764, over 23991.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2554, pruned_loss=0.05355, over 4712810.59 frames. ], batch size: 80, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:00:23,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 06:00:24,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:00:25,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:00:28,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:00:28,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:00:33,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:33,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:00:35,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 06:00:38,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:00:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:43,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:00:43,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:00:43,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:00:44,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 06:00:45,440 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.78 vs. limit=15.0 2023-09-30 06:00:50,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:00:51,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:00:53,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:00:56,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:00:56,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:00:58,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:01:01,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 06:01:03,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:01:08,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:09,371 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=9.03 vs. limit=15.0 2023-09-30 06:01:10,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:01:11,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:11,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:01:12,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:13,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:01:13,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=618573.3333333334, ans=0.0 2023-09-30 06:01:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 06:01:20,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:01:20,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:01:24,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:01:24,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:01:31,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:01:31,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 06:01:32,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:01:32,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:01:32,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 06:01:34,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:01:34,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:01:39,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:01:41,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:01:43,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:01:46,483 INFO [train.py:1039] (3/4) Epoch 18, batch 2500, loss[loss=0.1865, simple_loss=0.249, pruned_loss=0.06194, over 23535.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2546, pruned_loss=0.0529, over 4717119.48 frames. ], batch size: 256, lr: 5.74e-03, grad_scale: 16.0 2023-09-30 06:01:46,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 06:01:48,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:01:54,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:03,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:02:04,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:02:04,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:02:04,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 06:02:06,277 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.482e+02 1.833e+02 2.012e+02 2.316e+02 3.261e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 06:02:13,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:02:13,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:14,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:02:14,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:02:14,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 06:02:18,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:18,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:19,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 06:02:19,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:20,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 06:02:20,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:24,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:02:26,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:02:26,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=618840.0, ans=0.125 2023-09-30 06:02:27,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:02:29,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 06:02:31,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:02:32,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:02:38,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:41,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:02:45,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:02:50,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:02:53,026 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.72 vs. limit=10.0 2023-09-30 06:02:53,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 06:02:53,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:02:53,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:02:56,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:02:56,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:02:56,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 06:02:56,605 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 06:02:56,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 06:03:01,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:04,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 06:03:04,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 06:03:04,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:03:04,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 06:03:09,324 INFO [train.py:1039] (3/4) Epoch 18, batch 2550, loss[loss=0.1674, simple_loss=0.2471, pruned_loss=0.04381, over 24619.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.255, pruned_loss=0.05272, over 4719286.74 frames. ], batch size: 60, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:03:09,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 06:03:11,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:11,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:03:13,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:03:16,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:03:17,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 06:03:17,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=619040.0, ans=0.125 2023-09-30 06:03:18,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:03:21,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 06:03:22,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=619040.0, ans=0.125 2023-09-30 06:03:23,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:03:26,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:28,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:03:28,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 06:03:29,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:03:29,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:30,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=619106.6666666666, ans=0.125 2023-09-30 06:03:31,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:03:33,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:03:34,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 06:03:34,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:03:34,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:34,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 06:03:35,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=619106.6666666666, ans=0.125 2023-09-30 06:03:43,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=619173.3333333334, ans=0.0 2023-09-30 06:03:46,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=619173.3333333334, ans=0.0 2023-09-30 06:03:46,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=619173.3333333334, ans=0.125 2023-09-30 06:03:48,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:03:48,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=619173.3333333334, ans=0.09899494936611666 2023-09-30 06:03:53,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:03:53,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:03:53,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:03:55,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:03:57,409 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.35 vs. limit=22.5 2023-09-30 06:04:01,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:04:02,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.71 vs. limit=15.0 2023-09-30 06:04:05,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.27 vs. limit=10.0 2023-09-30 06:04:05,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=5.49 vs. limit=15.0 2023-09-30 06:04:06,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:04:06,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:04:06,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:04:07,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:04:07,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:04:12,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:12,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:16,906 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.80 vs. limit=15.0 2023-09-30 06:04:17,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:04:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 06:04:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:04:19,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:04:20,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:04:20,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:04:22,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:31,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:04:32,409 INFO [train.py:1039] (3/4) Epoch 18, batch 2600, loss[loss=0.1866, simple_loss=0.2696, pruned_loss=0.05177, over 23444.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2552, pruned_loss=0.05257, over 4711461.49 frames. ], batch size: 93, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:04:32,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:04:35,682 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 06:04:35,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=619373.3333333334, ans=0.125 2023-09-30 06:04:38,651 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 06:04:38,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:04:40,702 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 06:04:40,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 06:04:40,872 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 06:04:41,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=619373.3333333334, ans=0.1 2023-09-30 06:04:41,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=619373.3333333334, ans=0.125 2023-09-30 06:04:44,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:04:44,787 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 06:04:46,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 06:04:47,868 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 06:04:49,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:04:51,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 06:04:51,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.32 vs. limit=15.0 2023-09-30 06:04:52,580 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.862e+02 2.048e+02 2.291e+02 3.453e+02, threshold=4.097e+02, percent-clipped=0.0 2023-09-30 06:04:52,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 06:04:54,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:04:54,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 06:04:55,950 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 06:04:57,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 06:05:04,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:04,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:04,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:04,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 06:05:07,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:05:11,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=619506.6666666666, ans=0.125 2023-09-30 06:05:15,981 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 06:05:24,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:24,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:25,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 06:05:25,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:25,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:05:27,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 06:05:27,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_ff3.min_abs, batch_count=619573.3333333334, ans=0.2 2023-09-30 06:05:29,491 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.21 vs. limit=10.0 2023-09-30 06:05:30,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:05:30,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:05:33,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:37,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=619640.0, ans=0.1 2023-09-30 06:05:38,582 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 06:05:38,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:05:38,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:05:44,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:05:45,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:05:45,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 06:05:45,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:05:50,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:05:50,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:05:54,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 06:05:55,846 INFO [train.py:1039] (3/4) Epoch 18, batch 2650, loss[loss=0.1743, simple_loss=0.2645, pruned_loss=0.04203, over 24441.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2551, pruned_loss=0.05257, over 4720218.74 frames. ], batch size: 69, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:05:56,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:05:57,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:06:00,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 06:06:00,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:02,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:06:02,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 06:06:02,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:08,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:06:10,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:06:12,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:06:13,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 06:06:14,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:06:14,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:06:18,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 06:06:20,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 06:06:22,226 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=619773.3333333334, ans=0.0 2023-09-30 06:06:23,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:26,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 06:06:28,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:28,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 06:06:32,690 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.14 vs. limit=15.0 2023-09-30 06:06:33,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:33,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:06:33,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:34,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:39,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 06:06:39,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 06:06:40,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=619840.0, ans=0.0 2023-09-30 06:06:41,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:06:44,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 06:06:44,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:06:46,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:06:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:06:47,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:48,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:06:51,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:06:53,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:06:54,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:06:54,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:06:56,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:06:57,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:06:59,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:06:59,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:01,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:07:02,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:07:05,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:06,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:07:06,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:06,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 06:07:08,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=619973.3333333334, ans=0.07 2023-09-30 06:07:10,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:07:11,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:14,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:15,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:16,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:07:16,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:17,505 INFO [train.py:1039] (3/4) Epoch 18, batch 2700, loss[loss=0.1541, simple_loss=0.2335, pruned_loss=0.0373, over 24525.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2564, pruned_loss=0.05316, over 4720583.57 frames. ], batch size: 60, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:07:19,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:19,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 06:07:20,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:07:25,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:07:26,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:07:26,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:07:28,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:07:28,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:07:28,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:07:28,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=620040.0, ans=0.1 2023-09-30 06:07:29,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:07:29,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 06:07:29,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:07:31,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:07:33,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:07:35,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:07:38,509 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.935e+02 2.170e+02 2.390e+02 3.266e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 06:07:38,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:07:40,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 06:07:40,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:07:44,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.60 vs. limit=15.0 2023-09-30 06:07:47,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:07:47,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:07:50,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.28 vs. limit=15.0 2023-09-30 06:07:52,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:07:53,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:07:53,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:07:53,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:07:55,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:01,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:01,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:08:01,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:04,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:04,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:08:14,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:08:16,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:08:18,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:08:18,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:21,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:22,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:22,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:08:25,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:27,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:08:27,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:08:30,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:08:32,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:32,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:08:33,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.82 vs. limit=15.0 2023-09-30 06:08:36,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 06:08:36,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:36,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=620306.6666666666, ans=0.125 2023-09-30 06:08:39,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:08:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 06:08:41,310 INFO [train.py:1039] (3/4) Epoch 18, batch 2750, loss[loss=0.1762, simple_loss=0.2482, pruned_loss=0.05209, over 23756.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2562, pruned_loss=0.05327, over 4716329.66 frames. ], batch size: 135, lr: 5.73e-03, grad_scale: 16.0 2023-09-30 06:08:41,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 06:08:41,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:08:45,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:08:46,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:08:49,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:49,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:08:49,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:52,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:08:54,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:08:55,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:08:55,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:08:55,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 06:08:55,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:08:55,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:09:01,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 06:09:03,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:09:03,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:05,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:07,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:09:07,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:09:07,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:09:09,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:10,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:15,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:09:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:09:15,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:09:17,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:18,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:09:25,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:09:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:09:29,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:32,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:09:32,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:09:32,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:09:40,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:09:40,534 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=620573.3333333334, ans=0.0 2023-09-30 06:09:41,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:09:41,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 06:09:46,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:09:47,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 06:09:53,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:09:54,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:09:56,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 06:09:58,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:09:59,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:09:59,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 06:09:59,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:10:02,952 INFO [train.py:1039] (3/4) Epoch 18, batch 2800, loss[loss=0.1693, simple_loss=0.2493, pruned_loss=0.04471, over 24431.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2543, pruned_loss=0.05298, over 4704734.79 frames. ], batch size: 63, lr: 5.73e-03, grad_scale: 32.0 2023-09-30 06:10:03,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:10:03,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:03,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:04,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 06:10:04,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:06,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:06,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:07,844 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 06:10:07,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 06:10:11,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:13,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:10:13,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:10:18,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:10:20,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 06:10:21,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:10:23,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.604e+02 1.862e+02 2.010e+02 2.282e+02 3.813e+02, threshold=4.021e+02, percent-clipped=0.0 2023-09-30 06:10:23,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 06:10:24,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:24,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:10:24,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:28,933 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=620773.3333333334, ans=0.125 2023-09-30 06:10:30,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:30,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:10:30,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:10:31,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:10:39,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=620840.0, ans=0.5 2023-09-30 06:10:40,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:10:42,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:10:45,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:10:47,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:10:47,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:10:54,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:10:54,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 06:10:55,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:10:56,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:10:56,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:10:59,710 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=620906.6666666666, ans=0.1 2023-09-30 06:11:01,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:01,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:04,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:11:06,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:11:06,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:06,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:11:07,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:11:09,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:11:10,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:11:10,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 06:11:10,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:11,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=620973.3333333334, ans=0.125 2023-09-30 06:11:12,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:11:12,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:15,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 06:11:15,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:15,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:11:17,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:11:20,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 06:11:25,718 INFO [train.py:1039] (3/4) Epoch 18, batch 2850, loss[loss=0.1988, simple_loss=0.2778, pruned_loss=0.05985, over 23740.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2539, pruned_loss=0.05243, over 4712961.05 frames. ], batch size: 85, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:11:27,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:11:27,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:11:28,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:11:29,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=621040.0, ans=0.125 2023-09-30 06:11:30,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:34,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:11:34,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:11:35,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:11:38,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=621040.0, ans=0.125 2023-09-30 06:11:39,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:11:40,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:11:42,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:11:42,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 06:11:44,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=621106.6666666666, ans=0.125 2023-09-30 06:11:48,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 06:11:48,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:11:50,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 06:11:50,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:11:52,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=621106.6666666666, ans=0.125 2023-09-30 06:11:55,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 06:11:55,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 06:11:56,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:11,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:12,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:12:14,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:12:14,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:12:14,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:12:17,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:12:17,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 06:12:19,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:12:19,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:19,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:12:19,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:22,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:22,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:12:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:25,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:12:28,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:12:28,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:29,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:31,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:12:37,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:12:37,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=621306.6666666666, ans=0.125 2023-09-30 06:12:40,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 06:12:40,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 06:12:42,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:12:43,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:43,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 06:12:44,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:12:44,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:12:46,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:46,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:12:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 06:12:47,639 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 06:12:47,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:12:47,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:12:49,201 INFO [train.py:1039] (3/4) Epoch 18, batch 2900, loss[loss=0.2029, simple_loss=0.2681, pruned_loss=0.06882, over 23675.00 frames. ], tot_loss[loss=0.179, simple_loss=0.254, pruned_loss=0.05199, over 4721690.42 frames. ], batch size: 232, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:12:52,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:12:53,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:12:53,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:12:55,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 06:12:58,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:12:58,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 06:13:00,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 06:13:01,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:13:01,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:13:03,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:05,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:13:08,431 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.787e+02 2.109e+02 2.458e+02 3.664e+02, threshold=4.218e+02, percent-clipped=0.0 2023-09-30 06:13:10,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:13:10,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:13:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:13:13,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 06:13:13,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:13:16,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:17,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 06:13:19,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 06:13:22,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:13:22,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 06:13:22,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:13:25,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:13:25,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 06:13:28,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:13:28,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:13:33,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:13:36,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:13:37,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 06:13:37,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 06:13:37,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:13:43,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:13:48,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 06:13:50,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:13:54,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:14:02,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:14:02,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:14:03,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 06:14:08,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:08,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 06:14:08,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:08,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:14:10,971 INFO [train.py:1039] (3/4) Epoch 18, batch 2950, loss[loss=0.1762, simple_loss=0.2677, pruned_loss=0.04236, over 24327.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.255, pruned_loss=0.05179, over 4728360.81 frames. ], batch size: 74, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:14:14,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:14:16,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 06:14:18,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:18,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=621706.6666666666, ans=0.125 2023-09-30 06:14:20,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:20,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:14:22,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:14:23,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 06:14:25,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 06:14:25,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:14:25,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:14:32,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:33,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:14:36,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:14:38,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:41,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:14:41,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:14:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:14:44,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:14:46,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 06:14:48,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=621840.0, ans=0.1 2023-09-30 06:14:53,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 06:14:53,364 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 06:14:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:14:56,278 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 06:14:58,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 06:14:58,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:14:58,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:14:58,526 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 06:14:58,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:15:03,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 06:15:03,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:15:03,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:15:06,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:08,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:15:08,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:08,581 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 06:15:09,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:15:10,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 06:15:14,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:16,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:15:16,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 06:15:16,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:15:19,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 06:15:22,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:23,702 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=7.14 vs. limit=15.0 2023-09-30 06:15:24,769 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:15:25,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:15:25,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:15:28,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:15:28,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:15:29,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:15:31,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:31,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:15:31,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:15:33,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:15:34,751 INFO [train.py:1039] (3/4) Epoch 18, batch 3000, loss[loss=0.1527, simple_loss=0.237, pruned_loss=0.03427, over 24451.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2561, pruned_loss=0.05291, over 4718742.64 frames. ], batch size: 63, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:15:34,751 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 06:15:43,902 INFO [zipformer.py:1853] (3/4) name=encoder.encoders.5.encoder.layers.1.self_attn_weights, attn_weights_entropy = tensor([3.4895, 3.0130, 4.1155, 3.9111], device='cuda:3') 2023-09-30 06:15:49,342 INFO [train.py:1071] (3/4) Epoch 18, validation: loss=0.3403, simple_loss=0.2856, pruned_loss=0.1975, over 1125622.00 frames. 2023-09-30 06:15:49,343 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 06:15:49,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:15:51,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:51,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 06:15:52,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:15:55,799 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=7.01 vs. limit=15.0 2023-09-30 06:15:56,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:15:56,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:16:01,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.65 vs. limit=8.0 2023-09-30 06:16:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 06:16:01,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 06:16:03,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:16:03,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:16:03,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 06:16:04,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:09,653 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.852e+02 2.145e+02 2.482e+02 3.954e+02, threshold=4.290e+02, percent-clipped=0.0 2023-09-30 06:16:12,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:16:22,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:16:27,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=622173.3333333334, ans=0.125 2023-09-30 06:16:29,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 06:16:31,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:16:34,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:16:36,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:16:36,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:16:37,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:37,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 06:16:38,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 06:16:41,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:16:41,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:16:44,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:16:44,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:44,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:16:44,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:16:49,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:16:49,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:16:49,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:16:50,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:16:53,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 06:16:54,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:16:54,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:16:56,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:16:59,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:00,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:02,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 06:17:02,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 06:17:02,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:03,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 06:17:04,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:17:07,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 06:17:10,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:12,318 INFO [train.py:1039] (3/4) Epoch 18, batch 3050, loss[loss=0.1569, simple_loss=0.2371, pruned_loss=0.03836, over 24564.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2567, pruned_loss=0.05362, over 4711568.05 frames. ], batch size: 60, lr: 5.72e-03, grad_scale: 32.0 2023-09-30 06:17:12,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:17:12,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 06:17:13,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 06:17:13,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:17:14,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=622373.3333333334, ans=0.125 2023-09-30 06:17:15,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:17:15,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:17:15,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:17:16,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:16,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:17:20,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 06:17:22,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:17:25,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:25,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:17:30,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:33,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 06:17:38,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=622440.0, ans=0.2 2023-09-30 06:17:40,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 06:17:40,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 06:17:41,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:17:43,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:17:44,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=622506.6666666666, ans=0.0 2023-09-30 06:17:47,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:48,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:48,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:53,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:17:53,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:17:53,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:17:54,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:17:54,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:17:56,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:17:58,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:17:58,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=622506.6666666666, ans=0.125 2023-09-30 06:18:01,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:02,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 06:18:02,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:18:03,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:18:05,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:18:05,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:18:05,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:06,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:10,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:18:10,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:17,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:17,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:18:17,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:18:22,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:22,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:18:22,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:18:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 06:18:25,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:18:26,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:27,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 06:18:29,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:30,608 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.70 vs. limit=15.0 2023-09-30 06:18:34,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:18:35,977 INFO [train.py:1039] (3/4) Epoch 18, batch 3100, loss[loss=0.1558, simple_loss=0.2299, pruned_loss=0.04081, over 24459.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.2561, pruned_loss=0.05353, over 4700911.80 frames. ], batch size: 58, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:18:37,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:18:39,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:18:40,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 06:18:45,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 06:18:47,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 06:18:49,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:18:52,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:18:52,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:18:55,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:18:57,053 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.806e+02 2.048e+02 2.293e+02 3.321e+02, threshold=4.096e+02, percent-clipped=0.0 2023-09-30 06:18:58,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:04,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 06:19:08,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:19:08,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:08,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:09,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:19:09,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=622840.0, ans=0.125 2023-09-30 06:19:10,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:19:13,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:19:13,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 06:19:13,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:19:14,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:17,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 06:19:18,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:19:22,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:19:23,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 06:19:24,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 06:19:25,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:25,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:19:28,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:28,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:28,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:19:30,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:19:30,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:19:33,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:19:33,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:19:33,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:33,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:19:33,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=622906.6666666666, ans=0.0 2023-09-30 06:19:39,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:19:40,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 06:19:43,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:19:43,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 06:19:44,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=622973.3333333334, ans=0.0 2023-09-30 06:19:45,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:19:45,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:19:45,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 06:19:57,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 06:19:58,769 INFO [train.py:1039] (3/4) Epoch 18, batch 3150, loss[loss=0.1855, simple_loss=0.2653, pruned_loss=0.05283, over 23309.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2547, pruned_loss=0.05304, over 4693656.22 frames. ], batch size: 93, lr: 5.72e-03, grad_scale: 16.0 2023-09-30 06:20:00,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:00,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:03,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:20:03,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:20:05,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 06:20:05,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:05,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:20:05,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=22.70 vs. limit=22.5 2023-09-30 06:20:06,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 06:20:08,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:09,885 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 06:20:14,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 06:20:14,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:20:16,383 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 06:20:18,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 06:20:18,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 06:20:20,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 06:20:20,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 06:20:20,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:20,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:21,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:20:23,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 06:20:27,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:20:27,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:30,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:20:33,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 06:20:33,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:20:36,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:20:38,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:20:38,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 06:20:41,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 06:20:43,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:20:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:20:43,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:20:44,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:20:44,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:20:44,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:20:46,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:20:47,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 06:20:49,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:20:49,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:52,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:20:52,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:20:52,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 06:20:53,108 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:20:53,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=623240.0, ans=0.2 2023-09-30 06:20:54,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:20:56,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 06:20:56,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:20:58,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 06:20:58,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 06:21:01,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:21:01,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:04,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 06:21:04,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 06:21:05,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:21:08,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:21:10,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:10,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:21:15,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:21:16,159 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=7.53 vs. limit=12.0 2023-09-30 06:21:16,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:18,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 06:21:21,423 INFO [train.py:1039] (3/4) Epoch 18, batch 3200, loss[loss=0.193, simple_loss=0.2723, pruned_loss=0.05679, over 23867.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2538, pruned_loss=0.05257, over 4702473.57 frames. ], batch size: 86, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:21:23,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:21:23,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 06:21:28,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:30,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:21:30,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 06:21:34,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:21:39,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:21:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:21:43,655 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.425e+02 1.833e+02 1.996e+02 2.319e+02 3.127e+02, threshold=3.992e+02, percent-clipped=0.0 2023-09-30 06:21:45,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=623440.0, ans=0.0 2023-09-30 06:21:50,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:21:59,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 06:22:00,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=623506.6666666666, ans=0.125 2023-09-30 06:22:01,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:22:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 06:22:05,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:22:09,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:22:09,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:22:10,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:22:15,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 06:22:17,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 06:22:20,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 06:22:21,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 06:22:24,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:22:32,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:22:32,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:22:32,952 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 06:22:32,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:22:37,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.44 vs. limit=15.0 2023-09-30 06:22:37,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:22:39,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 06:22:41,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 06:22:41,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 06:22:43,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 06:22:44,457 INFO [train.py:1039] (3/4) Epoch 18, batch 3250, loss[loss=0.1819, simple_loss=0.2587, pruned_loss=0.05253, over 24428.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2544, pruned_loss=0.05284, over 4709352.49 frames. ], batch size: 63, lr: 5.71e-03, grad_scale: 32.0 2023-09-30 06:22:44,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:22:46,305 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.43 vs. limit=15.0 2023-09-30 06:22:48,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:22:48,479 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 06:22:48,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:22:48,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:22:49,200 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.81 vs. limit=6.0 2023-09-30 06:22:50,060 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 06:22:53,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:22:56,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:22:57,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=623706.6666666666, ans=0.1 2023-09-30 06:23:05,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:05,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 06:23:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:07,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:23:07,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:08,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:08,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:23:13,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:23:13,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:13,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:13,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:23:17,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:19,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:23:21,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:21,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:23:22,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:23:24,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:23:24,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:28,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.07 vs. limit=10.0 2023-09-30 06:23:28,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 06:23:30,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:23:30,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:23:32,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:32,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:23:37,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:23:46,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:23:46,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:46,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 06:23:46,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:23:46,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:23:46,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:23:50,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 06:23:51,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 06:23:51,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:23:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:23:55,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:23:56,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 06:23:57,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:24:00,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:00,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:02,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 06:24:03,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:06,764 INFO [train.py:1039] (3/4) Epoch 18, batch 3300, loss[loss=0.2063, simple_loss=0.2639, pruned_loss=0.07432, over 22729.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2544, pruned_loss=0.05279, over 4712667.13 frames. ], batch size: 322, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:24:06,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:24:06,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 06:24:09,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:24:09,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 06:24:12,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 06:24:13,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 06:24:13,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:16,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624040.0, ans=0.1 2023-09-30 06:24:18,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:24:19,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:24:19,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:24:22,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:24:26,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:26,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:24:26,651 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=624106.6666666666, ans=0.0 2023-09-30 06:24:30,037 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.896e+02 2.095e+02 2.468e+02 4.456e+02, threshold=4.189e+02, percent-clipped=2.0 2023-09-30 06:24:33,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 06:24:33,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:24:33,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:24:35,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:36,805 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 06:24:36,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:24:37,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:24:38,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:24:38,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:24:38,540 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 06:24:43,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:24:43,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:24:44,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:44,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 06:24:46,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 06:24:46,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:24:47,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:24:49,374 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 06:24:51,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 06:24:52,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:24:54,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 06:24:54,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=624240.0, ans=0.05 2023-09-30 06:24:56,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:24:59,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:24:59,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:03,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:03,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:03,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:25:03,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:25:07,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:25:07,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:07,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:25:08,830 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 06:25:08,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 06:25:12,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:25:12,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:12,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:15,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:25:15,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:16,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:25:16,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:16,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:25:18,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:25:19,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:25:22,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 06:25:22,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:22,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:22,994 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=12.83 vs. limit=15.0 2023-09-30 06:25:23,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:25:23,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624306.6666666666, ans=0.1 2023-09-30 06:25:25,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:25:26,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:28,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:25:28,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:30,152 INFO [train.py:1039] (3/4) Epoch 18, batch 3350, loss[loss=0.2324, simple_loss=0.2909, pruned_loss=0.08694, over 19376.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2562, pruned_loss=0.05342, over 4710629.63 frames. ], batch size: 388, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:25:30,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624373.3333333334, ans=0.1 2023-09-30 06:25:33,194 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=7.01 vs. limit=15.0 2023-09-30 06:25:33,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:25:35,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:25:35,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:25:39,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:39,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=624373.3333333334, ans=0.125 2023-09-30 06:25:41,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:25:44,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:44,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:25:45,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 06:25:47,209 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 06:25:47,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:25:51,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 06:25:51,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 06:25:53,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:25:53,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:25:54,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:25:56,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 06:25:56,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:25:56,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:25:59,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:00,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=624440.0, ans=0.1 2023-09-30 06:26:00,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=624440.0, ans=0.125 2023-09-30 06:26:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:02,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:03,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:26:06,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:09,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:09,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:10,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624506.6666666666, ans=0.1 2023-09-30 06:26:13,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:26:15,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:26:17,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:18,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:21,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:23,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 06:26:23,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=624573.3333333334, ans=0.125 2023-09-30 06:26:24,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:26:24,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 06:26:24,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:26:27,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 06:26:27,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:28,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=624573.3333333334, ans=0.125 2023-09-30 06:26:29,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:26:37,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:39,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 06:26:39,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:26:40,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:26:42,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:26:48,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:26:51,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 06:26:51,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:26:51,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:26:53,301 INFO [train.py:1039] (3/4) Epoch 18, batch 3400, loss[loss=0.1839, simple_loss=0.2555, pruned_loss=0.05611, over 23546.00 frames. ], tot_loss[loss=0.1812, simple_loss=0.2563, pruned_loss=0.0531, over 4722845.65 frames. ], batch size: 106, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:26:53,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:26:53,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 06:26:54,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:26:54,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 06:26:56,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:26:56,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:26:58,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:26:58,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 06:27:02,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 06:27:02,854 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 06:27:02,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:04,677 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=624706.6666666666, ans=0.125 2023-09-30 06:27:07,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:27:07,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:27:09,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:10,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:27:15,396 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.30 vs. limit=15.0 2023-09-30 06:27:16,035 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.548e+02 1.809e+02 2.054e+02 2.312e+02 3.383e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 06:27:16,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:17,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 06:27:21,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=624773.3333333334, ans=0.125 2023-09-30 06:27:22,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:27:23,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=624773.3333333334, ans=0.0 2023-09-30 06:27:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:24,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:27,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 06:27:35,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:27:38,602 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=624840.0, ans=0.1 2023-09-30 06:27:38,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=624840.0, ans=0.125 2023-09-30 06:27:39,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 06:27:45,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:47,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:27:48,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 06:27:48,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:27:50,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:27:50,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:27:50,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:27:53,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:27:58,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:28:00,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:28:04,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:07,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 06:28:13,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:28:14,626 INFO [train.py:1039] (3/4) Epoch 18, batch 3450, loss[loss=0.173, simple_loss=0.2475, pruned_loss=0.04922, over 23213.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2563, pruned_loss=0.05333, over 4710317.74 frames. ], batch size: 119, lr: 5.71e-03, grad_scale: 16.0 2023-09-30 06:28:16,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 06:28:21,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 06:28:21,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:28:22,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=625040.0, ans=0.125 2023-09-30 06:28:23,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:28:23,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 06:28:23,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:28:27,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:28:27,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=625040.0, ans=0.09899494936611666 2023-09-30 06:28:28,085 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.68 vs. limit=15.0 2023-09-30 06:28:32,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:28:32,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:34,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:28:34,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:37,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:28:42,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 06:28:46,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=625106.6666666666, ans=0.1 2023-09-30 06:28:48,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 06:28:48,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:28:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:28:52,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:28:57,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 06:28:59,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:29:04,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:05,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:29:07,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:29:08,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:29:09,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=625240.0, ans=0.1 2023-09-30 06:29:12,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 06:29:12,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:12,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:29:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:29:18,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 06:29:20,661 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=625306.6666666666, ans=0.2 2023-09-30 06:29:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:29:28,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:29:28,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:33,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:37,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:29:37,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:29:38,605 INFO [train.py:1039] (3/4) Epoch 18, batch 3500, loss[loss=0.1952, simple_loss=0.2564, pruned_loss=0.06696, over 23829.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2547, pruned_loss=0.05307, over 4693812.23 frames. ], batch size: 164, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:29:39,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:29:40,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:29:45,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:46,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=625373.3333333334, ans=0.0 2023-09-30 06:29:47,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:29:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 06:29:50,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:29:53,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:29:55,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:29:55,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 06:30:01,128 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.444e+02 1.872e+02 2.058e+02 2.368e+02 3.255e+02, threshold=4.116e+02, percent-clipped=0.0 2023-09-30 06:30:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:30:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:30:01,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=625440.0, ans=0.125 2023-09-30 06:30:03,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:30:03,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:03,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:30:03,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:05,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:05,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 06:30:07,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=625440.0, ans=0.125 2023-09-30 06:30:09,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:11,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:30:12,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:14,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:16,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 06:30:16,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:30:17,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:30:20,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:30:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:24,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:30:24,451 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:25,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 06:30:27,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 06:30:27,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 06:30:29,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:30:30,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:30,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:30,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:30:33,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:30:34,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:30:41,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:30:42,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 06:30:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 06:30:42,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:30:46,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:30:46,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:46,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:30:47,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=625640.0, ans=0.1 2023-09-30 06:30:49,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 06:30:50,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:30:52,317 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=625640.0, ans=0.0 2023-09-30 06:30:53,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:30:53,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 06:30:55,402 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=625640.0, ans=0.125 2023-09-30 06:30:56,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 06:30:59,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:00,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:31:00,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:01,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:02,355 INFO [train.py:1039] (3/4) Epoch 18, batch 3550, loss[loss=0.1691, simple_loss=0.2584, pruned_loss=0.03993, over 24435.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2532, pruned_loss=0.05319, over 4679667.82 frames. ], batch size: 69, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:31:04,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:31:08,002 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.76 vs. limit=15.0 2023-09-30 06:31:09,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=625706.6666666666, ans=0.1 2023-09-30 06:31:12,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:16,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 06:31:18,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:19,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:31:21,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:22,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:31:22,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:31:27,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:27,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:31:27,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:29,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:31:29,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:31:36,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:31:36,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:31:38,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:38,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:31:38,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:31:38,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 06:31:38,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:40,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:31:41,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 06:31:45,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=625840.0, ans=0.125 2023-09-30 06:31:48,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-09-30 06:31:48,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:50,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:31:50,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:31:52,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 06:31:54,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:31:55,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 06:31:57,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:31:58,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:31:58,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:32:02,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 06:32:03,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=10.23 vs. limit=15.0 2023-09-30 06:32:04,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:04,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=625906.6666666666, ans=0.125 2023-09-30 06:32:07,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:32:09,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 06:32:09,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:15,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:32:16,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 06:32:23,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 06:32:23,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:32:24,961 INFO [train.py:1039] (3/4) Epoch 18, batch 3600, loss[loss=0.2023, simple_loss=0.2732, pruned_loss=0.0657, over 23394.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2527, pruned_loss=0.05289, over 4670824.78 frames. ], batch size: 93, lr: 5.70e-03, grad_scale: 32.0 2023-09-30 06:32:25,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:32:25,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=626040.0, ans=0.125 2023-09-30 06:32:27,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:28,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:32:30,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:32:35,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:35,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:37,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:32:37,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:32:38,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:38,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 06:32:41,329 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.20 vs. limit=15.0 2023-09-30 06:32:43,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:32:43,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:46,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:46,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=626106.6666666666, ans=0.1 2023-09-30 06:32:47,992 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.865e+02 1.967e+02 2.260e+02 3.686e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 06:32:49,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:32:51,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:32:52,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:32:52,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 06:32:54,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:32:56,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:32:58,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:32:59,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:01,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:33:03,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:03,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 06:33:08,564 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=626173.3333333334, ans=0.05 2023-09-30 06:33:13,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:13,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=626240.0, ans=0.1 2023-09-30 06:33:14,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:33:15,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=626240.0, ans=0.125 2023-09-30 06:33:16,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 06:33:21,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:33:25,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:27,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:34,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:33:34,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:33:34,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 06:33:36,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 06:33:38,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 06:33:41,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:33:41,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:33:41,449 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=626306.6666666666, ans=0.09899494936611666 2023-09-30 06:33:42,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 06:33:44,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:33:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:33:44,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:33:45,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 06:33:47,876 INFO [train.py:1039] (3/4) Epoch 18, batch 3650, loss[loss=0.1743, simple_loss=0.2635, pruned_loss=0.04252, over 24459.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2536, pruned_loss=0.05271, over 4683251.05 frames. ], batch size: 69, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:33:48,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 06:33:49,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:33:51,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 06:33:55,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 06:33:57,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:34:00,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 06:34:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 06:34:07,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:07,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:34:08,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:34:13,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 06:34:13,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:34:14,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 06:34:14,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:34:14,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:14,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 06:34:14,956 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=626440.0, ans=0.0 2023-09-30 06:34:17,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:34:19,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:34:19,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:20,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:34:24,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 06:34:24,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 06:34:25,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:34:28,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 06:34:28,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:28,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:34:30,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=626506.6666666666, ans=0.125 2023-09-30 06:34:36,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:34:38,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:38,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:34:40,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:34:40,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:34:42,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:34:45,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:34:45,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:34:45,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:34:47,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=626573.3333333334, ans=0.07 2023-09-30 06:34:49,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:34:50,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:34:50,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:34:58,851 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 06:35:03,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:03,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:03,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:35:05,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:05,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:35:06,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:08,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 06:35:08,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:09,608 INFO [train.py:1039] (3/4) Epoch 18, batch 3700, loss[loss=0.1993, simple_loss=0.2695, pruned_loss=0.06456, over 23689.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2545, pruned_loss=0.05286, over 4694152.62 frames. ], batch size: 232, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:35:11,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:35:14,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:35:15,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:35:18,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:18,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 06:35:18,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:35:20,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:35:20,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:35:21,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=626706.6666666666, ans=0.125 2023-09-30 06:35:24,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:35:25,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=626773.3333333334, ans=0.0 2023-09-30 06:35:28,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:35:28,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:29,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:35:29,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:35:30,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=626773.3333333334, ans=0.125 2023-09-30 06:35:31,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:35:34,540 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 1.984e+02 2.156e+02 2.490e+02 5.109e+02, threshold=4.311e+02, percent-clipped=1.0 2023-09-30 06:35:34,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:35:34,894 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 06:35:42,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:35:42,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:35:45,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:35:45,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 06:35:45,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:35:50,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:52,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 06:35:54,174 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:54,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:35:57,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:35:57,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:35:59,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:36:04,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:06,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 06:36:06,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:06,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 06:36:10,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:36:12,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:36:13,301 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.93 vs. limit=22.5 2023-09-30 06:36:14,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:15,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 06:36:17,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:36:17,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:36:18,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:18,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:36:21,337 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.29 vs. limit=6.0 2023-09-30 06:36:21,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:36:24,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 06:36:25,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 06:36:25,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:36:25,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:27,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:36:28,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:36:30,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:36:31,893 INFO [train.py:1039] (3/4) Epoch 18, batch 3750, loss[loss=0.1764, simple_loss=0.2669, pruned_loss=0.04298, over 24309.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2563, pruned_loss=0.05335, over 4702139.50 frames. ], batch size: 74, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:36:32,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:36:34,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:36:35,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 06:36:37,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:36:40,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:36:41,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 06:36:42,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:36:43,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:45,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:36:46,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:36:50,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:36:51,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 06:36:53,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:36:57,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:37:00,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:00,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 06:37:01,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:04,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:37:08,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 06:37:08,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=627173.3333333334, ans=0.0 2023-09-30 06:37:11,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 06:37:13,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:37:15,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:37:15,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=627173.3333333334, ans=0.0 2023-09-30 06:37:16,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:37:21,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:21,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=15.00 vs. limit=15.0 2023-09-30 06:37:24,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 06:37:28,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 06:37:31,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:37:34,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:37:34,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:37:39,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:37:44,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 06:37:45,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:37:48,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:37:49,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:37:51,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:37:53,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=627373.3333333334, ans=0.95 2023-09-30 06:37:54,262 INFO [train.py:1039] (3/4) Epoch 18, batch 3800, loss[loss=0.1977, simple_loss=0.2756, pruned_loss=0.05989, over 24412.00 frames. ], tot_loss[loss=0.1813, simple_loss=0.2557, pruned_loss=0.05346, over 4694769.26 frames. ], batch size: 77, lr: 5.70e-03, grad_scale: 16.0 2023-09-30 06:37:59,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:38:04,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:06,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 06:38:06,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 06:38:07,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:09,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=627440.0, ans=0.1 2023-09-30 06:38:10,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:10,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:38:13,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 06:38:13,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:14,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:38:14,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=627440.0, ans=0.125 2023-09-30 06:38:15,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:38:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:38:17,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:17,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 06:38:19,114 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.827e+02 2.039e+02 2.367e+02 3.749e+02, threshold=4.078e+02, percent-clipped=0.0 2023-09-30 06:38:20,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 06:38:22,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:38:25,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:38:26,422 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=627506.6666666666, ans=0.2 2023-09-30 06:38:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:38:28,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:38:30,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:38:30,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:32,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:38:33,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:38:40,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:38:40,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 06:38:42,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:38:44,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=627573.3333333334, ans=0.0 2023-09-30 06:38:46,453 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=12.97 vs. limit=15.0 2023-09-30 06:38:48,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:38:55,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:38:56,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=627573.3333333334, ans=0.1 2023-09-30 06:38:57,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 06:38:59,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=627640.0, ans=0.2 2023-09-30 06:39:00,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 06:39:01,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:03,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:39:05,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:05,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 06:39:08,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 06:39:09,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 06:39:09,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:11,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:39:16,702 INFO [train.py:1039] (3/4) Epoch 18, batch 3850, loss[loss=0.1539, simple_loss=0.2322, pruned_loss=0.03783, over 24585.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2548, pruned_loss=0.05321, over 4694239.49 frames. ], batch size: 60, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:39:18,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:39:18,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:39:23,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:39:24,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 06:39:24,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:39:26,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:29,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:39:32,754 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:35,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 06:39:37,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 06:39:42,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:45,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:39:49,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:39:49,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:39:52,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:52,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:39:53,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:39:53,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:39:53,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:56,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:39:57,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:39:58,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:39:58,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 06:39:58,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 06:40:00,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:00,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:05,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:05,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 06:40:08,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 06:40:10,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:11,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 06:40:14,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 06:40:19,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:21,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:40:21,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=627973.3333333334, ans=0.025 2023-09-30 06:40:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:25,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 06:40:25,268 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=627973.3333333334, ans=0.95 2023-09-30 06:40:28,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 06:40:30,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:31,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:34,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:40:34,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 06:40:35,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:37,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:40:37,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 06:40:37,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:40:38,642 INFO [train.py:1039] (3/4) Epoch 18, batch 3900, loss[loss=0.164, simple_loss=0.2412, pruned_loss=0.04337, over 23370.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2541, pruned_loss=0.0529, over 4689994.39 frames. ], batch size: 119, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:40:38,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 06:40:38,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:38,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:40,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:40:42,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:43,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:40:43,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:40:43,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:40:45,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:40:45,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 06:40:45,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:48,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:50,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:50,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:40:52,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:40:53,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:40:53,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:40:55,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:40:57,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 06:40:57,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:40:59,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 06:41:00,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:41:00,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 06:41:02,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 06:41:03,818 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.888e+02 2.025e+02 2.251e+02 3.863e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 06:41:04,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=628106.6666666666, ans=0.0 2023-09-30 06:41:08,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:08,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:41:08,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:41:10,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:16,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:41:18,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:41:20,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628173.3333333334, ans=0.1 2023-09-30 06:41:21,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:41:21,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:22,156 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=628173.3333333334, ans=0.125 2023-09-30 06:41:23,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:41:26,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=628240.0, ans=0.0 2023-09-30 06:41:28,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:41:28,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:41:37,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 06:41:39,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:41:43,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=628306.6666666666, ans=0.125 2023-09-30 06:41:49,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:41:51,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:51,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 06:41:52,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 06:41:52,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 06:41:54,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 06:41:56,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:41:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 06:42:00,985 INFO [train.py:1039] (3/4) Epoch 18, batch 3950, loss[loss=0.1809, simple_loss=0.2658, pruned_loss=0.04801, over 24503.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.253, pruned_loss=0.05278, over 4681364.00 frames. ], batch size: 66, lr: 5.69e-03, grad_scale: 16.0 2023-09-30 06:42:04,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:42:06,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 06:42:06,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:42:06,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=628373.3333333334, ans=0.0 2023-09-30 06:42:09,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:42:09,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:42:12,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.53 vs. limit=15.0 2023-09-30 06:42:13,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=628373.3333333334, ans=0.0 2023-09-30 06:42:16,488 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 06:42:17,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:18,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 06:42:19,481 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 06:42:19,518 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:42:23,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:23,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:42:23,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:42:26,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 06:42:28,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:42:28,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:42:28,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:42:30,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:42:31,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 06:42:44,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:42:44,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:42:49,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 06:42:54,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 06:42:54,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 06:42:55,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:42:56,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:43:00,222 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=628573.3333333334, ans=0.125 2023-09-30 06:43:04,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:43:05,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:43:05,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:06,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:43:06,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 06:43:11,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:43:12,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:43:17,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 06:43:24,665 INFO [train.py:1039] (3/4) Epoch 18, batch 4000, loss[loss=0.1864, simple_loss=0.2561, pruned_loss=0.05834, over 23350.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.2544, pruned_loss=0.0534, over 4682097.88 frames. ], batch size: 105, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:43:27,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:36,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=628706.6666666666, ans=0.0 2023-09-30 06:43:37,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:42,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:43,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:43:44,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:43:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 06:43:44,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:43:45,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 06:43:45,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:43:45,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 06:43:47,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:43:48,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.566e+02 1.924e+02 2.193e+02 2.601e+02 4.615e+02, threshold=4.387e+02, percent-clipped=1.0 2023-09-30 06:43:50,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:43:52,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:43:52,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:43:52,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:43:52,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 06:43:54,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:43:56,118 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 06:43:57,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:43:57,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:43:59,432 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 06:44:00,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:44:00,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:12,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 06:44:12,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:44:14,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:44:15,912 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 06:44:16,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:44:18,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 06:44:18,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:44:18,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:20,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:44:21,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:44:21,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:44:22,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:44:23,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 06:44:25,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:44:26,606 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 06:44:33,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:44:36,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 06:44:37,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:44:37,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:38,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=628973.3333333334, ans=0.1 2023-09-30 06:44:38,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=628973.3333333334, ans=0.125 2023-09-30 06:44:38,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=628973.3333333334, ans=0.125 2023-09-30 06:44:39,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:44:39,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:44:43,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:44:45,404 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=6.78 vs. limit=15.0 2023-09-30 06:44:45,896 INFO [train.py:1039] (3/4) Epoch 18, batch 4050, loss[loss=0.2006, simple_loss=0.2635, pruned_loss=0.06883, over 23673.00 frames. ], tot_loss[loss=0.1806, simple_loss=0.255, pruned_loss=0.05308, over 4701746.47 frames. ], batch size: 232, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:44:48,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:44:48,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 06:44:48,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=629040.0, ans=0.125 2023-09-30 06:44:51,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:44:51,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:44:51,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=629040.0, ans=0.0 2023-09-30 06:44:51,930 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.35 vs. limit=15.0 2023-09-30 06:44:52,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:44:54,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:44:55,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:00,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:45:00,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.74 vs. limit=6.0 2023-09-30 06:45:02,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:03,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 06:45:07,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:45:07,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:45:11,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:13,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:45:16,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 06:45:17,998 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=629173.3333333334, ans=0.125 2023-09-30 06:45:19,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 06:45:19,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 06:45:22,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:45:31,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 06:45:31,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:45:34,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:36,426 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.97 vs. limit=6.0 2023-09-30 06:45:37,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:45:37,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:45:37,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:45:41,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:45:44,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 06:45:44,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:45:46,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:45:47,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 06:45:52,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:46:01,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 06:46:01,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:01,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:46:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 06:46:04,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 06:46:04,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:05,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=629306.6666666666, ans=0.125 2023-09-30 06:46:07,717 INFO [train.py:1039] (3/4) Epoch 18, batch 4100, loss[loss=0.2093, simple_loss=0.2714, pruned_loss=0.0736, over 22793.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2562, pruned_loss=0.05327, over 4704308.29 frames. ], batch size: 323, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:46:07,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:11,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:11,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:46:18,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 06:46:19,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 06:46:21,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 06:46:22,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 06:46:22,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:24,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:46:24,421 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 06:46:27,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:27,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:46:27,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:46:29,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:46:33,104 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.931e+02 2.115e+02 2.277e+02 3.051e+02, threshold=4.229e+02, percent-clipped=0.0 2023-09-30 06:46:34,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:46:36,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:46:36,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:46:36,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 06:46:37,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:46:37,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:46:37,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:38,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=629440.0, ans=0.1 2023-09-30 06:46:39,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:46:40,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 06:46:44,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:46:47,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 06:46:48,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:46:52,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:46:52,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 06:46:52,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:46:54,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:46:54,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:46:55,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 06:46:57,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:46:58,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:47:00,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 06:47:00,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:00,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:05,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:09,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:12,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:13,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=629640.0, ans=0.125 2023-09-30 06:47:14,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:47:22,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=629640.0, ans=0.0 2023-09-30 06:47:23,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:23,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:47:24,088 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.42 vs. limit=6.0 2023-09-30 06:47:26,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:47:29,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:47:30,945 INFO [train.py:1039] (3/4) Epoch 18, batch 4150, loss[loss=0.1657, simple_loss=0.2453, pruned_loss=0.04302, over 24504.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2561, pruned_loss=0.05284, over 4702193.54 frames. ], batch size: 63, lr: 5.69e-03, grad_scale: 32.0 2023-09-30 06:47:32,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:47:34,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:47:35,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:47:35,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:47:38,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 06:47:38,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:38,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 06:47:40,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 06:47:40,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 06:47:42,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:47:47,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:47:47,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:47:52,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:47:54,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:47:56,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:47:59,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:47:59,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:47:59,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=629773.3333333334, ans=0.125 2023-09-30 06:48:00,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 06:48:02,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:05,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:05,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 06:48:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 06:48:09,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:48:11,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 06:48:11,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:48:11,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:13,173 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 06:48:15,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:17,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:20,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 06:48:23,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:25,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:48:26,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 06:48:27,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=629906.6666666666, ans=0.125 2023-09-30 06:48:27,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=629906.6666666666, ans=0.0 2023-09-30 06:48:29,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:48:30,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 06:48:33,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:48:33,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:48:35,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:36,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 06:48:36,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:48:36,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 06:48:38,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 06:48:40,810 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.13 vs. limit=12.0 2023-09-30 06:48:41,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 06:48:41,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:41,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 06:48:41,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 06:48:42,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 06:48:43,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:48:43,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 06:48:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:48:46,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:48:46,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 06:48:46,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 06:48:53,164 INFO [train.py:1039] (3/4) Epoch 18, batch 4200, loss[loss=0.1841, simple_loss=0.2641, pruned_loss=0.05208, over 24061.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2551, pruned_loss=0.05291, over 4689724.94 frames. ], batch size: 86, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:48:53,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:48:56,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 06:48:56,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:49:00,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:02,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:49:03,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:03,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:49:04,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=630040.0, ans=0.0 2023-09-30 06:49:05,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 06:49:07,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 06:49:07,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:08,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:11,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.45 vs. limit=22.5 2023-09-30 06:49:11,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:49:12,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=630106.6666666666, ans=0.0 2023-09-30 06:49:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 06:49:16,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:17,451 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.533e+02 1.914e+02 2.122e+02 2.477e+02 4.078e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 06:49:17,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:17,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 06:49:17,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:49:19,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:19,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:49:19,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 06:49:21,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 06:49:25,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 06:49:26,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:49:29,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 06:49:31,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:49:32,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=630173.3333333334, ans=0.2 2023-09-30 06:49:33,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:49:34,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:49:37,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:49:37,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 06:49:37,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:49:37,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=630173.3333333334, ans=0.0 2023-09-30 06:49:38,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:49:44,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 06:49:46,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:49:52,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:49:55,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 06:49:58,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:03,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 06:50:04,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:06,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 06:50:11,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 06:50:14,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 06:50:15,761 INFO [train.py:1039] (3/4) Epoch 18, batch 4250, loss[loss=0.164, simple_loss=0.2426, pruned_loss=0.0427, over 19392.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2537, pruned_loss=0.05258, over 4682419.56 frames. ], batch size: 42, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:50:15,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 06:50:18,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:25,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 06:50:25,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 06:50:26,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:50:28,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:32,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:33,152 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.68 vs. limit=22.5 2023-09-30 06:50:36,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:36,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:39,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:50:39,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:50:40,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:42,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:42,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:44,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:50:45,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:50:47,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 06:50:47,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=630506.6666666666, ans=0.1 2023-09-30 06:50:50,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 06:50:51,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:53,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:50:53,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:50:53,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 06:50:53,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:50:54,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:50:58,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 06:50:59,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 06:51:03,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:04,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:06,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 06:51:06,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:51:08,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 06:51:10,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:51:12,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:51:15,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:15,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:51:18,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 06:51:20,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 06:51:20,445 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=630640.0, ans=0.125 2023-09-30 06:51:21,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:51:25,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:51:29,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:30,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:51:31,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=630640.0, ans=0.0 2023-09-30 06:51:32,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:51:33,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:34,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:51:36,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:51:36,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 06:51:37,583 INFO [train.py:1039] (3/4) Epoch 18, batch 4300, loss[loss=0.192, simple_loss=0.2574, pruned_loss=0.06327, over 23663.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2537, pruned_loss=0.05248, over 4693731.15 frames. ], batch size: 232, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:51:37,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:42,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:51:42,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:51:47,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:51:50,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=630706.6666666666, ans=0.125 2023-09-30 06:51:56,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:51:56,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 06:51:56,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:51:59,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:51:59,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 06:51:59,327 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 06:52:02,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 06:52:03,678 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.954e+02 2.321e+02 2.799e+02 4.498e+02, threshold=4.642e+02, percent-clipped=1.0 2023-09-30 06:52:05,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:07,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 06:52:07,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:52:09,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 06:52:09,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=630840.0, ans=0.1 2023-09-30 06:52:10,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:52:12,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 06:52:14,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 06:52:14,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:52:15,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:52:17,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:19,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:52:19,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 06:52:21,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 06:52:22,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=630840.0, ans=0.0 2023-09-30 06:52:23,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:52:26,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:26,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 06:52:26,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:28,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:52:28,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 06:52:28,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 06:52:29,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 06:52:29,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:52:29,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 06:52:31,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 06:52:34,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:36,617 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 06:52:38,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:52:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:40,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:52:44,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 06:52:44,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 06:52:44,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:45,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:52:45,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:52:45,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:52:47,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:52:50,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=630973.3333333334, ans=0.07 2023-09-30 06:52:52,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:52:52,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:52:53,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:53:00,677 INFO [train.py:1039] (3/4) Epoch 18, batch 4350, loss[loss=0.1892, simple_loss=0.2596, pruned_loss=0.05937, over 23776.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.255, pruned_loss=0.05289, over 4696280.61 frames. ], batch size: 164, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:53:00,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 06:53:02,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 06:53:05,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:10,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:12,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 06:53:12,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:53:16,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 06:53:18,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=631106.6666666666, ans=0.0 2023-09-30 06:53:20,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:53:23,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:53:23,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:53:26,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 06:53:28,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:53:29,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:53:36,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 06:53:37,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:53:39,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:44,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:53:47,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 06:53:49,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:53:50,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:53:51,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=631240.0, ans=0.1 2023-09-30 06:53:56,828 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 06:53:58,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:53:58,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 06:53:59,889 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 06:54:01,826 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 06:54:01,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:03,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:04,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:54:04,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:06,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:54:08,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:10,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 06:54:10,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:10,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:12,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:12,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 06:54:12,898 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=22.5 2023-09-30 06:54:13,652 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 06:54:13,659 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 06:54:13,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 06:54:18,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:54:18,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:54:18,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:19,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:54:21,161 INFO [train.py:1039] (3/4) Epoch 18, batch 4400, loss[loss=0.192, simple_loss=0.2722, pruned_loss=0.05589, over 24352.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2563, pruned_loss=0.05333, over 4707316.20 frames. ], batch size: 77, lr: 5.68e-03, grad_scale: 32.0 2023-09-30 06:54:21,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 06:54:22,861 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 06:54:22,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:26,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:26,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:29,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:54:32,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 06:54:32,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 06:54:32,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 06:54:33,990 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 06:54:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 06:54:34,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:54:37,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 06:54:39,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:54:40,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:40,878 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 06:54:46,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:54:46,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 06:54:47,798 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 06:54:49,102 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.515e+02 1.990e+02 2.241e+02 2.697e+02 4.171e+02, threshold=4.482e+02, percent-clipped=0.0 2023-09-30 06:54:50,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 06:54:50,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 06:54:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 06:54:52,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:54:52,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:54,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:54:55,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:54:57,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 06:54:57,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 06:54:58,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:00,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 06:55:00,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:02,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:03,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:55:03,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 06:55:05,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 06:55:07,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:12,499 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=18.87 vs. limit=22.5 2023-09-30 06:55:15,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:55:16,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 06:55:19,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:55:20,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:25,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 06:55:25,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 06:55:25,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 06:55:27,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:55:27,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 06:55:28,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:55:33,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 06:55:36,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 06:55:38,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 06:55:38,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:55:38,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 06:55:38,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.max_abs, batch_count=631640.0, ans=10.0 2023-09-30 06:55:40,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 06:55:41,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:55:43,108 INFO [train.py:1039] (3/4) Epoch 18, batch 4450, loss[loss=0.1923, simple_loss=0.2736, pruned_loss=0.05554, over 24093.00 frames. ], tot_loss[loss=0.1835, simple_loss=0.2579, pruned_loss=0.05451, over 4704141.10 frames. ], batch size: 80, lr: 5.68e-03, grad_scale: 16.0 2023-09-30 06:55:43,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 06:55:44,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.12 vs. limit=8.0 2023-09-30 06:55:48,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:55:50,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:55:50,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 06:55:51,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=631706.6666666666, ans=0.125 2023-09-30 06:55:56,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:55:56,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:56:01,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:04,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:56:05,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:56:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:07,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 06:56:07,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:08,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:10,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:10,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 06:56:13,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 06:56:17,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:19,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:21,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:56:21,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:56:24,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:56:27,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 06:56:28,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=631840.0, ans=0.2 2023-09-30 06:56:29,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 06:56:29,474 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=631840.0, ans=0.1 2023-09-30 06:56:31,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 06:56:31,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 06:56:33,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=631906.6666666666, ans=0.0 2023-09-30 06:56:34,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:34,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 06:56:39,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 06:56:42,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:42,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=631906.6666666666, ans=0.125 2023-09-30 06:56:44,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 06:56:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:56:44,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:56:44,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 06:56:44,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:56:47,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:56:50,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 06:56:51,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 06:56:53,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 06:56:57,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:56:57,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:57:00,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:00,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 06:57:03,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 06:57:05,040 INFO [train.py:1039] (3/4) Epoch 18, batch 4500, loss[loss=0.1937, simple_loss=0.2627, pruned_loss=0.06236, over 23586.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2575, pruned_loss=0.05408, over 4715437.30 frames. ], batch size: 134, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:57:05,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 06:57:08,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:57:14,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:14,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=632040.0, ans=0.0 2023-09-30 06:57:15,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 06:57:15,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 06:57:17,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:19,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=632040.0, ans=0.125 2023-09-30 06:57:20,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:57:22,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:57:22,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 06:57:23,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:57:25,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:25,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:57:33,296 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.852e+02 2.175e+02 2.491e+02 3.622e+02, threshold=4.350e+02, percent-clipped=0.0 2023-09-30 06:57:38,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:57:38,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:57:42,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:57:42,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 06:57:44,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 06:57:51,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 06:57:55,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 06:57:58,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:58:01,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 06:58:01,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 06:58:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:02,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:02,641 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.16 vs. limit=22.5 2023-09-30 06:58:03,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:05,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:58:07,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=632240.0, ans=0.125 2023-09-30 06:58:08,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:58:08,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 06:58:08,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 06:58:08,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:15,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 06:58:15,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 06:58:17,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:21,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 06:58:21,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 06:58:23,543 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.53 vs. limit=15.0 2023-09-30 06:58:24,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 06:58:25,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 06:58:25,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 06:58:26,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.84 vs. limit=15.0 2023-09-30 06:58:28,747 INFO [train.py:1039] (3/4) Epoch 18, batch 4550, loss[loss=0.183, simple_loss=0.2683, pruned_loss=0.04881, over 24359.00 frames. ], tot_loss[loss=0.1816, simple_loss=0.256, pruned_loss=0.05361, over 4710311.61 frames. ], batch size: 74, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:58:29,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 06:58:29,275 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=632373.3333333334, ans=0.125 2023-09-30 06:58:31,316 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=3.98 vs. limit=12.0 2023-09-30 06:58:32,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 06:58:32,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:32,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=632373.3333333334, ans=0.0 2023-09-30 06:58:36,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:36,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:58:40,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:45,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 06:58:47,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 06:58:48,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:58:50,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 06:58:50,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:58:53,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:58:54,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 06:58:56,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:58:57,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 06:58:59,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 06:58:59,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 06:59:00,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 06:59:03,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 06:59:04,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:05,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 06:59:07,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 06:59:12,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:12,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 06:59:15,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 06:59:18,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:21,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:21,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 06:59:24,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:25,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=632573.3333333334, ans=0.2 2023-09-30 06:59:26,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 06:59:28,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 06:59:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 06:59:28,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 06:59:33,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 06:59:33,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 06:59:33,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:34,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 06:59:34,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:36,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 06:59:36,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=632640.0, ans=0.1 2023-09-30 06:59:38,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 06:59:38,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 06:59:39,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 06:59:39,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 06:59:41,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 06:59:41,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 06:59:41,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 06:59:46,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 06:59:46,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 06:59:48,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 06:59:48,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 06:59:50,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 06:59:50,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 06:59:51,590 INFO [train.py:1039] (3/4) Epoch 18, batch 4600, loss[loss=0.1491, simple_loss=0.1958, pruned_loss=0.0512, over 19233.00 frames. ], tot_loss[loss=0.1803, simple_loss=0.2543, pruned_loss=0.05317, over 4707245.73 frames. ], batch size: 388, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 06:59:53,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 06:59:54,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 06:59:56,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:00:01,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:00:01,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:00:01,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:03,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 07:00:05,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:00:08,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:00:08,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=632773.3333333334, ans=0.125 2023-09-30 07:00:10,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:11,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:18,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 07:00:18,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:20,150 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.577e+02 1.823e+02 2.071e+02 2.370e+02 3.584e+02, threshold=4.141e+02, percent-clipped=0.0 2023-09-30 07:00:21,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:26,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:00:26,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:00:32,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 07:00:32,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:00:33,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:00:40,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:40,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:00:42,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:00:42,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=632906.6666666666, ans=0.1 2023-09-30 07:00:46,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 07:00:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:00:51,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=632906.6666666666, ans=0.125 2023-09-30 07:00:53,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:54,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:00:55,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=632906.6666666666, ans=0.0 2023-09-30 07:00:58,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:58,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 07:00:59,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:00:59,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 07:00:59,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:00:59,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:00,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=632973.3333333334, ans=0.125 2023-09-30 07:01:02,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:02,111 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:02,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:02,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=632973.3333333334, ans=0.0 2023-09-30 07:01:03,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 07:01:03,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 07:01:03,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 07:01:03,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:03,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:06,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:06,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:01:15,436 INFO [train.py:1039] (3/4) Epoch 18, batch 4650, loss[loss=0.1853, simple_loss=0.27, pruned_loss=0.05035, over 24426.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2532, pruned_loss=0.05256, over 4705088.89 frames. ], batch size: 69, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:01:17,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:01:20,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:20,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:20,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:01:20,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:01:20,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:01:22,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:01:25,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 07:01:30,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:01:32,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 07:01:33,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:01:35,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 07:01:35,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:01:35,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 07:01:35,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=633106.6666666666, ans=0.0 2023-09-30 07:01:37,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 07:01:37,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:37,131 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:01:40,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:01:42,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:42,527 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 07:01:45,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:01:47,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 07:01:50,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:01:50,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:01:52,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 07:01:52,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:01:55,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:01:58,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=13.08 vs. limit=22.5 2023-09-30 07:01:59,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:03,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:06,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:02:06,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:02:10,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 07:02:10,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 07:02:11,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 07:02:11,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 07:02:13,214 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=633240.0, ans=0.125 2023-09-30 07:02:14,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:21,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:02:21,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:21,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 07:02:21,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:22,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:22,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:02:24,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:02:25,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:02:25,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:02:26,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:02:30,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:30,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:02:30,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:02:31,115 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=633306.6666666666, ans=0.2 2023-09-30 07:02:32,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:02:32,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:02:34,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 07:02:39,279 INFO [train.py:1039] (3/4) Epoch 18, batch 4700, loss[loss=0.1803, simple_loss=0.2594, pruned_loss=0.05061, over 24019.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2544, pruned_loss=0.05257, over 4712946.67 frames. ], batch size: 80, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:02:43,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:45,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:02:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:02:49,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:02:49,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:02:54,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 07:02:54,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 07:02:58,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:02:58,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:03:00,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:03:05,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:06,746 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.845e+02 2.033e+02 2.292e+02 3.478e+02, threshold=4.067e+02, percent-clipped=0.0 2023-09-30 07:03:07,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=633440.0, ans=0.1 2023-09-30 07:03:11,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=633506.6666666666, ans=0.125 2023-09-30 07:03:13,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:03:15,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 07:03:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:03:18,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=633506.6666666666, ans=0.125 2023-09-30 07:03:24,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 07:03:25,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:03:26,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:28,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=633573.3333333334, ans=0.04949747468305833 2023-09-30 07:03:31,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 07:03:33,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:03:39,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:03:39,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 07:03:41,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:41,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:03:44,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:03:44,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 07:03:46,420 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 07:03:48,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:03:48,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:48,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 07:03:48,782 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.22 vs. limit=22.5 2023-09-30 07:03:50,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:03:54,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 07:03:57,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:04:00,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:01,474 INFO [train.py:1039] (3/4) Epoch 18, batch 4750, loss[loss=0.2004, simple_loss=0.2675, pruned_loss=0.06666, over 23802.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2551, pruned_loss=0.05259, over 4724059.37 frames. ], batch size: 164, lr: 5.67e-03, grad_scale: 16.0 2023-09-30 07:04:03,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:03,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:04:05,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 07:04:05,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:07,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=633706.6666666666, ans=0.0 2023-09-30 07:04:09,231 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.61 vs. limit=22.5 2023-09-30 07:04:09,547 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.45 vs. limit=15.0 2023-09-30 07:04:09,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 07:04:10,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:04:11,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:04:12,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:20,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 07:04:24,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:04:27,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 07:04:27,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:04:30,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,779 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:04:30,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:33,050 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 07:04:33,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 07:04:35,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=633840.0, ans=0.125 2023-09-30 07:04:38,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.min_positive, batch_count=633840.0, ans=0.025 2023-09-30 07:04:39,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 07:04:40,097 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=633840.0, ans=0.0 2023-09-30 07:04:41,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:44,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:04:46,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:04:46,098 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 07:04:46,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:04:49,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:04:50,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:04:54,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 07:04:54,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 07:04:54,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:04:54,756 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=633906.6666666666, ans=0.0 2023-09-30 07:04:55,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:04:55,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:04:58,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:04:58,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 07:04:59,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 07:05:04,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:08,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=633973.3333333334, ans=0.2 2023-09-30 07:05:09,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:05:09,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 07:05:10,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:12,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:14,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:05:14,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:16,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:05:16,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=633973.3333333334, ans=0.125 2023-09-30 07:05:20,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:20,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 07:05:22,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 07:05:23,149 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.00 vs. limit=10.0 2023-09-30 07:05:23,725 INFO [train.py:1039] (3/4) Epoch 18, batch 4800, loss[loss=0.1607, simple_loss=0.2467, pruned_loss=0.03735, over 24462.00 frames. ], tot_loss[loss=0.181, simple_loss=0.2559, pruned_loss=0.053, over 4715236.95 frames. ], batch size: 66, lr: 5.67e-03, grad_scale: 32.0 2023-09-30 07:05:23,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 07:05:24,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=634040.0, ans=0.0 2023-09-30 07:05:25,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:05:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:05:27,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 07:05:27,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.30 vs. limit=6.0 2023-09-30 07:05:33,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:33,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:37,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:05:40,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:40,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:05:43,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 07:05:43,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:05:43,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:05:46,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:05:47,869 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=634106.6666666666, ans=0.0 2023-09-30 07:05:50,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:05:51,854 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.888e+02 2.165e+02 2.522e+02 3.456e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 07:05:54,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:54,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:05:55,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:05:55,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 07:05:55,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:05:55,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:05:59,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:02,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:06:03,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:06:05,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:06:07,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:09,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 07:06:10,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 07:06:12,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:12,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:06:12,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:06:12,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:12,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:06:14,202 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:06:15,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:06:15,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:06:20,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:22,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:24,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:29,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 07:06:29,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:29,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:30,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:06:30,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:06:35,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:06:36,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:06:36,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:36,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:06:38,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:06:38,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:06:38,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=634306.6666666666, ans=0.125 2023-09-30 07:06:43,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:06:43,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:43,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:06:46,850 INFO [train.py:1039] (3/4) Epoch 18, batch 4850, loss[loss=0.1555, simple_loss=0.232, pruned_loss=0.03954, over 24593.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2566, pruned_loss=0.05368, over 4706250.27 frames. ], batch size: 60, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:06:46,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 07:06:48,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 07:06:48,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:48,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:06:50,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:06:50,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:06:52,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:07:01,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 07:07:01,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:07,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:08,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:07:08,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:11,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:07:13,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:07:16,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:07:16,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 07:07:18,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:07:21,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:07:21,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:07:23,445 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:07:23,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 07:07:25,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:07:25,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:30,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 07:07:31,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 07:07:33,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:07:35,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=634573.3333333334, ans=0.0 2023-09-30 07:07:40,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:07:41,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 07:07:41,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:07:43,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:07:43,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:07:44,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 07:07:44,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:07:46,933 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.01 vs. limit=6.0 2023-09-30 07:07:47,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 07:07:47,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:07:49,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:07:51,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 07:07:53,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=634640.0, ans=0.125 2023-09-30 07:08:01,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:08,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:08:08,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:09,755 INFO [train.py:1039] (3/4) Epoch 18, batch 4900, loss[loss=0.187, simple_loss=0.2584, pruned_loss=0.05783, over 23713.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2563, pruned_loss=0.05341, over 4715976.06 frames. ], batch size: 149, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:08:13,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 07:08:13,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:08:18,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:19,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:21,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:08:23,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 07:08:28,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 07:08:30,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=634773.3333333334, ans=0.125 2023-09-30 07:08:34,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 07:08:34,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 07:08:35,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:35,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:08:35,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:08:35,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:35,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:08:37,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 07:08:38,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.min_abs, batch_count=634773.3333333334, ans=0.5 2023-09-30 07:08:39,893 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.805e+02 1.986e+02 2.156e+02 3.448e+02, threshold=3.971e+02, percent-clipped=0.0 2023-09-30 07:08:40,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 07:08:41,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:08:43,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:08:43,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=634840.0, ans=0.2 2023-09-30 07:08:44,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:08:47,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:08:49,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:08:51,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:08:51,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 07:08:52,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:08:55,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:08:55,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 07:08:55,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 07:08:56,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.20 vs. limit=15.0 2023-09-30 07:08:58,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 07:09:00,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:09:01,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:01,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:09:03,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:03,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:09:03,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:09:04,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 07:09:07,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:09,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:09:10,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:09:14,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 07:09:16,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:09:17,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:09:17,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 07:09:24,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:25,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:09:27,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 07:09:27,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:27,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:09:28,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.15 vs. limit=8.0 2023-09-30 07:09:29,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:32,665 INFO [train.py:1039] (3/4) Epoch 18, batch 4950, loss[loss=0.201, simple_loss=0.2661, pruned_loss=0.06792, over 23825.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2545, pruned_loss=0.05268, over 4724176.95 frames. ], batch size: 179, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:09:32,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:32,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:09:32,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:09:32,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 07:09:34,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:09:37,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:38,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:09:38,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=635040.0, ans=0.125 2023-09-30 07:09:41,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 07:09:41,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 07:09:41,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:09:41,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=635040.0, ans=0.0 2023-09-30 07:09:42,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 07:09:42,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:42,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:09:44,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:09:44,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:09:48,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:09:48,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:09:49,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:09:51,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:09:52,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:09:52,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:09:56,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:10:03,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:05,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:10:05,778 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.36 vs. limit=15.0 2023-09-30 07:10:07,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:07,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=635173.3333333334, ans=0.125 2023-09-30 07:10:08,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:10:11,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 07:10:13,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 07:10:14,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:17,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:10:17,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:10:19,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:10:19,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:10:21,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:10:21,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:22,093 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=24.45 vs. limit=22.5 2023-09-30 07:10:23,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:10:25,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:10:26,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:10:28,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:28,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 07:10:28,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:10:29,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:10:30,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=635240.0, ans=0.0 2023-09-30 07:10:35,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:10:36,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:10:36,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:10:38,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:38,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:10:38,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:10:40,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:10:40,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=635306.6666666666, ans=0.2 2023-09-30 07:10:42,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:10:42,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:10:42,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=635306.6666666666, ans=0.07 2023-09-30 07:10:43,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 07:10:44,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=635306.6666666666, ans=0.2 2023-09-30 07:10:47,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:10:52,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 07:10:52,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:10:56,587 INFO [train.py:1039] (3/4) Epoch 18, batch 5000, loss[loss=0.1866, simple_loss=0.2503, pruned_loss=0.06148, over 22722.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.254, pruned_loss=0.05245, over 4711203.01 frames. ], batch size: 322, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:10:58,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:10:58,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:01,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 07:11:01,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 07:11:03,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:05,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 07:11:07,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:11:07,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:11:08,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 07:11:08,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:10,175 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:11,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 07:11:11,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:11,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:13,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 07:11:15,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 07:11:16,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:11:16,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 07:11:16,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:11:18,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:18,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:11:18,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 07:11:18,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 07:11:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 07:11:21,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:22,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:22,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 07:11:22,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:11:24,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:26,194 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.58 vs. limit=15.0 2023-09-30 07:11:26,594 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.857e+02 2.111e+02 2.507e+02 3.855e+02, threshold=4.222e+02, percent-clipped=0.0 2023-09-30 07:11:26,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:11:28,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:11:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 07:11:31,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:11:32,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:11:35,937 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 07:11:39,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:11:41,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:11:41,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:11:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 07:11:44,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:11:44,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:11:45,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:11:47,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 07:11:47,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:50,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:11:50,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=635573.3333333334, ans=0.1 2023-09-30 07:11:51,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:11:55,135 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=635573.3333333334, ans=0.07 2023-09-30 07:11:56,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 07:12:00,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=635573.3333333334, ans=0.2 2023-09-30 07:12:01,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:04,156 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=11.06 vs. limit=22.5 2023-09-30 07:12:12,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:12:13,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:15,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:12:15,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:15,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:12:15,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:12:15,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=635640.0, ans=0.125 2023-09-30 07:12:16,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:20,226 INFO [train.py:1039] (3/4) Epoch 18, batch 5050, loss[loss=0.1751, simple_loss=0.2436, pruned_loss=0.05326, over 23613.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2548, pruned_loss=0.0525, over 4706262.30 frames. ], batch size: 135, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:12:21,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:12:23,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 07:12:23,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=635706.6666666666, ans=0.0 2023-09-30 07:12:24,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:12:26,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:12:27,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:12:28,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 07:12:29,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:29,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:12:30,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=635706.6666666666, ans=0.1 2023-09-30 07:12:32,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:12:35,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:12:35,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:12:45,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 07:12:45,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:12:46,181 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.82 vs. limit=15.0 2023-09-30 07:12:47,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:12:47,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 07:12:48,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:12:49,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:50,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:12:52,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:12:52,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 07:12:52,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 07:12:54,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:58,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:12:59,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:12:59,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 07:13:01,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:04,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 07:13:05,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:13:05,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:13:07,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:08,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:13:08,905 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=635906.6666666666, ans=0.125 2023-09-30 07:13:12,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:12,453 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:13:15,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:13:16,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:16,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:13:16,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:13:16,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 07:13:17,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:13:18,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:13:19,328 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.18 vs. limit=10.0 2023-09-30 07:13:23,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:13:23,165 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 07:13:23,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:13:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:28,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:28,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 07:13:29,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:29,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 07:13:29,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:32,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=635973.3333333334, ans=0.035 2023-09-30 07:13:35,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:13:35,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:13:35,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 07:13:37,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 07:13:40,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:40,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:13:40,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:13:41,722 INFO [train.py:1039] (3/4) Epoch 18, batch 5100, loss[loss=0.1991, simple_loss=0.2653, pruned_loss=0.06643, over 22789.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.2556, pruned_loss=0.05264, over 4709062.47 frames. ], batch size: 322, lr: 5.66e-03, grad_scale: 16.0 2023-09-30 07:13:43,443 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 07:13:44,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:13:50,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 07:13:50,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 07:13:51,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:13:51,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=636040.0, ans=0.0 2023-09-30 07:13:53,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:13:54,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:13:56,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 07:13:56,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 07:14:01,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:14:01,340 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:14:06,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:14:08,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 07:14:10,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:10,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:14:10,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 07:14:11,816 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.826e+02 2.022e+02 2.261e+02 3.082e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 07:14:14,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,295 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:16,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 07:14:17,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 07:14:19,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:19,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 07:14:19,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 07:14:24,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:14:32,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:14:36,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 07:14:36,152 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 07:14:36,164 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 07:14:37,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 07:14:37,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:14:39,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 07:14:43,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=636240.0, ans=0.0 2023-09-30 07:14:44,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 07:14:47,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 07:14:49,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:14:52,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 07:14:52,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=636306.6666666666, ans=0.1 2023-09-30 07:14:52,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=636306.6666666666, ans=0.0 2023-09-30 07:14:53,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:14:54,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 07:15:00,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:15:00,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:15:00,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:15:02,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:15:02,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:15:02,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:15:02,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=636373.3333333334, ans=0.0 2023-09-30 07:15:03,687 INFO [train.py:1039] (3/4) Epoch 18, batch 5150, loss[loss=0.1622, simple_loss=0.2356, pruned_loss=0.04437, over 24329.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2568, pruned_loss=0.05301, over 4705152.34 frames. ], batch size: 56, lr: 5.66e-03, grad_scale: 8.0 2023-09-30 07:15:03,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 07:15:03,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 07:15:05,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 07:15:05,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:15:06,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 07:15:10,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:10,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:15:10,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=636373.3333333334, ans=0.0 2023-09-30 07:15:12,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:14,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:15:17,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:15:17,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 07:15:20,462 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.55 vs. limit=22.5 2023-09-30 07:15:21,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:15:21,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:15:22,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:15:22,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:22,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:22,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:15:22,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:15:24,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 07:15:24,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=636440.0, ans=15.0 2023-09-30 07:15:25,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:15:27,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:15:30,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:15:32,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 07:15:34,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:15:39,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:15:40,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 07:15:41,037 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=636506.6666666666, ans=0.125 2023-09-30 07:15:47,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:15:54,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:15:56,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:00,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:00,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:02,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=636573.3333333334, ans=0.125 2023-09-30 07:16:03,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 07:16:06,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:16:08,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:16:08,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:16:12,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:13,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:16:13,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=636640.0, ans=0.125 2023-09-30 07:16:15,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 07:16:18,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:16:20,506 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:16:23,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:16:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:16:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:16:24,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:16:24,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:16:24,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:16:27,166 INFO [train.py:1039] (3/4) Epoch 18, batch 5200, loss[loss=0.1707, simple_loss=0.254, pruned_loss=0.04372, over 24486.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2565, pruned_loss=0.05226, over 4726533.74 frames. ], batch size: 66, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:16:28,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:16:30,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:16:35,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:40,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 07:16:41,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:16:42,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:44,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:16:46,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:16:46,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:16:47,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=636773.3333333334, ans=0.2 2023-09-30 07:16:48,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 07:16:51,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:16:51,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:16:55,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 07:16:58,250 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.839e+02 2.029e+02 2.263e+02 2.861e+02, threshold=4.059e+02, percent-clipped=0.0 2023-09-30 07:16:58,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:16:59,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:17:00,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 07:17:00,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=636840.0, ans=0.0 2023-09-30 07:17:01,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 07:17:03,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 07:17:03,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:03,209 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 07:17:03,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:17:03,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=636840.0, ans=0.125 2023-09-30 07:17:04,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:06,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:17:06,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 07:17:07,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.62 vs. limit=22.5 2023-09-30 07:17:07,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:09,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:14,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 07:17:14,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 07:17:14,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 07:17:19,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 07:17:19,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:17:25,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:17:27,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:28,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 07:17:28,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:17:30,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:17:30,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:30,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:17:34,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:36,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:17:38,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=636973.3333333334, ans=0.0 2023-09-30 07:17:39,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:17:39,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:39,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:45,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:17:46,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 07:17:46,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=636973.3333333334, ans=0.1 2023-09-30 07:17:47,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:17:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:17:49,441 INFO [train.py:1039] (3/4) Epoch 18, batch 5250, loss[loss=0.1851, simple_loss=0.2509, pruned_loss=0.05967, over 23741.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2551, pruned_loss=0.05194, over 4724834.90 frames. ], batch size: 212, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:17:49,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:17:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:17:51,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:17:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:17:57,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:17:59,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:17:59,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=637040.0, ans=0.125 2023-09-30 07:18:01,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:18:06,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:18:08,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:18:09,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:18:11,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:18:13,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 07:18:13,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:18:13,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=637106.6666666666, ans=0.1 2023-09-30 07:18:14,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:18:26,347 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=637173.3333333334, ans=0.04949747468305833 2023-09-30 07:18:29,114 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=637173.3333333334, ans=0.125 2023-09-30 07:18:30,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=637173.3333333334, ans=0.125 2023-09-30 07:18:36,076 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=637240.0, ans=0.125 2023-09-30 07:18:38,713 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=637240.0, ans=0.125 2023-09-30 07:18:44,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=637240.0, ans=0.125 2023-09-30 07:18:45,644 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=637240.0, ans=0.125 2023-09-30 07:19:04,850 INFO [train.py:1039] (3/4) Epoch 18, batch 5300, loss[loss=0.1642, simple_loss=0.2221, pruned_loss=0.05316, over 23415.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2542, pruned_loss=0.05219, over 4712880.07 frames. ], batch size: 285, lr: 5.65e-03, grad_scale: 16.0 2023-09-30 07:19:14,045 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.23 vs. limit=6.0 2023-09-30 07:19:20,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:19:20,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 07:19:20,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 07:19:20,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:21,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:21,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:21,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:19:21,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:21,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:19:22,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:19:22,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 07:19:22,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 07:19:22,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 07:19:23,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:19:23,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 07:19:23,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 07:19:23,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:24,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:24,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:24,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:24,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:19:24,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:24,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:19:25,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:25,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:19:25,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:19:25,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:19:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:25,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:19:26,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 07:19:26,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:19:27,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:19:27,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 07:19:27,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 07:19:27,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:19:27,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:19:27,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 07:19:27,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 07:19:27,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:19:28,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:19:28,855 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 07:19:28,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 07:19:29,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:19:29,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:19:29,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 07:19:29,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 07:19:29,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 07:19:29,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:19:38,651 INFO [train.py:1039] (3/4) Epoch 19, batch 0, loss[loss=0.2052, simple_loss=0.2795, pruned_loss=0.06548, over 24036.00 frames. ], tot_loss[loss=0.2052, simple_loss=0.2795, pruned_loss=0.06548, over 24036.00 frames. ], batch size: 86, lr: 5.50e-03, grad_scale: 32.0 2023-09-30 07:19:38,651 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 07:19:52,807 INFO [train.py:1071] (3/4) Epoch 19, validation: loss=0.3241, simple_loss=0.2677, pruned_loss=0.1902, over 1125622.00 frames. 2023-09-30 07:19:52,808 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 07:19:54,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=637460.0, ans=0.125 2023-09-30 07:19:55,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 07:19:55,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:19:58,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:20:01,911 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.881e+02 2.156e+02 2.381e+02 5.566e+02, threshold=4.312e+02, percent-clipped=3.0 2023-09-30 07:20:06,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:06,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:20:06,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:07,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 07:20:09,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 07:20:12,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:12,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:17,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:20:19,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:19,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:20:19,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:20,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 07:20:20,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=637526.6666666666, ans=0.0 2023-09-30 07:20:22,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:20:22,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=637593.3333333334, ans=0.0 2023-09-30 07:20:31,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:20:31,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:20:34,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 07:20:39,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:20:39,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:20:41,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:45,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:20:48,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:20:49,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=637660.0, ans=0.125 2023-09-30 07:20:54,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=637660.0, ans=0.2 2023-09-30 07:20:55,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 07:20:59,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 07:20:59,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:20:59,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:01,166 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.74 vs. limit=12.0 2023-09-30 07:21:01,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:21:01,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:04,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 07:21:05,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:08,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:21:12,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:21:13,703 INFO [train.py:1039] (3/4) Epoch 19, batch 50, loss[loss=0.1991, simple_loss=0.2852, pruned_loss=0.05648, over 24546.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2565, pruned_loss=0.05175, over 1066350.34 frames. ], batch size: 71, lr: 5.50e-03, grad_scale: 16.0 2023-09-30 07:21:15,514 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 07:21:15,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=637793.3333333334, ans=0.2 2023-09-30 07:21:17,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:21:21,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:23,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:21:23,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 07:21:23,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:21:24,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:21:27,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:28,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:21:30,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:21:35,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 07:21:35,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:42,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:21:45,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 07:21:45,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=637926.6666666666, ans=0.0 2023-09-30 07:21:47,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 07:21:48,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:21:48,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:21:48,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:50,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:21:51,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:21:51,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:21:51,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:21:54,288 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.64 vs. limit=15.0 2023-09-30 07:22:01,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:01,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:01,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:22:03,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 07:22:04,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:22:06,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:22:06,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 07:22:07,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:10,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 07:22:10,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=637993.3333333334, ans=0.025 2023-09-30 07:22:13,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=637993.3333333334, ans=0.125 2023-09-30 07:22:16,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:22:16,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:22:16,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:19,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:19,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:22,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 07:22:22,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 07:22:22,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=638060.0, ans=0.0 2023-09-30 07:22:23,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:25,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:22:26,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:22:28,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:22:29,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 07:22:29,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 07:22:30,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 07:22:32,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:33,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:22:34,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 07:22:34,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 07:22:36,019 INFO [train.py:1039] (3/4) Epoch 19, batch 100, loss[loss=0.1702, simple_loss=0.2537, pruned_loss=0.04336, over 24674.00 frames. ], tot_loss[loss=0.182, simple_loss=0.2586, pruned_loss=0.05271, over 1873093.14 frames. ], batch size: 65, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:22:36,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:22:37,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:22:39,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:22:42,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:22:45,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:22:47,048 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.406e+02 1.850e+02 1.971e+02 2.245e+02 4.662e+02, threshold=3.942e+02, percent-clipped=2.0 2023-09-30 07:22:47,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=638126.6666666666, ans=0.1 2023-09-30 07:22:50,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:22:50,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 07:22:50,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:22:56,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:22:56,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:58,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:22:58,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:22:58,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:22:59,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 07:23:02,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:23:02,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:02,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:23:06,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 07:23:06,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:06,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=638260.0, ans=0.125 2023-09-30 07:23:08,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:09,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:23:12,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:23:15,145 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 07:23:15,177 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 07:23:16,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:16,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:23:21,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:23:23,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:23:26,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:30,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:31,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 07:23:33,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:23:35,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:23:37,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:23:39,194 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=638326.6666666666, ans=0.125 2023-09-30 07:23:40,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:40,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=638393.3333333334, ans=0.125 2023-09-30 07:23:44,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:47,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:23:47,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:23:47,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=638393.3333333334, ans=0.0 2023-09-30 07:23:50,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:51,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:23:53,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:53,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:23:55,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:23:55,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 07:23:55,307 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 07:23:55,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:23:56,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:23:56,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:56,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:23:58,208 INFO [train.py:1039] (3/4) Epoch 19, batch 150, loss[loss=0.1759, simple_loss=0.2593, pruned_loss=0.04626, over 24444.00 frames. ], tot_loss[loss=0.1826, simple_loss=0.2585, pruned_loss=0.05332, over 2500502.41 frames. ], batch size: 69, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:23:58,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 07:23:58,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:23:58,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:23:58,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:23:59,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:01,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:01,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:24:03,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:24:06,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:07,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=4.13 vs. limit=12.0 2023-09-30 07:24:11,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:24:11,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:12,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:14,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:15,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:17,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:24:18,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:21,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff3_skip_rate, batch_count=638526.6666666666, ans=0.0 2023-09-30 07:24:22,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 07:24:22,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 07:24:22,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 07:24:23,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=638526.6666666666, ans=0.125 2023-09-30 07:24:25,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:24:25,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:24:27,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:24:29,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:24:29,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:29,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:29,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:24:32,359 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 07:24:33,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:24:40,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:45,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:24:45,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 07:24:50,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:24:50,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:24:50,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:24:53,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:24:53,531 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=638660.0, ans=0.125 2023-09-30 07:24:54,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:24:54,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:24:57,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:24:57,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 07:24:59,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=638660.0, ans=0.0 2023-09-30 07:25:01,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:01,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer_ff2.min_abs, batch_count=638660.0, ans=0.1 2023-09-30 07:25:03,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:03,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:25:03,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:25:06,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:06,354 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=638726.6666666666, ans=0.0 2023-09-30 07:25:09,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 07:25:11,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:25:13,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:25:14,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:16,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:25:16,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 07:25:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:25:16,224 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 07:25:18,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=638726.6666666666, ans=0.2 2023-09-30 07:25:18,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=638726.6666666666, ans=0.1 2023-09-30 07:25:19,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:21,391 INFO [train.py:1039] (3/4) Epoch 19, batch 200, loss[loss=0.1509, simple_loss=0.2283, pruned_loss=0.03675, over 24315.00 frames. ], tot_loss[loss=0.1821, simple_loss=0.2576, pruned_loss=0.05334, over 2992972.19 frames. ], batch size: 56, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:25:24,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:25:24,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:25:27,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.68 vs. limit=15.0 2023-09-30 07:25:28,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 07:25:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:25:29,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:30,236 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=638793.3333333334, ans=0.125 2023-09-30 07:25:31,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=638793.3333333334, ans=0.1 2023-09-30 07:25:32,715 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.866e+02 2.060e+02 2.341e+02 3.608e+02, threshold=4.119e+02, percent-clipped=0.0 2023-09-30 07:25:33,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 07:25:34,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:25:36,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:38,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:25:40,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:25:40,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:25:40,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:25:43,646 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=638860.0, ans=0.125 2023-09-30 07:25:43,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=638860.0, ans=0.04949747468305833 2023-09-30 07:25:58,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:26:00,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:26:00,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:26:00,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:26:02,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:26:02,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:26:03,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:03,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:26:05,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:05,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:08,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 07:26:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 07:26:08,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:10,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=638993.3333333334, ans=0.0 2023-09-30 07:26:13,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:26:18,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:26:25,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:25,573 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=638993.3333333334, ans=0.0 2023-09-30 07:26:26,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:26:31,009 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.89 vs. limit=10.0 2023-09-30 07:26:33,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:35,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 07:26:36,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:36,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:26:36,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:26:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:26:41,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 07:26:42,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:26:42,607 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 07:26:43,979 INFO [train.py:1039] (3/4) Epoch 19, batch 250, loss[loss=0.1798, simple_loss=0.2601, pruned_loss=0.04977, over 24655.00 frames. ], tot_loss[loss=0.1828, simple_loss=0.2577, pruned_loss=0.05392, over 3363592.05 frames. ], batch size: 68, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:26:45,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:47,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:26:49,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=639126.6666666666, ans=0.125 2023-09-30 07:26:50,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:50,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:26:54,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:26:54,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:26:57,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:27:00,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:04,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=639193.3333333334, ans=0.125 2023-09-30 07:27:12,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:13,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:27:15,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:27:18,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:27:20,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:27:21,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:27:21,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:23,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:27:25,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:27:27,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:27:30,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:27:33,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 07:27:33,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:27:35,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:27:35,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:27:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:27:36,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:27:37,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:27:37,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:27:40,062 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:41,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:27:43,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:46,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:27:47,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=639326.6666666666, ans=0.0 2023-09-30 07:27:51,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:27:54,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:27:57,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:27:57,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=639393.3333333334, ans=0.0 2023-09-30 07:27:59,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:28:03,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 07:28:03,992 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=639393.3333333334, ans=0.0 2023-09-30 07:28:05,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:05,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:28:05,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 07:28:05,607 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=639460.0, ans=0.125 2023-09-30 07:28:07,409 INFO [train.py:1039] (3/4) Epoch 19, batch 300, loss[loss=0.1861, simple_loss=0.2681, pruned_loss=0.05206, over 23381.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.256, pruned_loss=0.05251, over 3679600.83 frames. ], batch size: 93, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:28:07,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:28:09,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:28:09,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 07:28:12,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:28:12,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=639460.0, ans=0.1 2023-09-30 07:28:13,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:28:13,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=639460.0, ans=0.125 2023-09-30 07:28:17,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:28:18,699 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.819e+02 2.024e+02 2.204e+02 2.893e+02, threshold=4.048e+02, percent-clipped=0.0 2023-09-30 07:28:18,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 07:28:20,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:28:21,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:28:21,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 07:28:21,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:23,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=639526.6666666666, ans=10.0 2023-09-30 07:28:26,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:28:32,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:28:32,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 07:28:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 07:28:37,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:39,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=639593.3333333334, ans=0.125 2023-09-30 07:28:41,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:28:42,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:42,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 07:28:42,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:28:44,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:28:46,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:28:47,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:28:52,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 07:28:52,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 07:28:53,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:28:54,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=639660.0, ans=0.125 2023-09-30 07:28:56,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:28:56,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 07:28:57,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:02,434 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:29:06,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:29:06,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 07:29:08,307 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=639660.0, ans=0.0 2023-09-30 07:29:11,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:11,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:29:14,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:17,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:29:17,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 07:29:17,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:29:18,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:19,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 07:29:22,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:29:22,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:23,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:23,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:25,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:25,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=639726.6666666666, ans=0.125 2023-09-30 07:29:25,577 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=639726.6666666666, ans=0.1 2023-09-30 07:29:28,608 INFO [train.py:1039] (3/4) Epoch 19, batch 350, loss[loss=0.18, simple_loss=0.2432, pruned_loss=0.05835, over 23720.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2539, pruned_loss=0.05176, over 3904599.94 frames. ], batch size: 164, lr: 5.49e-03, grad_scale: 16.0 2023-09-30 07:29:29,648 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.90 vs. limit=10.0 2023-09-30 07:29:30,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:30,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 07:29:33,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:36,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=639793.3333333334, ans=0.0 2023-09-30 07:29:40,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:29:41,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:29:43,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:46,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 07:29:47,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:29:48,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 07:29:50,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:29:50,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=639860.0, ans=0.125 2023-09-30 07:29:52,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 07:29:53,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:29:55,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 07:29:58,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:29:59,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:30:01,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:30:01,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:01,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:03,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:03,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:03,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:30:06,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:06,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:14,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:30:14,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:30:15,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:30:17,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:26,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 07:30:26,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:30:30,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:30:30,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:30,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:30:30,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=639993.3333333334, ans=0.0 2023-09-30 07:30:33,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 07:30:35,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:36,566 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 07:30:36,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 07:30:36,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:39,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:30:40,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 07:30:40,290 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=640060.0, ans=0.0 2023-09-30 07:30:43,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:48,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:30:48,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:30:50,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:30:50,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:52,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:30:55,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2.whitening_limit, batch_count=640126.6666666666, ans=15.0 2023-09-30 07:30:55,838 INFO [train.py:1039] (3/4) Epoch 19, batch 400, loss[loss=0.1841, simple_loss=0.2578, pruned_loss=0.05518, over 23865.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.254, pruned_loss=0.05151, over 4094255.67 frames. ], batch size: 195, lr: 5.49e-03, grad_scale: 32.0 2023-09-30 07:30:56,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:30:59,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:30:59,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 07:30:59,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:00,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:03,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:31:03,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:06,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:06,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:07,646 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.862e+02 2.041e+02 2.218e+02 3.370e+02, threshold=4.083e+02, percent-clipped=0.0 2023-09-30 07:31:07,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 07:31:10,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 07:31:10,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:13,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 07:31:13,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:17,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:31:17,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:17,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 07:31:19,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:31:19,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:31:19,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:20,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:31:22,303 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 07:31:25,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 07:31:30,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:31:31,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:31:32,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 07:31:32,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 07:31:35,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:31:37,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:31:46,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 07:31:47,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:31:49,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 07:31:51,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:31:53,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:31:54,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 07:31:59,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:32:03,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:32:03,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=640393.3333333334, ans=0.0 2023-09-30 07:32:05,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:32:06,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:08,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 07:32:11,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:32:11,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 07:32:12,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:32:12,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:32:14,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=640393.3333333334, ans=0.0 2023-09-30 07:32:15,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 07:32:19,081 INFO [train.py:1039] (3/4) Epoch 19, batch 450, loss[loss=0.2264, simple_loss=0.2826, pruned_loss=0.08507, over 19449.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2544, pruned_loss=0.05214, over 4221989.07 frames. ], batch size: 388, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:32:19,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:32:20,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:32:20,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:32:22,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 07:32:22,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:32:23,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:32:25,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:32:25,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 07:32:25,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:32:26,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:32:30,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:32:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:39,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:32:40,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 07:32:42,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 07:32:46,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:32:49,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:32:51,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:32:55,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:55,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:32:58,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 07:32:58,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 07:33:02,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 07:33:03,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:03,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=640593.3333333334, ans=0.0 2023-09-30 07:33:04,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:05,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=640593.3333333334, ans=0.125 2023-09-30 07:33:06,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:33:06,612 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 07:33:06,626 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 07:33:06,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=640660.0, ans=0.07 2023-09-30 07:33:08,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:33:10,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:33:11,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:33:14,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:33:14,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:33:14,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:33:16,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 07:33:17,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:18,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=640660.0, ans=0.0 2023-09-30 07:33:19,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=640660.0, ans=0.0 2023-09-30 07:33:20,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:33:20,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:33:22,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 07:33:25,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:33:25,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=640726.6666666666, ans=0.2 2023-09-30 07:33:27,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 07:33:29,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 07:33:29,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 07:33:34,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:33:36,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:33:37,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:33:39,161 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 07:33:40,604 INFO [train.py:1039] (3/4) Epoch 19, batch 500, loss[loss=0.2084, simple_loss=0.2736, pruned_loss=0.07158, over 23710.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2557, pruned_loss=0.05267, over 4326964.91 frames. ], batch size: 179, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:33:44,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:33:44,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:33:46,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:46,748 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 07:33:47,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.74 vs. limit=15.0 2023-09-30 07:33:49,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 07:33:49,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:33:52,506 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.803e+02 2.032e+02 2.368e+02 3.527e+02, threshold=4.065e+02, percent-clipped=0.0 2023-09-30 07:33:52,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:33:57,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 07:33:58,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:34:00,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:34:00,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:34:00,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:12,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:12,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:34:14,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:34:14,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:14,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 07:34:14,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.min_abs, batch_count=640926.6666666666, ans=0.5 2023-09-30 07:34:15,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 07:34:17,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:34:19,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:34:20,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:34:20,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:34:22,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 07:34:25,943 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 07:34:30,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:30,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:31,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:32,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:34:35,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 07:34:38,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:34:39,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:34:46,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:34:53,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:34:53,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=641060.0, ans=0.035 2023-09-30 07:34:57,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 07:34:57,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:34:57,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:00,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 07:35:00,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:35:00,728 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=641060.0, ans=0.0 2023-09-30 07:35:03,221 INFO [train.py:1039] (3/4) Epoch 19, batch 550, loss[loss=0.1884, simple_loss=0.2777, pruned_loss=0.04952, over 24329.00 frames. ], tot_loss[loss=0.18, simple_loss=0.256, pruned_loss=0.052, over 4421566.65 frames. ], batch size: 74, lr: 5.48e-03, grad_scale: 32.0 2023-09-30 07:35:03,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:07,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 07:35:09,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 07:35:09,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:09,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 07:35:09,915 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:35:11,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:35:11,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:35:11,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:12,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:35:13,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=641126.6666666666, ans=0.0 2023-09-30 07:35:14,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:35:17,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:35:18,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 07:35:18,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:35:18,462 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=641193.3333333334, ans=0.125 2023-09-30 07:35:24,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:24,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:26,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:28,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:33,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 07:35:33,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 07:35:33,627 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=641193.3333333334, ans=0.125 2023-09-30 07:35:36,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:35:42,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:35:42,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:43,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:35:44,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=641260.0, ans=0.125 2023-09-30 07:35:47,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:47,157 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 07:35:48,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:35:50,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:35:52,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:35:53,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 07:35:53,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:35:55,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:35:56,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 07:35:57,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 07:35:59,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:35:59,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:35:59,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:35:59,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:36:05,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:36:06,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:36:07,258 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.33 vs. limit=22.5 2023-09-30 07:36:08,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:36:09,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:09,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 07:36:12,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:36:12,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:14,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:36:14,406 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:15,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:36:15,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 07:36:22,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 07:36:24,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 07:36:25,846 INFO [train.py:1039] (3/4) Epoch 19, batch 600, loss[loss=0.166, simple_loss=0.2508, pruned_loss=0.04057, over 24317.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2564, pruned_loss=0.05165, over 4494318.27 frames. ], batch size: 74, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:36:26,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:36:27,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:36:27,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:36:34,115 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.85 vs. limit=15.0 2023-09-30 07:36:34,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=641460.0, ans=0.1 2023-09-30 07:36:36,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:36:37,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:36:39,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.814e+02 2.073e+02 2.344e+02 3.797e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 07:36:39,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 07:36:41,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:36:42,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:36:45,816 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:47,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 07:36:47,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:36:55,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 07:36:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:36:58,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:36:58,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:36:59,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=16.79 vs. limit=15.0 2023-09-30 07:37:00,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=641593.3333333334, ans=0.125 2023-09-30 07:37:04,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:37:04,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:37:05,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:11,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:37:16,790 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:37:17,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:37:17,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:37:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:37:25,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 07:37:30,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:37:31,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:37:36,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 07:37:36,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:37:40,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 07:37:40,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:37:40,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:37:42,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=641726.6666666666, ans=0.2 2023-09-30 07:37:46,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 07:37:48,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 07:37:49,776 INFO [train.py:1039] (3/4) Epoch 19, batch 650, loss[loss=0.1626, simple_loss=0.2356, pruned_loss=0.04478, over 24489.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2547, pruned_loss=0.05151, over 4538013.82 frames. ], batch size: 58, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:37:49,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:37:51,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:37:53,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:37:56,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 07:37:57,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:38:03,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:38:03,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:04,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=641860.0, ans=0.2 2023-09-30 07:38:08,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:14,749 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 07:38:17,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:17,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:20,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:20,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:38:23,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:23,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:24,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:38:24,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:26,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:38:27,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:38:27,978 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 07:38:27,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:29,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:29,750 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=641926.6666666666, ans=0.0 2023-09-30 07:38:32,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:32,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:34,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:34,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:38:35,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 07:38:37,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:38:37,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:38:37,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:38:37,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:38:39,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:38:41,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 07:38:43,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 07:38:43,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:43,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:38:44,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:38:44,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:38:46,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:38:51,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:38:53,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:38:54,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:38:59,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:38:59,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:39:00,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:39:08,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:39:08,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:08,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:08,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:39:11,448 INFO [train.py:1039] (3/4) Epoch 19, batch 700, loss[loss=0.1909, simple_loss=0.2758, pruned_loss=0.05302, over 24357.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.253, pruned_loss=0.05105, over 4583542.95 frames. ], batch size: 74, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:39:14,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 07:39:16,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 07:39:19,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 07:39:20,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:22,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:39:23,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 07:39:25,600 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.802e+02 1.961e+02 2.175e+02 2.904e+02, threshold=3.922e+02, percent-clipped=0.0 2023-09-30 07:39:27,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:39:29,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:39:30,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:32,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:39:33,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:39:36,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:39:39,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:39:39,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:39:41,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 07:39:43,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=642260.0, ans=0.0 2023-09-30 07:39:44,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 07:39:47,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:39:47,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:39:50,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:39:55,272 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.58 vs. limit=15.0 2023-09-30 07:39:55,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:39:56,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=642260.0, ans=0.0 2023-09-30 07:39:57,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 07:39:59,027 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=642260.0, ans=0.0 2023-09-30 07:40:03,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:03,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:40:04,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 07:40:09,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:40:10,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:14,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:40:17,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:40:18,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 07:40:21,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 07:40:23,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 07:40:23,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=642393.3333333334, ans=0.1 2023-09-30 07:40:24,112 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.69 vs. limit=12.0 2023-09-30 07:40:27,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:29,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:29,825 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:40:33,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:33,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 07:40:34,991 INFO [train.py:1039] (3/4) Epoch 19, batch 750, loss[loss=0.1908, simple_loss=0.2692, pruned_loss=0.05621, over 23958.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2521, pruned_loss=0.05078, over 4613910.71 frames. ], batch size: 80, lr: 5.48e-03, grad_scale: 16.0 2023-09-30 07:40:38,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 07:40:38,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 07:40:39,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 07:40:41,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 07:40:41,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 07:40:41,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:40:42,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 07:40:44,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:40:44,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:40:45,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:46,660 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=8.37 vs. limit=15.0 2023-09-30 07:40:47,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:47,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=642460.0, ans=0.125 2023-09-30 07:40:48,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:40:48,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:40:50,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:40:50,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:40:52,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=642526.6666666666, ans=0.125 2023-09-30 07:40:53,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:40:55,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:40:55,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:40:56,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 07:40:58,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:40:58,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:40:59,116 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=642526.6666666666, ans=0.07 2023-09-30 07:41:00,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:41:04,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:41:04,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 07:41:04,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:07,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 07:41:07,782 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 07:41:08,757 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.22 vs. limit=15.0 2023-09-30 07:41:09,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 07:41:09,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:41:09,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 07:41:11,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:41:19,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:41:19,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:19,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:41:20,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:41:23,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:41:23,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 07:41:23,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:41:25,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 07:41:26,042 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=642660.0, ans=0.0 2023-09-30 07:41:27,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:41:30,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:41:30,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 07:41:30,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:37,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:41:39,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:41:41,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:41:43,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:41:44,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=642726.6666666666, ans=0.125 2023-09-30 07:41:47,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 07:41:47,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:41:49,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:41:53,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:41:55,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:41:55,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=642793.3333333334, ans=0.2 2023-09-30 07:41:56,734 INFO [train.py:1039] (3/4) Epoch 19, batch 800, loss[loss=0.1861, simple_loss=0.2637, pruned_loss=0.05421, over 23378.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2531, pruned_loss=0.05101, over 4640947.14 frames. ], batch size: 93, lr: 5.47e-03, grad_scale: 32.0 2023-09-30 07:41:56,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:41:57,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.45 vs. limit=22.5 2023-09-30 07:42:03,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:03,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:04,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:42:04,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:07,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:07,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:09,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:10,438 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.608e+02 1.847e+02 2.108e+02 2.482e+02 4.355e+02, threshold=4.217e+02, percent-clipped=1.0 2023-09-30 07:42:14,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:14,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:42:19,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 07:42:19,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:20,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:42:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:42:22,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:22,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 07:42:22,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:23,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 07:42:27,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:28,011 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.66 vs. limit=15.0 2023-09-30 07:42:30,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:42:33,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:42:33,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:42:34,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:34,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:42:40,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:42:40,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:42:42,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 07:42:43,048 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 07:42:43,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 07:42:44,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:42:44,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:42:44,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=642993.3333333334, ans=0.1 2023-09-30 07:42:46,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:42:46,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:42:52,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 07:42:52,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 07:42:55,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:42:56,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 07:43:01,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:43:04,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:04,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 07:43:06,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:43:07,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 07:43:08,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=643060.0, ans=0.1 2023-09-30 07:43:14,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:14,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=643060.0, ans=0.1 2023-09-30 07:43:17,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:43:19,049 INFO [train.py:1039] (3/4) Epoch 19, batch 850, loss[loss=0.1536, simple_loss=0.2261, pruned_loss=0.04051, over 24430.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2541, pruned_loss=0.05166, over 4652045.42 frames. ], batch size: 58, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:43:19,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 07:43:19,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:43:19,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:20,429 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=643126.6666666666, ans=0.125 2023-09-30 07:43:21,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 07:43:21,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:24,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:43:26,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:26,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:43:28,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:43:29,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 07:43:29,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 07:43:29,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 07:43:32,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:43:32,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:43:34,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:43:34,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:43:35,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:43:39,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:40,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:43:40,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 07:43:43,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 07:43:44,173 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=643193.3333333334, ans=0.125 2023-09-30 07:43:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:43:48,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 07:43:54,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 07:43:55,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 07:43:57,822 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 07:43:57,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:43:57,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:43:59,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 07:44:01,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:02,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 07:44:05,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:44:05,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:07,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:44:08,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:44:10,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:44:11,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 07:44:13,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 07:44:17,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:44:17,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:19,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:44:19,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:19,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:44:23,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:44:24,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 07:44:26,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:44:27,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:28,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:44:32,047 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=643393.3333333334, ans=0.125 2023-09-30 07:44:36,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 07:44:38,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:44:38,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 07:44:38,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:39,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:44:41,345 INFO [train.py:1039] (3/4) Epoch 19, batch 900, loss[loss=0.1794, simple_loss=0.2644, pruned_loss=0.04726, over 24241.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2552, pruned_loss=0.05228, over 4670564.13 frames. ], batch size: 74, lr: 5.47e-03, grad_scale: 16.0 2023-09-30 07:44:42,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 07:44:48,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:44:50,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:44:52,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 07:44:53,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:44:55,223 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.916e+02 2.182e+02 2.478e+02 5.058e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 07:44:55,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 07:44:55,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 07:44:56,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:44:57,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:44:58,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 07:44:59,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:45:04,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=643526.6666666666, ans=0.1 2023-09-30 07:45:10,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten.whitening_limit, batch_count=643526.6666666666, ans=22.5 2023-09-30 07:45:10,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:10,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:45:11,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:45:14,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:14,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=643593.3333333334, ans=0.125 2023-09-30 07:45:17,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=643593.3333333334, ans=0.125 2023-09-30 07:45:18,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 07:45:20,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:45:24,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:45:26,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 07:45:26,589 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 07:45:26,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 07:45:30,698 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.42 vs. limit=12.0 2023-09-30 07:45:34,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 07:45:34,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:45:34,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:45:34,559 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=643660.0, ans=0.125 2023-09-30 07:45:34,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=13.59 vs. limit=15.0 2023-09-30 07:45:35,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=643660.0, ans=0.125 2023-09-30 07:45:41,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:41,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:45:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 07:45:44,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:45:48,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 07:45:50,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=8.86 vs. limit=15.0 2023-09-30 07:45:50,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:45:50,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:45:51,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:45:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:45:56,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 07:45:56,741 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 07:45:58,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 07:45:59,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 07:46:01,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:02,753 INFO [train.py:1039] (3/4) Epoch 19, batch 950, loss[loss=0.1853, simple_loss=0.2722, pruned_loss=0.04917, over 24330.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2551, pruned_loss=0.05216, over 4688111.01 frames. ], batch size: 74, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:46:04,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 07:46:10,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:13,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:13,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:15,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:46:17,594 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 07:46:22,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:46:23,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:23,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:24,483 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.63 vs. limit=22.5 2023-09-30 07:46:25,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:46:25,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 07:46:26,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:46:28,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:28,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 07:46:30,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:33,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:33,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:46:33,516 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=643860.0, ans=0.0 2023-09-30 07:46:34,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:46:34,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 07:46:37,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 07:46:38,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=643926.6666666666, ans=0.125 2023-09-30 07:46:39,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:46:41,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=643926.6666666666, ans=0.125 2023-09-30 07:46:43,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:46:50,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:46:50,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:46:53,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 07:46:53,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=643993.3333333334, ans=0.125 2023-09-30 07:46:57,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:46:57,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 07:46:58,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:46:58,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:46:58,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:47:02,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 07:47:03,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:47:05,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:06,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:06,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 07:47:06,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:06,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:47:08,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 07:47:12,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:47:15,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=21.47 vs. limit=22.5 2023-09-30 07:47:15,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:47:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:23,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 07:47:23,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 07:47:26,777 INFO [train.py:1039] (3/4) Epoch 19, batch 1000, loss[loss=0.1712, simple_loss=0.2425, pruned_loss=0.04994, over 23363.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2542, pruned_loss=0.0518, over 4704808.28 frames. ], batch size: 119, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:47:26,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:47:27,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=644126.6666666666, ans=0.2 2023-09-30 07:47:30,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 07:47:32,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:47:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:47:37,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=6.19 vs. limit=12.0 2023-09-30 07:47:38,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 07:47:38,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 07:47:40,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=644126.6666666666, ans=0.0 2023-09-30 07:47:42,700 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.497e+02 2.138e+02 2.514e+02 3.202e+02 5.752e+02, threshold=5.028e+02, percent-clipped=6.0 2023-09-30 07:47:43,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:47:43,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=644193.3333333334, ans=0.125 2023-09-30 07:47:44,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:47:45,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:47:47,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 07:47:51,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 07:47:53,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 07:47:53,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:47:54,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.36 vs. limit=10.0 2023-09-30 07:47:56,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 07:47:56,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 07:47:56,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 07:47:58,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:00,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:09,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:09,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:48:11,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:11,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:11,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 07:48:11,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:13,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:48:14,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:48:14,656 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 07:48:14,946 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 07:48:19,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 07:48:19,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 07:48:20,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 07:48:20,993 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=644326.6666666666, ans=0.1 2023-09-30 07:48:22,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:48:29,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:48:29,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:32,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:48:34,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 07:48:36,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:48:36,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 07:48:37,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 07:48:40,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:48:40,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:48:43,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:48:46,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:48:48,066 INFO [train.py:1039] (3/4) Epoch 19, batch 1050, loss[loss=0.174, simple_loss=0.2634, pruned_loss=0.04227, over 24340.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2531, pruned_loss=0.05175, over 4716738.15 frames. ], batch size: 74, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:48:48,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:48:50,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:48:51,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:48:53,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:48:54,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.07 vs. limit=15.0 2023-09-30 07:48:54,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:48:58,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:00,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 07:49:01,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 07:49:05,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:49:06,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:49:06,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 07:49:08,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 07:49:08,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 07:49:09,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:09,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 07:49:15,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:49:15,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 07:49:15,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:49:21,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:49:21,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:49:21,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:49:24,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 07:49:24,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 07:49:26,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:49:27,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 07:49:31,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 07:49:33,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:49:38,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 07:49:40,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:49:40,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:49:41,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:49:42,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=644660.0, ans=0.04949747468305833 2023-09-30 07:49:44,137 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=8.60 vs. limit=15.0 2023-09-30 07:49:45,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:49:48,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 07:49:50,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 07:49:50,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 07:49:51,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:51,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:49:52,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=644660.0, ans=0.125 2023-09-30 07:49:53,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 07:49:56,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:49:58,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 07:49:58,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:49:58,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:49:59,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:03,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=644726.6666666666, ans=0.2 2023-09-30 07:50:04,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:06,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 07:50:07,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 07:50:07,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 07:50:08,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 07:50:08,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:50:11,444 INFO [train.py:1039] (3/4) Epoch 19, batch 1100, loss[loss=0.1824, simple_loss=0.2575, pruned_loss=0.05369, over 24364.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2527, pruned_loss=0.05102, over 4722693.95 frames. ], batch size: 77, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:50:11,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:50:17,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:50:17,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=644793.3333333334, ans=0.1 2023-09-30 07:50:20,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=644793.3333333334, ans=0.125 2023-09-30 07:50:23,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 07:50:25,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 07:50:25,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:26,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 07:50:26,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:50:28,198 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.585e+02 1.773e+02 2.054e+02 2.605e+02 4.840e+02, threshold=4.108e+02, percent-clipped=0.0 2023-09-30 07:50:29,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 07:50:31,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:50:34,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 07:50:34,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 07:50:34,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=644860.0, ans=0.1 2023-09-30 07:50:36,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 07:50:39,590 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:50:39,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:50:41,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:50:44,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 07:50:49,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:50:51,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 07:50:53,158 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 07:50:53,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:56,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:50:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:50:59,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:51:00,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 07:51:01,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:51:01,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 07:51:01,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:51:01,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:01,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 07:51:08,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 07:51:08,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 07:51:10,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=644993.3333333334, ans=0.0 2023-09-30 07:51:11,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:51:18,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 07:51:21,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 07:51:21,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 07:51:22,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:51:26,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:26,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:28,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 07:51:28,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:51:28,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:51:30,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 07:51:30,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:51:30,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 07:51:31,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:51:33,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:51:34,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:51:35,362 INFO [train.py:1039] (3/4) Epoch 19, batch 1150, loss[loss=0.2224, simple_loss=0.2817, pruned_loss=0.08151, over 19867.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2532, pruned_loss=0.05101, over 4733415.37 frames. ], batch size: 388, lr: 5.47e-03, grad_scale: 8.0 2023-09-30 07:51:40,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:43,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:51:44,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:51:44,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:51:46,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 07:51:46,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:51:47,467 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.14 vs. limit=22.5 2023-09-30 07:51:49,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 07:51:52,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:51:52,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:51:56,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 07:51:58,624 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:03,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:52:05,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:06,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 07:52:06,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:52:06,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:52:11,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 07:52:13,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:52:14,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:52:19,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645260.0, ans=0.1 2023-09-30 07:52:24,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:52:33,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 07:52:34,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:41,982 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 07:52:43,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:52:49,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=645393.3333333334, ans=0.0 2023-09-30 07:52:50,511 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 07:52:55,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:52:58,275 INFO [train.py:1039] (3/4) Epoch 19, batch 1200, loss[loss=0.1672, simple_loss=0.2427, pruned_loss=0.04587, over 24299.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2542, pruned_loss=0.05086, over 4742593.02 frames. ], batch size: 56, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:52:58,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:52:58,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 07:52:58,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:53:01,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:05,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=645460.0, ans=0.2 2023-09-30 07:53:08,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 07:53:08,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 07:53:09,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:09,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:09,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 07:53:10,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer2.prob, batch_count=645460.0, ans=0.125 2023-09-30 07:53:11,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:53:13,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 07:53:14,392 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.937e+02 2.117e+02 2.458e+02 3.944e+02, threshold=4.235e+02, percent-clipped=0.0 2023-09-30 07:53:14,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:53:14,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:18,284 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 07:53:21,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 07:53:23,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:53:26,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:53:28,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=645526.6666666666, ans=0.2 2023-09-30 07:53:29,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:29,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:53:30,838 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 07:53:30,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:41,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 07:53:41,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:53:41,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 07:53:41,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:53:44,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 07:53:49,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=645660.0, ans=0.125 2023-09-30 07:53:51,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 07:53:51,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:53:52,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:53:54,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:53:56,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 07:53:57,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:53:57,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:53:59,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 07:53:59,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 07:53:59,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=645660.0, ans=0.125 2023-09-30 07:54:01,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:54:01,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:01,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 07:54:02,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:02,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:07,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:54:07,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=645726.6666666666, ans=0.125 2023-09-30 07:54:09,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 07:54:09,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=645726.6666666666, ans=0.125 2023-09-30 07:54:13,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 07:54:14,962 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=645726.6666666666, ans=0.0 2023-09-30 07:54:16,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=645726.6666666666, ans=0.0 2023-09-30 07:54:17,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 07:54:19,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:20,535 INFO [train.py:1039] (3/4) Epoch 19, batch 1250, loss[loss=0.1828, simple_loss=0.2432, pruned_loss=0.06118, over 23832.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2559, pruned_loss=0.05202, over 4734430.82 frames. ], batch size: 164, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:54:22,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:54:24,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:54:24,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:54:27,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 07:54:31,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:54:32,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:33,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 07:54:35,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 07:54:37,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 07:54:39,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 07:54:40,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:54:42,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 07:54:43,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:44,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 07:54:49,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 07:54:49,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:54:49,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:54:51,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:54:52,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=15.0 2023-09-30 07:54:52,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:54:56,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:54:56,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 07:55:03,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 07:55:03,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 07:55:06,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:06,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 07:55:08,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:55:08,284 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 07:55:08,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:08,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:10,514 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.33 vs. limit=15.0 2023-09-30 07:55:11,768 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=645993.3333333334, ans=0.1 2023-09-30 07:55:13,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:17,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 07:55:19,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:55:19,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 07:55:19,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 07:55:21,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 07:55:24,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:25,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=645993.3333333334, ans=0.04949747468305833 2023-09-30 07:55:26,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 07:55:26,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:28,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer2.prob, batch_count=646060.0, ans=0.125 2023-09-30 07:55:31,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 07:55:31,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:55:32,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 07:55:32,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 07:55:32,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 07:55:32,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 07:55:34,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:55:35,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 07:55:37,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=646060.0, ans=0.125 2023-09-30 07:55:39,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:39,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:55:40,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 07:55:44,134 INFO [train.py:1039] (3/4) Epoch 19, batch 1300, loss[loss=0.182, simple_loss=0.2676, pruned_loss=0.04814, over 24645.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2568, pruned_loss=0.05231, over 4737198.99 frames. ], batch size: 73, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:55:44,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 07:55:47,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:55:48,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 07:55:51,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:55:54,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 07:55:54,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:55:56,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:55:57,182 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=10.16 vs. limit=15.0 2023-09-30 07:55:59,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 07:56:00,443 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.929e+02 2.084e+02 2.401e+02 3.525e+02, threshold=4.167e+02, percent-clipped=0.0 2023-09-30 07:56:00,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 07:56:06,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:56:07,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 07:56:08,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 07:56:12,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 07:56:15,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:16,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:17,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:56:19,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:56:21,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 07:56:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 07:56:22,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 07:56:30,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:56:30,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 07:56:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 07:56:32,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 07:56:34,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:56:35,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:56:35,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 07:56:37,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:37,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 07:56:39,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:56:44,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:56:44,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:56:48,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.26 vs. limit=15.0 2023-09-30 07:56:49,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 07:56:49,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 07:56:51,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 07:56:55,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:56:57,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 07:56:59,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 07:57:07,836 INFO [train.py:1039] (3/4) Epoch 19, batch 1350, loss[loss=0.163, simple_loss=0.2168, pruned_loss=0.05456, over 23425.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2557, pruned_loss=0.05211, over 4732041.69 frames. ], batch size: 285, lr: 5.46e-03, grad_scale: 16.0 2023-09-30 07:57:10,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:13,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:57:16,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:57:17,359 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.51 vs. limit=22.5 2023-09-30 07:57:19,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:57:19,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:26,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 07:57:26,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.79 vs. limit=15.0 2023-09-30 07:57:27,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 07:57:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:57:29,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 07:57:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 07:57:32,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:57:33,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:57:33,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 07:57:34,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 07:57:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 07:57:36,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=646526.6666666666, ans=0.0 2023-09-30 07:57:38,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:57:38,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 07:57:54,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:04,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:58:05,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:05,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 07:58:09,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:09,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 07:58:09,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 07:58:11,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 07:58:14,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:58:16,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 07:58:17,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 07:58:21,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=646726.6666666666, ans=0.07 2023-09-30 07:58:24,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 07:58:26,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 07:58:31,430 INFO [train.py:1039] (3/4) Epoch 19, batch 1400, loss[loss=0.1905, simple_loss=0.2727, pruned_loss=0.0541, over 24404.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2543, pruned_loss=0.05132, over 4720532.79 frames. ], batch size: 77, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:58:33,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 07:58:34,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 07:58:36,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:58:38,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:58:44,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 07:58:45,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 07:58:46,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=646860.0, ans=0.0 2023-09-30 07:58:49,380 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.898e+02 2.160e+02 2.671e+02 3.929e+02, threshold=4.321e+02, percent-clipped=0.0 2023-09-30 07:58:55,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 07:58:57,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:00,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 07:59:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 07:59:04,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 07:59:06,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 07:59:11,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=646926.6666666666, ans=0.125 2023-09-30 07:59:16,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:16,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:22,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 07:59:24,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 07:59:24,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 07:59:26,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 07:59:26,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 07:59:27,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 07:59:27,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 07:59:29,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 07:59:30,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 07:59:32,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 07:59:35,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:39,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 07:59:48,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 07:59:49,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 07:59:49,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 07:59:51,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 07:59:52,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 07:59:52,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 07:59:54,267 INFO [train.py:1039] (3/4) Epoch 19, batch 1450, loss[loss=0.1581, simple_loss=0.203, pruned_loss=0.05658, over 19577.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2538, pruned_loss=0.05152, over 4707392.99 frames. ], batch size: 388, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 07:59:55,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=13.96 vs. limit=15.0 2023-09-30 07:59:55,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 07:59:59,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 07:59:59,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 07:59:59,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 08:00:04,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:04,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:00:06,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:00:07,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 08:00:07,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:00:09,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 08:00:10,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:11,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:11,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 08:00:14,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:14,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:00:14,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 08:00:14,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:15,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:00:18,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:21,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:24,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:00:24,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:00:27,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:00:27,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:30,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:00:30,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:00:30,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:00:32,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:35,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 08:00:37,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:00:40,736 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 08:00:42,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:00:43,061 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=7.85 vs. limit=15.0 2023-09-30 08:00:43,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:00:44,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=647326.6666666666, ans=0.125 2023-09-30 08:00:45,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:45,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 08:00:48,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:00:50,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 08:00:52,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 08:00:52,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=647326.6666666666, ans=0.1 2023-09-30 08:00:55,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:00:58,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:00:58,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:00,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 08:01:03,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 08:01:03,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 08:01:03,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=647393.3333333334, ans=0.125 2023-09-30 08:01:05,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:06,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:01:16,673 INFO [train.py:1039] (3/4) Epoch 19, batch 1500, loss[loss=0.1896, simple_loss=0.2578, pruned_loss=0.06065, over 23821.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2539, pruned_loss=0.0522, over 4692924.59 frames. ], batch size: 195, lr: 5.46e-03, grad_scale: 8.0 2023-09-30 08:01:16,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 08:01:16,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:01:16,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:01:18,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:18,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:18,680 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=647460.0, ans=0.125 2023-09-30 08:01:19,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:01:20,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=647460.0, ans=0.2 2023-09-30 08:01:22,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 08:01:25,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:01:25,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:01:25,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:01:27,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:01:28,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:01:31,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:34,838 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.480e+02 1.890e+02 2.087e+02 2.413e+02 4.629e+02, threshold=4.174e+02, percent-clipped=1.0 2023-09-30 08:01:36,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:01:36,715 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 08:01:38,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:01:38,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:01:40,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:42,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 08:01:47,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 08:01:50,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:01:50,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 08:01:51,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:01:53,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:01:54,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:01:54,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:01:56,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 08:01:57,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:01:57,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:01:58,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=647593.3333333334, ans=0.125 2023-09-30 08:01:59,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 08:02:00,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:02:07,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:02:07,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 08:02:12,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:02:12,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:02:17,760 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 08:02:19,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:19,160 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 08:02:20,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:22,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:02:22,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=647726.6666666666, ans=0.035 2023-09-30 08:02:23,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 08:02:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:02:26,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 08:02:28,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:31,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:31,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:33,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:02:33,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:02:35,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:02:37,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 08:02:37,596 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=647793.3333333334, ans=0.05 2023-09-30 08:02:38,631 INFO [train.py:1039] (3/4) Epoch 19, batch 1550, loss[loss=0.1682, simple_loss=0.2479, pruned_loss=0.04422, over 24565.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2546, pruned_loss=0.05254, over 4693882.90 frames. ], batch size: 60, lr: 5.45e-03, grad_scale: 8.0 2023-09-30 08:02:38,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 08:02:38,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:02:40,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 08:02:40,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 08:02:40,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.06 vs. limit=12.0 2023-09-30 08:02:43,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:44,885 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:45,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=647793.3333333334, ans=0.125 2023-09-30 08:02:46,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:02:46,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:02:47,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:47,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:02:51,456 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 08:02:51,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:02:51,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:02:53,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:02:55,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:02:55,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 08:02:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:02:57,419 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=8.85 vs. limit=15.0 2023-09-30 08:02:58,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 08:02:58,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 08:02:58,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 08:02:59,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:01,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:04,017 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=9.61 vs. limit=15.0 2023-09-30 08:03:04,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:03:04,780 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=647860.0, ans=0.05 2023-09-30 08:03:08,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 08:03:08,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 08:03:16,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:18,316 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=647926.6666666666, ans=0.1 2023-09-30 08:03:22,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:03:22,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:03:22,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:03:22,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 08:03:30,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:03:33,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:34,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:03:36,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:03:36,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=647993.3333333334, ans=0.125 2023-09-30 08:03:37,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:03:37,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 08:03:37,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:03:40,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:03:40,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:42,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:03:42,511 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 08:03:46,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:03:51,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 08:03:57,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:03:59,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:03:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 08:04:01,501 INFO [train.py:1039] (3/4) Epoch 19, batch 1600, loss[loss=0.1695, simple_loss=0.2457, pruned_loss=0.04668, over 23424.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2556, pruned_loss=0.05314, over 4698676.07 frames. ], batch size: 93, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:04:03,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:04:04,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:04,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:04:04,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:04:04,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:04:08,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:09,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 08:04:11,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 08:04:13,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 08:04:16,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:17,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 08:04:19,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.864e+02 2.063e+02 2.300e+02 3.333e+02, threshold=4.126e+02, percent-clipped=0.0 2023-09-30 08:04:19,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:04:21,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:04:26,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:04:28,654 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-09-30 08:04:30,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 08:04:33,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:04:33,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 08:04:34,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:35,068 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=648260.0, ans=0.5 2023-09-30 08:04:36,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 08:04:39,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 08:04:46,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=648260.0, ans=0.2 2023-09-30 08:04:47,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:49,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 08:04:49,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:04:51,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:04:51,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:04:52,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:04:56,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:04:56,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:04:57,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:04:59,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:00,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:05:01,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:05:04,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:05:04,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:05:11,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:13,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:05:16,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 08:05:16,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:05:16,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 08:05:23,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:23,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:05:24,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=648460.0, ans=0.1 2023-09-30 08:05:25,225 INFO [train.py:1039] (3/4) Epoch 19, batch 1650, loss[loss=0.1636, simple_loss=0.2461, pruned_loss=0.04055, over 24352.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2553, pruned_loss=0.05288, over 4702233.71 frames. ], batch size: 61, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:05:25,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:05:25,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 08:05:25,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 08:05:25,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 08:05:26,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 08:05:30,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:05:31,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:31,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:05:31,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:05:33,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:05:36,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 08:05:39,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:05:39,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:05:39,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:05:40,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:05:42,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 08:05:42,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 08:05:48,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:05:50,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:05:58,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 08:05:59,417 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.36 vs. limit=15.0 2023-09-30 08:06:01,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:03,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 08:06:07,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:09,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:06:11,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:06:11,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:13,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:06:13,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:15,385 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:06:16,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:16,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:18,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:18,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:20,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:20,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:06:21,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:06:23,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 08:06:23,886 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=648660.0, ans=0.125 2023-09-30 08:06:24,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:06:25,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 08:06:25,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 08:06:25,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=648660.0, ans=0.0 2023-09-30 08:06:26,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 08:06:26,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:06:28,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:06:29,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:29,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:06:29,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 08:06:34,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:06:37,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:06:37,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:40,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 08:06:47,407 INFO [train.py:1039] (3/4) Epoch 19, batch 1700, loss[loss=0.1793, simple_loss=0.2524, pruned_loss=0.05313, over 23431.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2554, pruned_loss=0.05298, over 4698303.41 frames. ], batch size: 119, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:06:47,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:06:47,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:06:47,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 08:06:49,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:06:49,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:06:49,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:06:52,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:06:52,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:06:52,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 08:06:55,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=648793.3333333334, ans=0.1 2023-09-30 08:06:56,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:07:04,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:07:05,484 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.448e+02 1.834e+02 2.122e+02 2.379e+02 4.054e+02, threshold=4.245e+02, percent-clipped=0.0 2023-09-30 08:07:05,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:07:11,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:07:12,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:12,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:07:14,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:16,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=648860.0, ans=0.1 2023-09-30 08:07:17,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 08:07:19,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:07:19,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:22,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:07:24,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:07:26,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 08:07:26,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 08:07:27,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:29,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 08:07:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:07:31,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=648926.6666666666, ans=0.125 2023-09-30 08:07:34,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=648926.6666666666, ans=0.0 2023-09-30 08:07:40,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:07:40,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:41,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:07:44,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:07:44,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 08:07:44,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:07:47,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:47,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 08:07:48,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:07:48,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:48,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:07:48,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:07:49,269 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.57 vs. limit=15.0 2023-09-30 08:07:51,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:07:51,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:07:54,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:07:54,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:07:54,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:00,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:00,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 08:08:02,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:03,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:08:06,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 08:08:09,789 INFO [train.py:1039] (3/4) Epoch 19, batch 1750, loss[loss=0.1861, simple_loss=0.264, pruned_loss=0.05407, over 24671.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2533, pruned_loss=0.05207, over 4703651.02 frames. ], batch size: 65, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:08:11,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:14,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:14,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:08:14,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 08:08:14,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=649126.6666666666, ans=0.125 2023-09-30 08:08:16,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:08:19,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:08:19,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:24,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 08:08:26,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:08:29,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 08:08:29,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:08:31,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:08:34,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:08:35,323 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=649193.3333333334, ans=0.0 2023-09-30 08:08:36,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 08:08:36,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=649193.3333333334, ans=0.0 2023-09-30 08:08:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:08:39,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 08:08:50,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:08:52,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:08:52,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:55,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:08:55,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:08:58,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:00,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:02,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:02,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:09:04,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 08:09:06,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:09:09,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 08:09:11,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:11,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:12,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:09:13,524 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=15.0 2023-09-30 08:09:17,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:09:18,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 08:09:18,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:19,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=649393.3333333334, ans=0.125 2023-09-30 08:09:22,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:09:25,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:09:26,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=649393.3333333334, ans=0.0 2023-09-30 08:09:27,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:29,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:09:30,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 08:09:30,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:31,820 INFO [train.py:1039] (3/4) Epoch 19, batch 1800, loss[loss=0.1661, simple_loss=0.2482, pruned_loss=0.04195, over 24329.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.252, pruned_loss=0.05126, over 4714799.82 frames. ], batch size: 61, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:09:32,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:09:32,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:32,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:09:32,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:09:32,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:09:37,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:09:38,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:09:40,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:09:42,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=649460.0, ans=0.0 2023-09-30 08:09:43,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:09:47,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:09:47,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:09:50,069 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.459e+02 1.833e+02 2.073e+02 2.384e+02 3.418e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:09:50,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:09:51,106 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=8.16 vs. limit=15.0 2023-09-30 08:09:53,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:53,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:09:54,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:09:57,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=649526.6666666666, ans=0.125 2023-09-30 08:09:58,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:09:58,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 08:09:58,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:03,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:06,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 08:10:09,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 08:10:10,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 08:10:10,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:11,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:10:11,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:11,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:10:20,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 08:10:21,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:10:22,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=649660.0, ans=0.125 2023-09-30 08:10:23,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:24,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 08:10:25,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 08:10:26,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:10:26,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:10:28,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:10:33,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 08:10:39,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:10:41,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 08:10:41,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:10:43,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:43,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:10:44,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 08:10:47,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:10:47,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:10:51,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 08:10:51,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:10:53,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:10:54,657 INFO [train.py:1039] (3/4) Epoch 19, batch 1850, loss[loss=0.1609, simple_loss=0.2332, pruned_loss=0.04433, over 24457.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2522, pruned_loss=0.05126, over 4710830.05 frames. ], batch size: 58, lr: 5.45e-03, grad_scale: 16.0 2023-09-30 08:10:54,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:10:54,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:55,118 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=649793.3333333334, ans=0.09899494936611666 2023-09-30 08:10:56,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:10:56,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:10:59,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:10:59,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:01,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:11:01,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:11,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:11:11,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 08:11:15,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 08:11:18,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 08:11:22,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:22,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 08:11:22,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 08:11:32,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:11:34,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 08:11:38,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:11:38,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:11:41,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=649926.6666666666, ans=0.04949747468305833 2023-09-30 08:11:44,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 08:11:45,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:45,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:11:47,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:11:49,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:11:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:11:54,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:11:54,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:11:56,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:11:56,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:11:57,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:11:59,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:12:02,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 08:12:02,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:12:07,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:12:07,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:12:07,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 08:12:07,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 08:12:08,536 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=9.00 vs. limit=15.0 2023-09-30 08:12:09,583 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 08:12:11,058 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 08:12:12,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:12:12,622 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:12:12,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:12,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:12,796 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 08:12:12,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:12:12,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:14,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:12:16,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:12:16,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:12:16,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 08:12:18,010 INFO [train.py:1039] (3/4) Epoch 19, batch 1900, loss[loss=0.1798, simple_loss=0.257, pruned_loss=0.0513, over 24690.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2534, pruned_loss=0.0518, over 4716639.49 frames. ], batch size: 65, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:12:19,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:12:19,685 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 08:12:19,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:12:22,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:29,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:12:32,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:12:34,266 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 08:12:35,699 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.498e+02 1.831e+02 2.039e+02 2.265e+02 4.223e+02, threshold=4.078e+02, percent-clipped=1.0 2023-09-30 08:12:35,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 08:12:36,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:12:37,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:12:37,568 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 08:12:37,623 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 08:12:41,761 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.30 vs. limit=15.0 2023-09-30 08:12:42,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 08:12:44,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:12:44,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=650193.3333333334, ans=0.125 2023-09-30 08:12:48,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 08:12:52,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 08:12:53,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=8.39 vs. limit=10.0 2023-09-30 08:13:00,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 08:13:05,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 08:13:05,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:05,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 08:13:05,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 08:13:05,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 08:13:07,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 08:13:07,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:13:10,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 08:13:14,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:13:14,623 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=650326.6666666666, ans=0.125 2023-09-30 08:13:15,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:15,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 08:13:20,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:13:23,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 08:13:23,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:29,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=650393.3333333334, ans=0.125 2023-09-30 08:13:30,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:13:30,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:13:30,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:13:32,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:13:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:13:33,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:13:34,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:13:38,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:38,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:13:40,094 INFO [train.py:1039] (3/4) Epoch 19, batch 1950, loss[loss=0.1783, simple_loss=0.2541, pruned_loss=0.0513, over 23270.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2541, pruned_loss=0.05193, over 4717305.89 frames. ], batch size: 93, lr: 5.44e-03, grad_scale: 8.0 2023-09-30 08:13:41,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:13:41,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:13:41,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:13:43,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:13:48,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:13:49,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:13:49,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:49,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:13:53,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 08:13:54,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:13:55,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:55,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:13:58,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:13:58,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:13:58,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:02,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:03,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:14:03,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:14:03,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:14:03,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:06,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:11,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:14:11,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:11,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:14:11,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 08:14:12,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:14:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:14:13,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:18,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:20,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:14:25,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:14:28,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:14:28,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:14:29,163 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.00 vs. limit=15.0 2023-09-30 08:14:30,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 08:14:30,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:14:33,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:14:35,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:14:35,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:14:40,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=650660.0, ans=0.125 2023-09-30 08:14:41,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:43,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:45,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:14:49,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:51,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=650726.6666666666, ans=0.1 2023-09-30 08:14:52,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:14:53,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:14:54,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 08:14:54,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:14:55,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:14:57,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 08:14:57,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:14:59,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=650726.6666666666, ans=0.125 2023-09-30 08:15:03,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:15:04,392 INFO [train.py:1039] (3/4) Epoch 19, batch 2000, loss[loss=0.1739, simple_loss=0.2427, pruned_loss=0.05258, over 24315.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2557, pruned_loss=0.0523, over 4717812.12 frames. ], batch size: 56, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:15:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:15:04,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:05,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:15:07,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:10,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 08:15:12,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:15:16,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:15:18,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 08:15:18,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:15:18,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:15:22,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:15:23,994 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.555e+02 2.096e+02 2.439e+02 2.971e+02 4.515e+02, threshold=4.878e+02, percent-clipped=2.0 2023-09-30 08:15:24,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 08:15:25,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:27,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:28,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:30,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 08:15:30,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:15:32,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 08:15:32,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:36,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:15:39,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:15:39,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:15:39,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:42,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:15:42,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 08:15:45,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 08:15:45,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:15:45,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:15:51,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:51,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:15:51,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:52,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:15:54,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=650993.3333333334, ans=0.2 2023-09-30 08:15:56,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:15:57,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:15:58,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:15:58,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:00,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:04,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:16:04,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 08:16:09,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:16:10,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:11,043 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.39 vs. limit=22.5 2023-09-30 08:16:11,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=651060.0, ans=0.125 2023-09-30 08:16:13,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:13,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:16:18,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:21,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:21,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:16:21,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:16:23,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:25,897 INFO [train.py:1039] (3/4) Epoch 19, batch 2050, loss[loss=0.1791, simple_loss=0.2679, pruned_loss=0.04519, over 24541.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2539, pruned_loss=0.05183, over 4713100.67 frames. ], batch size: 71, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:16:25,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:30,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:16:31,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:33,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=651126.6666666666, ans=0.0 2023-09-30 08:16:34,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:16:38,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:16:40,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:16:40,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:16:43,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 08:16:43,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:16:44,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:16:44,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:16:53,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:16:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:16:56,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 08:16:59,025 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.88 vs. limit=10.0 2023-09-30 08:16:59,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:17:01,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 08:17:01,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:17:06,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:06,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:08,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:17:08,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:17:10,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:17:12,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:17:12,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:17:15,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:17,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:17:20,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:17:22,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:17:26,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:31,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:17:33,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 08:17:40,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:40,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:17:42,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=651393.3333333334, ans=0.125 2023-09-30 08:17:43,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:17:44,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 08:17:47,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=651393.3333333334, ans=0.125 2023-09-30 08:17:50,113 INFO [train.py:1039] (3/4) Epoch 19, batch 2100, loss[loss=0.1832, simple_loss=0.2494, pruned_loss=0.05852, over 23638.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2532, pruned_loss=0.05202, over 4710132.85 frames. ], batch size: 149, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:17:50,330 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 08:17:50,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:17:50,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:17:51,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:17:53,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:17:53,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 08:17:53,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 08:17:54,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:17:57,436 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=651460.0, ans=0.0 2023-09-30 08:17:58,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:17:58,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:18:00,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:01,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:18:01,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 08:18:02,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=651460.0, ans=0.0 2023-09-30 08:18:03,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:18:03,440 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=651460.0, ans=0.125 2023-09-30 08:18:04,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 08:18:04,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 08:18:06,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:07,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:07,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 08:18:07,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 08:18:09,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.858e+02 2.144e+02 2.529e+02 4.189e+02, threshold=4.288e+02, percent-clipped=0.0 2023-09-30 08:18:13,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 08:18:13,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:18:13,453 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=651526.6666666666, ans=0.125 2023-09-30 08:18:13,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=651526.6666666666, ans=0.1 2023-09-30 08:18:16,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:18:16,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:18:19,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:18:21,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 08:18:21,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:21,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 08:18:24,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 08:18:26,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:26,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 08:18:26,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 08:18:28,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 08:18:30,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:18:30,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=651593.3333333334, ans=0.125 2023-09-30 08:18:32,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:18:33,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer2.prob, batch_count=651593.3333333334, ans=0.125 2023-09-30 08:18:36,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:36,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:18:37,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:39,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:39,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 08:18:39,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:18:40,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:18:40,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:18:40,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 08:18:42,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 08:18:44,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 08:18:47,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:18:52,249 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:18:52,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 08:18:52,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=651660.0, ans=0.125 2023-09-30 08:18:58,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:01,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:19:03,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:03,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:03,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 08:19:03,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:04,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:04,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:19:05,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:19:05,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:08,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 08:19:09,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 08:19:09,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:12,616 INFO [train.py:1039] (3/4) Epoch 19, batch 2150, loss[loss=0.198, simple_loss=0.2643, pruned_loss=0.06584, over 23829.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.253, pruned_loss=0.05193, over 4706426.53 frames. ], batch size: 179, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:19:12,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:19:12,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:19:12,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:19:12,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:19:19,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 08:19:22,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:22,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:24,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:19:24,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:25,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:19:27,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:27,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:19:27,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:19:31,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=651860.0, ans=0.09899494936611666 2023-09-30 08:19:32,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:32,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 08:19:39,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:41,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:19:41,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:19:42,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:19:42,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:19:44,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:19:44,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:19:44,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:19:45,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 08:19:47,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:19:48,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:49,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:51,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:19:52,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:19:55,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:19:55,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:19:57,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:19:57,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 08:19:57,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:20:01,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:01,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:03,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:20:03,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:20:05,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:05,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=651993.3333333334, ans=0.2 2023-09-30 08:20:07,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:07,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 08:20:08,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 08:20:10,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:20:10,766 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 08:20:10,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:11,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=651993.3333333334, ans=0.125 2023-09-30 08:20:12,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:20:12,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 08:20:12,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:20:12,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 08:20:13,818 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 08:20:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 08:20:13,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 08:20:16,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:16,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:20:16,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:20:18,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:18,791 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=652060.0, ans=0.1 2023-09-30 08:20:20,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:20:21,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:21,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:24,436 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.44 vs. limit=15.0 2023-09-30 08:20:25,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=652060.0, ans=0.125 2023-09-30 08:20:29,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:20:31,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 08:20:34,267 INFO [train.py:1039] (3/4) Epoch 19, batch 2200, loss[loss=0.186, simple_loss=0.2649, pruned_loss=0.05356, over 24075.00 frames. ], tot_loss[loss=0.1794, simple_loss=0.2538, pruned_loss=0.05256, over 4706681.31 frames. ], batch size: 86, lr: 5.44e-03, grad_scale: 16.0 2023-09-30 08:20:35,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:20:42,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:20:42,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:20:44,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:20:44,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:20:46,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:20:48,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:20:48,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 08:20:49,176 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.89 vs. limit=15.0 2023-09-30 08:20:53,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 08:20:54,310 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.914e+02 2.227e+02 2.792e+02 4.256e+02, threshold=4.455e+02, percent-clipped=0.0 2023-09-30 08:20:55,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:21:01,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 08:21:02,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:03,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:04,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:21:09,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:21:09,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 08:21:09,338 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:12,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:21:14,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:14,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 08:21:14,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=652260.0, ans=0.125 2023-09-30 08:21:19,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:21:19,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:22,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:21:22,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 08:21:27,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:28,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 08:21:31,789 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.95 vs. limit=10.0 2023-09-30 08:21:32,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:32,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:21:32,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:21:34,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:21:34,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:21:34,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=652326.6666666666, ans=0.125 2023-09-30 08:21:34,935 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-09-30 08:21:35,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:35,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:21:37,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:21:37,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:21:40,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:21:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 08:21:42,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:21:45,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:21:47,408 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 08:21:49,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:21:49,129 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 08:21:51,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:21:52,955 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 08:21:54,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:54,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:21:56,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:21:57,584 INFO [train.py:1039] (3/4) Epoch 19, batch 2250, loss[loss=0.1812, simple_loss=0.2605, pruned_loss=0.05094, over 23699.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2541, pruned_loss=0.05254, over 4707197.14 frames. ], batch size: 85, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:21:59,386 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 08:22:00,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:22:02,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:08,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:22:09,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:22:12,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:12,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:13,173 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:22:14,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:22:15,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 08:22:15,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:16,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:22:19,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 08:22:21,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:22:21,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:22,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:22:28,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:30,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:22:31,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:22:31,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 08:22:33,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:22:33,569 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=652593.3333333334, ans=0.125 2023-09-30 08:22:34,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:22:39,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:41,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:22:43,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:22:43,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:22:46,401 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=652660.0, ans=0.0 2023-09-30 08:22:47,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:22:50,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:22:56,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:22:58,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:23:02,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=652726.6666666666, ans=0.0 2023-09-30 08:23:05,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:23:06,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:23:06,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:23:11,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:23:16,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:23:16,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 08:23:16,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:16,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:23:19,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 08:23:21,051 INFO [train.py:1039] (3/4) Epoch 19, batch 2300, loss[loss=0.1625, simple_loss=0.2405, pruned_loss=0.04229, over 24294.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2547, pruned_loss=0.05285, over 4708375.35 frames. ], batch size: 56, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:23:22,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:23:22,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:28,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:23:29,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:23:32,490 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 08:23:34,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:40,220 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.428e+02 1.880e+02 2.084e+02 2.491e+02 4.260e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 08:23:42,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:23:42,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:23:42,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:23:42,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:23:42,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 08:23:44,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:23:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:23:47,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:23:51,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:23:55,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:23:58,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:00,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=652926.6666666666, ans=0.0 2023-09-30 08:24:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:24:03,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:24:07,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:24:07,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:24:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:24:11,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:24:12,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:24:12,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 08:24:17,457 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.93 vs. limit=15.0 2023-09-30 08:24:18,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:24:18,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:24:18,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:24:18,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:20,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:24:20,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:24:20,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 08:24:21,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:24:21,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:24:22,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 08:24:29,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:24:32,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:24:36,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:24:36,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:24:38,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:24:38,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:24:39,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:24:39,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:24:41,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 08:24:42,946 INFO [train.py:1039] (3/4) Epoch 19, batch 2350, loss[loss=0.1831, simple_loss=0.2505, pruned_loss=0.05785, over 23833.00 frames. ], tot_loss[loss=0.1809, simple_loss=0.2553, pruned_loss=0.05324, over 4721465.08 frames. ], batch size: 195, lr: 5.43e-03, grad_scale: 16.0 2023-09-30 08:24:46,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:24:46,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 08:24:53,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=653126.6666666666, ans=0.2 2023-09-30 08:24:55,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 08:24:57,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:25:00,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:00,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:01,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:01,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:03,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 08:25:07,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:25:13,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 08:25:14,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:25:16,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:25:16,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:25:19,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:25:20,497 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.00 vs. limit=6.0 2023-09-30 08:25:21,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 08:25:22,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:25:22,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:25:22,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:23,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:25:29,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:25:30,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 08:25:32,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:25:35,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:25:35,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:25:38,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 08:25:38,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:25:40,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=653326.6666666666, ans=0.125 2023-09-30 08:25:42,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 08:25:43,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:25:48,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 08:25:48,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=653393.3333333334, ans=0.09899494936611666 2023-09-30 08:25:51,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=653393.3333333334, ans=0.125 2023-09-30 08:25:54,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 08:25:55,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:25:55,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 08:25:55,568 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 08:25:55,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 08:25:57,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 08:25:59,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:26:04,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:26:06,070 INFO [train.py:1039] (3/4) Epoch 19, batch 2400, loss[loss=0.192, simple_loss=0.2591, pruned_loss=0.0625, over 23792.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2547, pruned_loss=0.05253, over 4727060.37 frames. ], batch size: 195, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:26:10,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:26:13,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:26:13,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 08:26:13,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 08:26:15,516 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.38 vs. limit=15.0 2023-09-30 08:26:19,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:26:19,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:26:22,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 08:26:22,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:26:23,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 08:26:25,929 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.937e+02 2.110e+02 2.328e+02 3.835e+02, threshold=4.219e+02, percent-clipped=0.0 2023-09-30 08:26:29,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:32,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 08:26:37,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:26:43,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 08:26:44,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:26:47,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:26:53,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:26:54,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 08:26:54,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:27:02,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:04,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:07,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:07,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=653660.0, ans=0.125 2023-09-30 08:27:08,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:27:08,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 08:27:08,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:27:08,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:11,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:11,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:27:17,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:27:17,727 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.92 vs. limit=15.0 2023-09-30 08:27:18,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:27:18,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 08:27:20,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 08:27:22,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:27:23,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:27:23,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 08:27:23,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 08:27:23,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 08:27:23,253 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 08:27:25,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 08:27:26,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:27:29,780 INFO [train.py:1039] (3/4) Epoch 19, batch 2450, loss[loss=0.1836, simple_loss=0.2483, pruned_loss=0.05942, over 23727.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.253, pruned_loss=0.05209, over 4721400.36 frames. ], batch size: 232, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:27:29,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:29,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:31,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 08:27:31,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:27:32,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:27:36,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:27:36,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:27:39,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:39,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:27:40,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 08:27:45,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:27:45,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:50,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=653860.0, ans=0.2 2023-09-30 08:27:51,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:27:51,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:27:51,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:27:51,801 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=653860.0, ans=0.125 2023-09-30 08:27:52,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 08:27:57,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:27:59,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:27:59,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:28:04,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:28:04,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:06,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:07,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:28:09,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 08:28:10,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:28:16,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.85 vs. limit=15.0 2023-09-30 08:28:18,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:20,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:28:20,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:20,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:28:20,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:24,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:28:24,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 08:28:28,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:28:28,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:28:31,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:28:31,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:28:38,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:28:38,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 08:28:39,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:28:39,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:28:39,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 08:28:41,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:28:41,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:28:47,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:28:50,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:28:50,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:28:51,551 INFO [train.py:1039] (3/4) Epoch 19, batch 2500, loss[loss=0.1915, simple_loss=0.2573, pruned_loss=0.06279, over 23706.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.252, pruned_loss=0.05131, over 4724208.76 frames. ], batch size: 164, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:28:53,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 08:28:55,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:29:01,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:06,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=654126.6666666666, ans=0.125 2023-09-30 08:29:08,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=654193.3333333334, ans=0.125 2023-09-30 08:29:11,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:29:11,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:29:12,730 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.785e+02 1.970e+02 2.171e+02 3.825e+02, threshold=3.939e+02, percent-clipped=0.0 2023-09-30 08:29:12,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:29:12,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 08:29:16,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=654193.3333333334, ans=0.125 2023-09-30 08:29:20,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:29:22,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:29:22,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:29:22,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:29:22,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 08:29:25,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:25,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:25,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 08:29:25,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:26,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 08:29:26,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:33,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:29:33,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:29:33,857 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:29:36,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:29:36,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 08:29:36,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:29:38,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:29:41,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:47,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:29:52,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:29:56,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:29:59,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 08:30:00,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:00,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:03,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:30:03,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:30:03,927 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 08:30:03,928 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 08:30:03,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 08:30:09,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:10,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 08:30:11,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 08:30:11,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:30:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 08:30:15,597 INFO [train.py:1039] (3/4) Epoch 19, batch 2550, loss[loss=0.1674, simple_loss=0.2494, pruned_loss=0.04265, over 24617.00 frames. ], tot_loss[loss=0.1779, simple_loss=0.2528, pruned_loss=0.05153, over 4722592.48 frames. ], batch size: 60, lr: 5.43e-03, grad_scale: 32.0 2023-09-30 08:30:15,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 08:30:17,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:20,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:30:20,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:30:22,763 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=654460.0, ans=0.125 2023-09-30 08:30:23,447 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.85 vs. limit=15.0 2023-09-30 08:30:23,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:30:25,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 08:30:25,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:30:30,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 08:30:31,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:30:33,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:34,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:30:36,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 08:30:37,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:30:37,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:37,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:30:39,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:30:41,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 08:30:41,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 08:30:41,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:41,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 08:30:51,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:30:58,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:30:58,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:30:58,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:30:59,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:31:05,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:31:08,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:31:08,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:31:10,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:31:10,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:31:10,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:31:14,380 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=654660.0, ans=0.0 2023-09-30 08:31:15,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:15,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:17,589 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=654660.0, ans=0.125 2023-09-30 08:31:23,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:31:23,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 08:31:23,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:31:23,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:31:25,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:31:25,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:31:26,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:34,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:31:35,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:31:37,209 INFO [train.py:1039] (3/4) Epoch 19, batch 2600, loss[loss=0.1654, simple_loss=0.2487, pruned_loss=0.04102, over 24478.00 frames. ], tot_loss[loss=0.1786, simple_loss=0.2536, pruned_loss=0.05184, over 4725354.96 frames. ], batch size: 66, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:31:40,313 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 08:31:43,945 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 08:31:43,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:31:44,031 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 08:31:44,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 08:31:45,530 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 08:31:48,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:31:48,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 08:31:50,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 08:31:50,828 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 08:31:54,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:31:54,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 08:31:56,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 08:31:57,364 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.570e+02 1.901e+02 2.112e+02 2.544e+02 3.828e+02, threshold=4.224e+02, percent-clipped=0.0 2023-09-30 08:31:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:31:59,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 08:32:02,604 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 08:32:03,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 08:32:10,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:11,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:11,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:11,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 08:32:14,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:32:19,939 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 08:32:23,149 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=654926.6666666666, ans=0.1 2023-09-30 08:32:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:26,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:27,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-09-30 08:32:28,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 08:32:28,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:32:29,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 08:32:31,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:32:33,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:32:34,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:37,938 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 08:32:39,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:32:39,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:32:41,856 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.21 vs. limit=15.0 2023-09-30 08:32:44,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:32:45,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:32:45,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 08:32:45,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:32:47,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:32:48,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:32:55,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 08:32:56,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:32:58,319 INFO [train.py:1039] (3/4) Epoch 19, batch 2650, loss[loss=0.1707, simple_loss=0.259, pruned_loss=0.04119, over 24647.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2548, pruned_loss=0.0517, over 4730909.74 frames. ], batch size: 73, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:32:58,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:32:59,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.56 vs. limit=15.0 2023-09-30 08:33:03,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 08:33:03,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:04,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:33:06,541 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 08:33:06,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:08,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:08,464 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=655126.6666666666, ans=0.2 2023-09-30 08:33:12,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:33:13,004 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=655126.6666666666, ans=0.125 2023-09-30 08:33:13,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=655126.6666666666, ans=0.125 2023-09-30 08:33:14,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:33:15,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:33:17,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 08:33:17,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:33:17,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:33:20,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 08:33:23,371 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 08:33:24,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:27,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 08:33:27,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:28,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 08:33:30,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:30,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:33:30,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:32,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:36,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 08:33:37,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 08:33:39,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:33:42,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 08:33:42,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:33:44,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:33:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:33:45,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:47,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:33:50,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:33:51,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:33:53,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:33:53,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:33:54,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:33:57,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:33:58,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:34:00,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:00,380 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:34:01,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:03,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:34:04,997 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=655393.3333333334, ans=0.125 2023-09-30 08:34:06,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:06,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:34:06,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:06,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=655393.3333333334, ans=0.2 2023-09-30 08:34:08,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 08:34:11,022 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=655393.3333333334, ans=0.0 2023-09-30 08:34:12,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:34:13,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:15,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:17,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:34:17,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:20,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:20,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 08:34:21,745 INFO [train.py:1039] (3/4) Epoch 19, batch 2700, loss[loss=0.1851, simple_loss=0.2723, pruned_loss=0.0489, over 24702.00 frames. ], tot_loss[loss=0.1799, simple_loss=0.2558, pruned_loss=0.05204, over 4726931.59 frames. ], batch size: 73, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:34:21,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:34:23,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:34:23,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=655460.0, ans=0.2 2023-09-30 08:34:25,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:34:25,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:26,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:34:28,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:34:28,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:34:28,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:34:28,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 08:34:29,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 08:34:30,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:34:31,303 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=655460.0, ans=0.1 2023-09-30 08:34:32,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:34:33,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:34:35,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:34:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:34:41,038 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.869e+02 2.051e+02 2.484e+02 4.492e+02, threshold=4.101e+02, percent-clipped=1.0 2023-09-30 08:34:41,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 08:34:42,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:34:46,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:34:46,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:34:51,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=655526.6666666666, ans=0.125 2023-09-30 08:34:53,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:34:53,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:34:53,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:34:53,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:34:57,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:34:59,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:00,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:35:00,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:05,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:05,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:35:14,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:35:16,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:35:21,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:35:21,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:26,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:26,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:28,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:35:28,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:29,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:35:29,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:35:34,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:35:34,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:34,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:35:37,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 08:35:40,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:40,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:35:40,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 08:35:42,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 08:35:42,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:43,946 INFO [train.py:1039] (3/4) Epoch 19, batch 2750, loss[loss=0.1761, simple_loss=0.2602, pruned_loss=0.04603, over 24020.00 frames. ], tot_loss[loss=0.1798, simple_loss=0.2552, pruned_loss=0.05217, over 4724066.27 frames. ], batch size: 80, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:35:45,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:35:46,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=655793.3333333334, ans=0.0 2023-09-30 08:35:47,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:35:49,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:49,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:35:51,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:53,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:35:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:35:54,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:35:54,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:35:54,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 08:35:54,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:35:54,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:35:59,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 08:36:00,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=655860.0, ans=0.125 2023-09-30 08:36:01,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:36:02,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:03,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:04,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:36:04,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:36:04,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=655860.0, ans=0.125 2023-09-30 08:36:06,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:36:07,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:08,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:12,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 08:36:12,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:36:13,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:36:15,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:15,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=655926.6666666666, ans=0.125 2023-09-30 08:36:16,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:36:25,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:36:27,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:36:27,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:35,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:36:35,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:36:35,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:36:43,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:36:44,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:36:44,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 08:36:47,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:36:49,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 08:36:53,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:36:57,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:36:59,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 08:36:59,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:01,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:37:01,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 08:37:03,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:37:07,165 INFO [train.py:1039] (3/4) Epoch 19, batch 2800, loss[loss=0.1759, simple_loss=0.2501, pruned_loss=0.05084, over 23410.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2533, pruned_loss=0.0515, over 4717680.22 frames. ], batch size: 93, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:37:07,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:37:07,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:08,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:08,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 08:37:08,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:11,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 08:37:11,958 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 08:37:15,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:16,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:37:16,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:37:20,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:37:21,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 08:37:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:37:23,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 08:37:24,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:26,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:37:26,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:28,348 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.835e+02 2.025e+02 2.355e+02 3.473e+02, threshold=4.050e+02, percent-clipped=0.0 2023-09-30 08:37:32,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:32,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:37:32,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:37:32,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:37:41,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:37:42,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:37:45,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:37:47,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:37:47,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:37:52,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:37:52,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 08:37:52,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:37:52,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:37:58,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:37:58,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:02,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:38:04,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:38:04,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:04,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:38:06,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 08:38:06,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:38:08,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:38:08,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 08:38:08,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:10,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:38:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:11,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 08:38:11,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:11,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:38:13,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:38:14,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 08:38:19,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:38:19,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:38:21,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:38:24,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:27,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:38:27,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:38:28,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:38:30,212 INFO [train.py:1039] (3/4) Epoch 19, batch 2850, loss[loss=0.187, simple_loss=0.2562, pruned_loss=0.05889, over 23643.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2526, pruned_loss=0.05134, over 4710466.63 frames. ], batch size: 256, lr: 5.42e-03, grad_scale: 32.0 2023-09-30 08:38:31,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:38:31,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:38:36,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:38:36,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 08:38:44,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 08:38:44,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:38:45,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 08:38:45,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:38:48,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 08:38:50,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 08:38:51,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:04,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:05,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:05,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:39:07,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 08:39:07,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:39:07,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:39:09,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:39:10,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 08:39:14,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:39:14,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:14,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:39:14,927 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.39 vs. limit=12.0 2023-09-30 08:39:16,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:19,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:19,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:39:20,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:22,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:39:22,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:39:24,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:24,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:26,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:39:30,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=656660.0, ans=0.1 2023-09-30 08:39:31,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:39:33,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 08:39:33,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 08:39:35,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:39:35,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:35,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 08:39:36,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:39:36,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=656726.6666666666, ans=0.2 2023-09-30 08:39:38,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:39:38,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:39,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:39:39,496 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 08:39:39,565 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 08:39:39,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:39:39,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:39:46,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:39:46,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:39:48,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:39:50,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 08:39:51,789 INFO [train.py:1039] (3/4) Epoch 19, batch 2900, loss[loss=0.1559, simple_loss=0.2283, pruned_loss=0.04179, over 24449.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2528, pruned_loss=0.05124, over 4715591.01 frames. ], batch size: 58, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:39:53,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:39:54,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 08:39:55,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 08:39:56,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:39:56,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:39:59,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:01,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:40:04,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:40:05,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:40:06,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=656860.0, ans=0.0 2023-09-30 08:40:08,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:40:08,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 08:40:10,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:40:11,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:13,067 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.848e+02 2.073e+02 2.444e+02 4.000e+02, threshold=4.146e+02, percent-clipped=0.0 2023-09-30 08:40:14,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 08:40:14,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=656860.0, ans=0.125 2023-09-30 08:40:15,568 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.18 vs. limit=15.0 2023-09-30 08:40:16,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 08:40:19,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:40:19,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 08:40:20,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:40:23,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:40:23,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 08:40:25,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=656926.6666666666, ans=0.2 2023-09-30 08:40:27,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:40:29,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:32,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:40:35,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:40:37,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 08:40:37,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 08:40:37,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:40:41,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:40:44,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 08:40:46,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:40:51,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:40:59,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:41:01,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 08:41:01,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 08:41:05,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:05,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 08:41:05,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:05,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:41:12,572 INFO [train.py:1039] (3/4) Epoch 19, batch 2950, loss[loss=0.1909, simple_loss=0.2729, pruned_loss=0.05451, over 24646.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.254, pruned_loss=0.05147, over 4717087.90 frames. ], batch size: 73, lr: 5.42e-03, grad_scale: 16.0 2023-09-30 08:41:12,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:41:14,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=657126.6666666666, ans=0.0 2023-09-30 08:41:15,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 08:41:16,595 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.77 vs. limit=15.0 2023-09-30 08:41:17,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:17,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:18,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:41:20,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:41:21,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 08:41:23,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 08:41:23,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:41:23,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:41:30,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:33,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:41:35,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:41:35,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:39,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:41:41,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:41:42,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:41:44,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:41:45,069 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.55 vs. limit=22.5 2023-09-30 08:41:46,095 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:41:48,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 08:41:52,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=657260.0, ans=0.1 2023-09-30 08:41:53,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 08:41:53,499 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 08:41:54,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:41:56,531 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 08:41:58,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 08:41:58,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:41:58,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:41:58,301 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 08:41:58,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:41:58,571 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=657260.0, ans=0.05 2023-09-30 08:42:01,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 08:42:03,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:42:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:42:05,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:05,988 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=657326.6666666666, ans=0.125 2023-09-30 08:42:07,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:42:07,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:07,192 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 08:42:07,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:42:07,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 08:42:15,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:17,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:18,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 08:42:18,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:42:20,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 08:42:22,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:23,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:42:25,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:42:26,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:42:26,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:42:28,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:42:29,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:29,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:42:29,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:42:31,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:42:31,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:42:32,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:34,277 INFO [train.py:1039] (3/4) Epoch 19, batch 3000, loss[loss=0.1589, simple_loss=0.235, pruned_loss=0.04133, over 24355.00 frames. ], tot_loss[loss=0.1792, simple_loss=0.2545, pruned_loss=0.0519, over 4710933.55 frames. ], batch size: 56, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:42:34,278 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 08:42:48,942 INFO [train.py:1071] (3/4) Epoch 19, validation: loss=0.3515, simple_loss=0.275, pruned_loss=0.214, over 1125622.00 frames. 2023-09-30 08:42:48,943 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 08:42:49,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 08:42:50,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:42:52,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:42:52,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:42:55,309 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 08:42:55,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 08:42:57,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:42:57,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:42:58,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 08:42:59,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:07,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:43:07,644 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 08:43:11,624 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.829e+02 2.117e+02 2.474e+02 3.888e+02, threshold=4.234e+02, percent-clipped=0.0 2023-09-30 08:43:16,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:43:25,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 08:43:25,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:43:27,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:43:27,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:43:28,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:43:30,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:30,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 08:43:33,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 08:43:35,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:43:35,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:43:37,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:43:37,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:39,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:39,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:43:43,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:43:43,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:43:43,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:43:45,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:43:46,185 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.94 vs. limit=12.0 2023-09-30 08:43:48,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 08:43:50,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:43:50,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:43:50,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:43:55,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:55,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:43:59,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 08:43:59,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 08:43:59,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:43:59,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 08:44:00,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:44:02,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 08:44:04,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:04,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:44:05,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 08:44:05,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 08:44:05,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:44:07,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:44:07,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:44:07,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:44:07,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:09,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:44:12,053 INFO [train.py:1039] (3/4) Epoch 19, batch 3050, loss[loss=0.1552, simple_loss=0.2314, pruned_loss=0.03951, over 24586.00 frames. ], tot_loss[loss=0.1815, simple_loss=0.2564, pruned_loss=0.05335, over 4695515.22 frames. ], batch size: 60, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:44:13,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 08:44:15,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:16,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:16,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:44:20,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:22,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 08:44:32,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 08:44:32,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 08:44:32,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:36,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:44:41,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:42,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:43,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:46,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:44:46,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:44:46,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:48,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:44:48,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:44:48,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:50,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:44:50,717 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.28 vs. limit=22.5 2023-09-30 08:44:53,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:44:53,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 08:44:53,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:44:53,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:44:58,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:44:58,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 08:44:58,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:00,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:00,703 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=657993.3333333334, ans=0.1 2023-09-30 08:45:06,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:45:06,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:16,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:16,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:45:16,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:45:19,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:19,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 08:45:19,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:45:21,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 08:45:22,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:45:22,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:24,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 08:45:25,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=658060.0, ans=0.0 2023-09-30 08:45:27,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:33,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:45:34,285 INFO [train.py:1039] (3/4) Epoch 19, batch 3100, loss[loss=0.1748, simple_loss=0.2641, pruned_loss=0.0427, over 24342.00 frames. ], tot_loss[loss=0.1805, simple_loss=0.2561, pruned_loss=0.05241, over 4712939.72 frames. ], batch size: 74, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:45:34,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:45:36,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 08:45:39,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 08:45:42,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 08:45:42,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 08:45:44,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:45:45,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=658126.6666666666, ans=0.125 2023-09-30 08:45:47,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:45:47,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:52,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:45:55,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:45:56,627 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.813e+02 2.094e+02 2.454e+02 3.292e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 08:45:57,038 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=658193.3333333334, ans=0.125 2023-09-30 08:46:01,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 08:46:05,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:46:05,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:07,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:07,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:46:09,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:46:10,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:46:10,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 08:46:10,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:46:12,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:12,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 08:46:14,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:46:14,530 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=658260.0, ans=0.0 2023-09-30 08:46:20,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:46:20,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 08:46:22,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 08:46:23,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:25,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:46:26,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:26,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:26,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:46:28,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:46:28,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:46:30,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:46:30,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:46:30,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:30,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 08:46:31,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=658326.6666666666, ans=0.125 2023-09-30 08:46:34,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:46:36,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 08:46:40,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:46:42,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 08:46:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:46:43,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:46:43,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 08:46:47,614 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=4.76 vs. limit=15.0 2023-09-30 08:46:53,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=658393.3333333334, ans=0.2 2023-09-30 08:46:54,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 08:46:56,192 INFO [train.py:1039] (3/4) Epoch 19, batch 3150, loss[loss=0.2026, simple_loss=0.2824, pruned_loss=0.06138, over 23949.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2554, pruned_loss=0.05183, over 4718282.73 frames. ], batch size: 86, lr: 5.41e-03, grad_scale: 16.0 2023-09-30 08:46:58,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:46:59,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:00,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=658460.0, ans=0.0 2023-09-30 08:47:01,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:47:01,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:47:01,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 08:47:03,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:03,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 08:47:04,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 08:47:06,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:10,813 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 08:47:12,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=658526.6666666666, ans=0.125 2023-09-30 08:47:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 08:47:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:47:15,486 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 08:47:15,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 08:47:18,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 08:47:18,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 08:47:18,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 08:47:19,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:19,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:20,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:47:22,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 08:47:23,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:47:25,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:26,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 08:47:29,235 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=19.53 vs. limit=22.5 2023-09-30 08:47:30,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 08:47:31,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:47:33,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 08:47:33,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:47:35,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 08:47:36,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 08:47:38,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:47:38,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 08:47:39,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 08:47:39,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:47:39,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:47:44,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 08:47:44,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 08:47:45,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 08:47:47,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 08:47:47,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:49,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:47:49,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:47:49,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 08:47:49,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:51,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 08:47:52,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:47:52,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 08:47:54,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 08:47:55,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:47:55,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:47:56,247 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=658660.0, ans=0.125 2023-09-30 08:47:57,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 08:48:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 08:48:00,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:48:03,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:48:04,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:06,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:48:08,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=658726.6666666666, ans=0.1 2023-09-30 08:48:11,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:48:11,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:15,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 08:48:20,489 INFO [train.py:1039] (3/4) Epoch 19, batch 3200, loss[loss=0.1606, simple_loss=0.2398, pruned_loss=0.04067, over 24588.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2535, pruned_loss=0.05124, over 4720716.84 frames. ], batch size: 60, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:48:20,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:48:20,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 08:48:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:24,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:48:24,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 08:48:27,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:48:32,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:48:35,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:48:43,545 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.911e+02 2.162e+02 2.460e+02 4.180e+02, threshold=4.324e+02, percent-clipped=0.0 2023-09-30 08:48:43,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:48:48,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=658860.0, ans=0.04949747468305833 2023-09-30 08:48:51,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=658860.0, ans=0.2 2023-09-30 08:48:56,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 08:48:57,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:49:00,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 08:49:02,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:49:04,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=658926.6666666666, ans=0.125 2023-09-30 08:49:05,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:49:05,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:49:06,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:49:07,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=658926.6666666666, ans=0.125 2023-09-30 08:49:11,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 08:49:13,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 08:49:14,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 08:49:15,880 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.59 vs. limit=10.0 2023-09-30 08:49:16,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 08:49:20,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:49:21,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=658993.3333333334, ans=0.2 2023-09-30 08:49:26,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:27,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 08:49:27,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:49:28,555 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 08:49:28,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 08:49:28,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=659060.0, ans=0.1 2023-09-30 08:49:33,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:49:35,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 08:49:36,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 08:49:36,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 08:49:38,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 08:49:41,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:49:41,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=659126.6666666666, ans=0.0 2023-09-30 08:49:42,870 INFO [train.py:1039] (3/4) Epoch 19, batch 3250, loss[loss=0.1833, simple_loss=0.2518, pruned_loss=0.05741, over 23588.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2534, pruned_loss=0.05152, over 4718831.22 frames. ], batch size: 256, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:49:43,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:49:44,486 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 08:49:44,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:49:44,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:49:46,104 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 08:49:50,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 08:49:54,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:04,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:04,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 08:50:04,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=659193.3333333334, ans=0.125 2023-09-30 08:50:05,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:05,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:50:05,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:07,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:07,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 08:50:09,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=659193.3333333334, ans=0.0 2023-09-30 08:50:10,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:10,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 08:50:11,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:12,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:12,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:15,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:17,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 08:50:18,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:18,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:50:20,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:50:20,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:50:20,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:25,609 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.87 vs. limit=22.5 2023-09-30 08:50:27,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 08:50:27,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:50:27,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:50:28,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:30,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 08:50:37,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:50:45,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:50:47,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:47,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 08:50:47,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:50:47,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:50:47,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:50:48,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 08:50:49,692 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.48 vs. limit=10.0 2023-09-30 08:50:50,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 08:50:50,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:50:52,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:50:53,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:53,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 08:50:53,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:50:58,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:50:58,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:01,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 08:51:01,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:03,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 08:51:03,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 08:51:05,726 INFO [train.py:1039] (3/4) Epoch 19, batch 3300, loss[loss=0.1485, simple_loss=0.2241, pruned_loss=0.03647, over 24329.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2539, pruned_loss=0.05192, over 4708992.00 frames. ], batch size: 56, lr: 5.41e-03, grad_scale: 32.0 2023-09-30 08:51:07,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:51:07,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 08:51:08,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 08:51:10,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 08:51:10,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:16,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:51:18,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:51:18,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=659460.0, ans=0.2 2023-09-30 08:51:19,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:19,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 08:51:22,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 08:51:24,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:25,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:51:25,478 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=659526.6666666666, ans=0.125 2023-09-30 08:51:25,502 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=659526.6666666666, ans=0.0 2023-09-30 08:51:28,245 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.450e+02 1.763e+02 1.973e+02 2.210e+02 4.562e+02, threshold=3.946e+02, percent-clipped=1.0 2023-09-30 08:51:29,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 08:51:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:51:30,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:51:32,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:32,917 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 08:51:33,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=659526.6666666666, ans=0.0 2023-09-30 08:51:35,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:51:37,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:51:37,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 08:51:37,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:51:38,735 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 08:51:39,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=659593.3333333334, ans=0.2 2023-09-30 08:51:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:51:43,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 08:51:44,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:44,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 08:51:46,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 08:51:46,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:51:47,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:51:50,094 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 08:51:51,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 08:51:51,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:51:54,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 08:51:54,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:51:59,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 08:52:01,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:02,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:02,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:02,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:52:02,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:52:06,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:52:06,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:06,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:52:07,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 08:52:09,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 08:52:11,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 08:52:11,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:11,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:13,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:52:13,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:15,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:52:16,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:16,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 08:52:16,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:52:19,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 08:52:23,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 08:52:23,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:25,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:26,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 08:52:28,517 INFO [train.py:1039] (3/4) Epoch 19, batch 3350, loss[loss=0.1904, simple_loss=0.2757, pruned_loss=0.05259, over 24004.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2549, pruned_loss=0.05217, over 4704968.62 frames. ], batch size: 80, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:52:28,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:52:28,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:30,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:52:30,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:30,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=659793.3333333334, ans=0.1 2023-09-30 08:52:33,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:52:33,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:34,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:52:36,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:38,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:52:39,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:41,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:52:43,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 08:52:44,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module1.whiten, num_groups=1, num_channels=192, metric=10.18 vs. limit=15.0 2023-09-30 08:52:44,936 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 08:52:46,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:52:48,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 08:52:48,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 08:52:50,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 08:52:50,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:52:51,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:52:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 08:52:54,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:54,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:52:57,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:52:58,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:52:58,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:00,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:53:03,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:06,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:07,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:10,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=659926.6666666666, ans=0.035 2023-09-30 08:53:11,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:53:12,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:53:14,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:14,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=659926.6666666666, ans=0.5 2023-09-30 08:53:16,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:17,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=659993.3333333334, ans=0.0 2023-09-30 08:53:18,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:21,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 08:53:21,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 08:53:21,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 08:53:21,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:53:23,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 08:53:24,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:26,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=659993.3333333334, ans=0.0 2023-09-30 08:53:28,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:53:28,584 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=659993.3333333334, ans=0.1 2023-09-30 08:53:34,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:34,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 08:53:35,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:53:37,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:53:39,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:53:44,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:53:46,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 08:53:46,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 08:53:47,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:53:49,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:53:49,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 08:53:49,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:53:49,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 08:53:51,466 INFO [train.py:1039] (3/4) Epoch 19, batch 3400, loss[loss=0.2376, simple_loss=0.2974, pruned_loss=0.08886, over 19617.00 frames. ], tot_loss[loss=0.1814, simple_loss=0.2566, pruned_loss=0.05315, over 4698197.48 frames. ], batch size: 390, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:53:51,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:51,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:53:53,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 08:53:54,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:53:54,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 08:53:59,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=660126.6666666666, ans=0.2 2023-09-30 08:54:01,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 08:54:01,300 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 08:54:01,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:06,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:54:06,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 08:54:06,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:07,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 08:54:11,682 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=22.5 2023-09-30 08:54:14,446 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.902e+02 2.101e+02 2.445e+02 3.700e+02, threshold=4.201e+02, percent-clipped=0.0 2023-09-30 08:54:14,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:16,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 08:54:20,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:54:23,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:23,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:23,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 08:54:31,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:54:36,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 08:54:42,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:42,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:54:43,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 08:54:44,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:54:45,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:54:46,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:54:48,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:54:50,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:54:53,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 08:54:53,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:54:57,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=660393.3333333334, ans=0.0 2023-09-30 08:55:00,542 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:03,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 08:55:08,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:55:11,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 08:55:13,172 INFO [train.py:1039] (3/4) Epoch 19, batch 3450, loss[loss=0.1855, simple_loss=0.2586, pruned_loss=0.05615, over 23714.00 frames. ], tot_loss[loss=0.1807, simple_loss=0.2558, pruned_loss=0.05278, over 4715264.33 frames. ], batch size: 149, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:55:16,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 08:55:17,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=660460.0, ans=0.0 2023-09-30 08:55:18,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:55:19,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 08:55:19,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 08:55:21,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:55:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 08:55:29,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:55:31,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:31,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 08:55:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:35,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:55:41,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 08:55:48,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 08:55:48,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 08:55:48,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 08:55:51,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:55:56,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 08:55:57,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:56:01,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:01,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:56:02,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 08:56:04,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:56:06,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 08:56:06,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:08,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:56:11,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:11,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=660660.0, ans=0.1 2023-09-30 08:56:14,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 08:56:18,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 08:56:22,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 08:56:24,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:27,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:32,199 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.95 vs. limit=15.0 2023-09-30 08:56:32,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:56:32,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 08:56:34,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:56:34,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:56:35,714 INFO [train.py:1039] (3/4) Epoch 19, batch 3500, loss[loss=0.1606, simple_loss=0.2369, pruned_loss=0.04215, over 21985.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2544, pruned_loss=0.0525, over 4711546.50 frames. ], batch size: 48, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:56:38,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:56:42,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 08:56:45,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 08:56:48,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 08:56:50,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=660860.0, ans=0.0 2023-09-30 08:56:51,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:56:52,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=660860.0, ans=0.2 2023-09-30 08:56:53,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 08:56:57,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 08:56:58,822 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.559e+02 1.797e+02 1.956e+02 2.209e+02 3.007e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 08:56:59,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:56:59,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 08:57:01,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:01,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 08:57:01,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:01,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 08:57:01,621 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=660860.0, ans=0.07 2023-09-30 08:57:04,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:05,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 08:57:07,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:12,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:12,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 08:57:12,509 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=660926.6666666666, ans=0.2 2023-09-30 08:57:13,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 08:57:15,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:57:18,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 08:57:20,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:21,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 08:57:21,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:23,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 08:57:23,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 08:57:25,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 08:57:25,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:57:26,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:28,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:28,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 08:57:31,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 08:57:33,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 08:57:37,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:57:39,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 08:57:40,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 08:57:40,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:57:42,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:42,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:45,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:45,914 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=661060.0, ans=0.1 2023-09-30 08:57:45,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=661060.0, ans=0.125 2023-09-30 08:57:49,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 08:57:50,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 08:57:51,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=661060.0, ans=0.125 2023-09-30 08:57:52,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:57:53,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 08:57:56,634 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 08:57:56,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:57:58,213 INFO [train.py:1039] (3/4) Epoch 19, batch 3550, loss[loss=0.1547, simple_loss=0.2323, pruned_loss=0.03858, over 24438.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2528, pruned_loss=0.05169, over 4698468.92 frames. ], batch size: 58, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:57:58,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:57:58,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:00,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:03,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 08:58:09,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=661126.6666666666, ans=0.04949747468305833 2023-09-30 08:58:14,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:15,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 08:58:17,684 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=661193.3333333334, ans=0.0 2023-09-30 08:58:20,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:20,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 08:58:22,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:22,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:58:23,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 08:58:25,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:27,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:58:27,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:27,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 08:58:28,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 08:58:34,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 08:58:34,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 08:58:36,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:36,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:58:38,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 08:58:38,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 08:58:38,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:40,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:58:41,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 08:58:48,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:49,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 08:58:49,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:58:50,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 08:58:52,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 08:58:52,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 08:58:52,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 08:58:56,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 08:58:56,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 08:59:00,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 08:59:00,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:02,811 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=7.63 vs. limit=15.0 2023-09-30 08:59:06,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:06,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 08:59:07,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:11,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 08:59:12,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 08:59:18,280 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.69 vs. limit=15.0 2023-09-30 08:59:21,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 08:59:21,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 08:59:21,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 08:59:22,680 INFO [train.py:1039] (3/4) Epoch 19, batch 3600, loss[loss=0.1796, simple_loss=0.2514, pruned_loss=0.05392, over 23494.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2523, pruned_loss=0.05135, over 4707486.52 frames. ], batch size: 134, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 08:59:22,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 08:59:24,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 08:59:29,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:31,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:31,600 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.60 vs. limit=22.5 2023-09-30 08:59:32,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 08:59:32,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 08:59:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:34,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 08:59:35,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 08:59:37,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:41,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:43,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:45,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 08:59:45,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 08:59:45,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 08:59:46,962 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.820e+02 2.003e+02 2.240e+02 3.370e+02, threshold=4.007e+02, percent-clipped=0.0 2023-09-30 08:59:47,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 08:59:50,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 08:59:50,314 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 08:59:52,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=661526.6666666666, ans=0.1 2023-09-30 08:59:53,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 08:59:55,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 08:59:56,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 08:59:57,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 09:00:04,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=661593.3333333334, ans=0.125 2023-09-30 09:00:05,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:06,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:00:06,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 09:00:12,003 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.76 vs. limit=22.5 2023-09-30 09:00:12,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:00:19,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:20,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=661660.0, ans=0.2 2023-09-30 09:00:22,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:24,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=661660.0, ans=0.1 2023-09-30 09:00:28,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:00:28,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:00:28,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 09:00:31,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 09:00:32,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 09:00:33,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=661726.6666666666, ans=0.125 2023-09-30 09:00:34,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:00:34,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:00:36,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 09:00:37,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:00:38,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:00:38,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:00:39,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 09:00:39,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 09:00:41,492 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=661726.6666666666, ans=0.125 2023-09-30 09:00:44,169 INFO [train.py:1039] (3/4) Epoch 19, batch 3650, loss[loss=0.1741, simple_loss=0.2489, pruned_loss=0.04959, over 23656.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2527, pruned_loss=0.05118, over 4718948.66 frames. ], batch size: 256, lr: 5.40e-03, grad_scale: 32.0 2023-09-30 09:00:44,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:00:44,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=661793.3333333334, ans=0.05 2023-09-30 09:00:45,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 09:00:49,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 09:00:52,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:00:55,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=661793.3333333334, ans=0.2 2023-09-30 09:00:57,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 09:00:59,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 09:01:02,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:02,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:01:03,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:01:05,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:01:05,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:01:07,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 09:01:07,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:01:07,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:09,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 09:01:11,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:01:12,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:12,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:14,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:01:16,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=661926.6666666666, ans=0.0 2023-09-30 09:01:17,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 09:01:19,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 09:01:20,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:01:22,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 09:01:23,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:24,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=661926.6666666666, ans=0.1 2023-09-30 09:01:25,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:01:29,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:01:32,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:32,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:01:34,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:01:34,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:01:35,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.32 vs. limit=10.0 2023-09-30 09:01:35,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:01:40,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:01:42,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:01:42,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:01:44,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:01:46,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:01:46,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:51,001 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 09:01:55,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:01:55,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:01:56,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:01:58,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:00,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:02:00,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:03,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 09:02:03,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:06,523 INFO [train.py:1039] (3/4) Epoch 19, batch 3700, loss[loss=0.1966, simple_loss=0.261, pruned_loss=0.06611, over 22677.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2536, pruned_loss=0.0517, over 4720233.87 frames. ], batch size: 322, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:02:06,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:02:10,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:02:10,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:02:13,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:13,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 09:02:13,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:02:14,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:02:14,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:02:17,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:02:22,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:02:22,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:23,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:02:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:02:25,154 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:02:26,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=662193.3333333334, ans=0.0 2023-09-30 09:02:28,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:02:28,372 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 09:02:31,272 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.890e+02 2.038e+02 2.335e+02 3.154e+02, threshold=4.075e+02, percent-clipped=0.0 2023-09-30 09:02:38,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:02:38,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:02:39,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:02:39,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 09:02:39,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:43,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:45,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 09:02:46,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:48,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:02:48,693 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.27 vs. limit=12.0 2023-09-30 09:02:49,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=5.22 vs. limit=10.0 2023-09-30 09:02:51,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:02:51,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:02:55,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:02:58,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:02:58,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 09:03:00,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:00,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 09:03:04,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:03:06,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:03:09,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:10,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 09:03:11,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:03:11,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:03:11,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:13,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:18,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:03:19,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 09:03:21,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 09:03:22,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:03:22,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:24,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:03:24,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=662393.3333333334, ans=0.0 2023-09-30 09:03:25,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:03:29,195 INFO [train.py:1039] (3/4) Epoch 19, batch 3750, loss[loss=0.1876, simple_loss=0.2529, pruned_loss=0.06117, over 23559.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2541, pruned_loss=0.05166, over 4729684.70 frames. ], batch size: 256, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:03:29,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:03:31,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:03:32,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:03:32,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 09:03:34,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:03:34,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=662460.0, ans=0.5 2023-09-30 09:03:36,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:03:37,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 09:03:39,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:03:40,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:40,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:03:42,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:03:47,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:03:50,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:03:52,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:03:55,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:03:58,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:03:58,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 09:04:00,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:01,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:01,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:04:05,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 09:04:09,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 09:04:10,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:04:11,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:04:11,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=662593.3333333334, ans=0.125 2023-09-30 09:04:13,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:04:16,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:18,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:04:20,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 09:04:25,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:04:28,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:04:28,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:04:29,111 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=662660.0, ans=0.125 2023-09-30 09:04:33,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:04:37,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 09:04:39,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:04:40,103 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.77 vs. limit=15.0 2023-09-30 09:04:42,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:04:42,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:04:45,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:04:51,926 INFO [train.py:1039] (3/4) Epoch 19, batch 3800, loss[loss=0.1976, simple_loss=0.2548, pruned_loss=0.07019, over 19747.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2536, pruned_loss=0.05152, over 4729029.53 frames. ], batch size: 389, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:04:55,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:04:55,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=662793.3333333334, ans=0.0 2023-09-30 09:05:01,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:01,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:05:03,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 09:05:03,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=662793.3333333334, ans=0.1 2023-09-30 09:05:04,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:04,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:05,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=662793.3333333334, ans=0.025 2023-09-30 09:05:06,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:05:08,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:05:08,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:10,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:05:11,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:05:13,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:05:13,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:15,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 09:05:15,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=662860.0, ans=0.125 2023-09-30 09:05:18,089 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.822e+02 1.936e+02 2.155e+02 2.834e+02, threshold=3.873e+02, percent-clipped=0.0 2023-09-30 09:05:19,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:05:21,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:05:23,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:05:24,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:05:26,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:05:26,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=662926.6666666666, ans=0.125 2023-09-30 09:05:28,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:05:28,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:30,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:05:31,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:05:36,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:05:36,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 09:05:39,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:05:41,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=662993.3333333334, ans=0.125 2023-09-30 09:05:46,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:05:53,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:05:56,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 09:05:57,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 09:05:59,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:00,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:06:00,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 09:06:07,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 09:06:07,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 09:06:07,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:09,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:06:13,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:06:14,483 INFO [train.py:1039] (3/4) Epoch 19, batch 3850, loss[loss=0.1596, simple_loss=0.2375, pruned_loss=0.04084, over 24592.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2523, pruned_loss=0.05126, over 4717283.72 frames. ], batch size: 60, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:06:14,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:06:19,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:06:21,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 09:06:21,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:06:23,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:26,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:06:28,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:06:32,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 09:06:39,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:41,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:06:43,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:44,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:06:46,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:47,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:06:50,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:06:50,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:06:50,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:06:53,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:54,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:06:54,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 09:06:54,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 09:06:56,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:06:56,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:06:59,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:01,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:02,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 09:07:05,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 09:07:07,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:07,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=663326.6666666666, ans=0.05 2023-09-30 09:07:08,102 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.87 vs. limit=15.0 2023-09-30 09:07:08,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 09:07:12,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 09:07:15,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:17,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:07:22,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:22,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 09:07:26,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 09:07:26,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=663393.3333333334, ans=0.125 2023-09-30 09:07:27,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:29,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:31,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:07:31,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:07:31,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:33,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:07:33,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 09:07:34,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:07:36,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 09:07:36,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:36,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:37,755 INFO [train.py:1039] (3/4) Epoch 19, batch 3900, loss[loss=0.1789, simple_loss=0.2567, pruned_loss=0.05059, over 23333.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2512, pruned_loss=0.05108, over 4704468.16 frames. ], batch size: 93, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:07:37,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:07:39,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:40,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:07:42,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:07:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:07:43,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:07:43,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 09:07:43,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:47,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:49,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:51,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:07:52,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:07:55,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:07:55,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:07:57,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:07:59,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 09:07:59,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:02,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 09:08:02,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:08:02,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 09:08:03,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.880e+02 2.094e+02 2.304e+02 3.533e+02, threshold=4.187e+02, percent-clipped=0.0 2023-09-30 09:08:04,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 09:08:10,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:11,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:08:12,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:08:12,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:12,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=663593.3333333334, ans=0.0 2023-09-30 09:08:17,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:08:18,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:08:22,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:08:22,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:22,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=663593.3333333334, ans=0.125 2023-09-30 09:08:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:08:29,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:08:29,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:08:31,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=11.82 vs. limit=15.0 2023-09-30 09:08:36,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:08:37,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:08:49,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:08:53,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:53,740 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 09:08:53,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 09:08:53,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:08:55,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 09:08:57,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:08:58,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 09:09:00,564 INFO [train.py:1039] (3/4) Epoch 19, batch 3950, loss[loss=0.1898, simple_loss=0.2575, pruned_loss=0.06105, over 23442.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2516, pruned_loss=0.05123, over 4705030.37 frames. ], batch size: 285, lr: 5.39e-03, grad_scale: 16.0 2023-09-30 09:09:01,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=663793.3333333334, ans=0.2 2023-09-30 09:09:04,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:09:06,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 09:09:06,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:09:08,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:09:09,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:09:13,889 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.11 vs. limit=15.0 2023-09-30 09:09:15,966 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 09:09:17,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:18,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 09:09:19,441 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 09:09:19,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:09:21,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:22,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:09:22,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:09:24,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 09:09:27,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:09:27,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:09:27,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:09:29,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:09:29,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:09:43,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:09:43,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:09:45,581 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=663926.6666666666, ans=0.2 2023-09-30 09:09:51,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 09:09:56,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=12.45 vs. limit=15.0 2023-09-30 09:09:58,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 09:09:58,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 09:09:58,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:00,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:10:00,752 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.93 vs. limit=22.5 2023-09-30 09:10:02,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=663993.3333333334, ans=0.0 2023-09-30 09:10:03,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=663993.3333333334, ans=0.125 2023-09-30 09:10:06,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:10:06,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:10:08,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:08,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:10:08,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 09:10:13,397 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=664060.0, ans=0.1 2023-09-30 09:10:14,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:10:15,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.min_positive, batch_count=664060.0, ans=0.025 2023-09-30 09:10:16,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:10:19,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 09:10:23,291 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=664126.6666666666, ans=0.1 2023-09-30 09:10:24,391 INFO [train.py:1039] (3/4) Epoch 19, batch 4000, loss[loss=0.1637, simple_loss=0.2446, pruned_loss=0.04139, over 24492.00 frames. ], tot_loss[loss=0.1782, simple_loss=0.2527, pruned_loss=0.05186, over 4703585.80 frames. ], batch size: 66, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:10:24,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=664126.6666666666, ans=0.125 2023-09-30 09:10:26,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.24 vs. limit=15.0 2023-09-30 09:10:28,230 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.32 vs. limit=15.0 2023-09-30 09:10:31,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:42,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:42,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:10:44,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:10:44,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 09:10:45,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:10:45,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 09:10:45,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:10:45,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 09:10:48,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:10:51,346 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.607e+02 1.842e+02 2.148e+02 2.341e+02 3.331e+02, threshold=4.296e+02, percent-clipped=0.0 2023-09-30 09:10:53,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:10:53,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:10:53,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:10:53,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:10:53,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:10:56,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:10:57,688 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 09:10:57,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:10:59,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:02,391 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 09:11:03,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:11:03,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:09,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 09:11:09,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:11:12,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:11:14,363 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 09:11:15,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:11:17,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 09:11:17,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:11:18,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:18,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:11:20,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:11:20,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:11:20,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:11:24,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 09:11:24,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:11:24,961 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:11:27,583 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 09:11:29,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=664393.3333333334, ans=0.0 2023-09-30 09:11:30,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:11:34,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 09:11:37,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:11:37,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:38,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:11:40,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:11:45,276 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:11:46,664 INFO [train.py:1039] (3/4) Epoch 19, batch 4050, loss[loss=0.1699, simple_loss=0.2509, pruned_loss=0.04448, over 24355.00 frames. ], tot_loss[loss=0.1797, simple_loss=0.2544, pruned_loss=0.05251, over 4697552.98 frames. ], batch size: 61, lr: 5.39e-03, grad_scale: 32.0 2023-09-30 09:11:48,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:11:48,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=664460.0, ans=0.0 2023-09-30 09:11:50,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 09:11:51,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:11:51,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:11:52,346 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=664460.0, ans=0.125 2023-09-30 09:11:53,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:11:53,714 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=664460.0, ans=0.0 2023-09-30 09:11:55,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:11:55,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=664460.0, ans=0.125 2023-09-30 09:11:56,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:02,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:12:03,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:05,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 09:12:06,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:12:06,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:12:12,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:14,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:12:17,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 09:12:17,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=664593.3333333334, ans=0.0 2023-09-30 09:12:19,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 09:12:19,400 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 09:12:22,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:12:27,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 09:12:29,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:12:32,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:38,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:12:38,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:12:38,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:12:42,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:12:45,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 09:12:47,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:12:47,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:12:47,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=664660.0, ans=0.0 2023-09-30 09:12:49,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 09:12:55,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:12:56,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass_mid.scale_min, batch_count=664726.6666666666, ans=0.2 2023-09-30 09:13:03,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 09:13:05,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:05,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:13:07,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 09:13:07,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 09:13:07,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:09,192 INFO [train.py:1039] (3/4) Epoch 19, batch 4100, loss[loss=0.1735, simple_loss=0.2442, pruned_loss=0.05141, over 23476.00 frames. ], tot_loss[loss=0.1801, simple_loss=0.2553, pruned_loss=0.05249, over 4707073.57 frames. ], batch size: 134, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:13:10,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:10,454 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=664793.3333333334, ans=0.125 2023-09-30 09:13:11,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:11,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:13:15,402 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.33 vs. limit=6.0 2023-09-30 09:13:16,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 09:13:16,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 09:13:19,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 09:13:20,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 09:13:20,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:21,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:22,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:13:22,547 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 09:13:27,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:29,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:13:29,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:13:29,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:13:33,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:13:34,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:13:34,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:13:34,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 09:13:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:36,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:13:36,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:36,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:13:38,156 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.904e+02 2.164e+02 2.650e+02 3.755e+02, threshold=4.328e+02, percent-clipped=0.0 2023-09-30 09:13:38,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 09:13:41,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:13:43,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 09:13:45,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:13:48,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:13:48,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 09:13:48,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:13:50,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:13:50,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:13:51,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 09:13:53,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:13:54,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:13:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 09:13:57,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:13:57,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:01,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:07,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:11,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:14:18,118 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=9.32 vs. limit=12.0 2023-09-30 09:14:19,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:19,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:14:22,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:14:23,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:14:28,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:14:29,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:14:31,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:14:31,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:33,073 INFO [train.py:1039] (3/4) Epoch 19, batch 4150, loss[loss=0.1758, simple_loss=0.2647, pruned_loss=0.04349, over 24645.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2553, pruned_loss=0.05237, over 4707491.75 frames. ], batch size: 68, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:14:34,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 09:14:34,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:35,668 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.94 vs. limit=22.5 2023-09-30 09:14:36,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 09:14:37,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 09:14:37,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 09:14:38,210 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:14:39,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:14:43,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=8.43 vs. limit=15.0 2023-09-30 09:14:45,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:14:45,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:14:46,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=665126.6666666666, ans=0.2 2023-09-30 09:14:50,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:14:51,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=665193.3333333334, ans=0.125 2023-09-30 09:14:52,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:14:52,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:14:54,650 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.22 vs. limit=22.5 2023-09-30 09:14:55,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:14:55,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:14:56,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:15:02,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:08,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:08,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 09:15:08,676 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=665260.0, ans=0.1 2023-09-30 09:15:11,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 09:15:11,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:15:12,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 09:15:12,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:15:12,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:15,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:15,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:22,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 09:15:26,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:28,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:15:28,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 09:15:29,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:15:31,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 09:15:34,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:15:34,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:15:35,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:37,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 09:15:37,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:15:38,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 09:15:39,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:15:41,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 09:15:42,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:42,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:15:42,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:15:44,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 09:15:44,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:15:45,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 09:15:45,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:15:48,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:15:50,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 09:15:50,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:15:55,465 INFO [train.py:1039] (3/4) Epoch 19, batch 4200, loss[loss=0.167, simple_loss=0.24, pruned_loss=0.04694, over 17910.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2542, pruned_loss=0.05183, over 4707440.92 frames. ], batch size: 38, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:15:55,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:15:55,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 09:15:58,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:15:58,925 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=665460.0, ans=0.2 2023-09-30 09:16:00,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:02,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:16:02,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:02,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:16:05,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 09:16:08,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 09:16:08,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:11,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:15,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:16:18,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:16:19,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:20,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:20,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer1.prob, batch_count=665526.6666666666, ans=0.125 2023-09-30 09:16:21,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 09:16:21,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:16:22,942 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.443e+02 1.959e+02 2.201e+02 2.604e+02 4.093e+02, threshold=4.401e+02, percent-clipped=0.0 2023-09-30 09:16:23,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:23,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:16:23,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:16:23,594 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=665526.6666666666, ans=0.125 2023-09-30 09:16:25,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:16:27,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 09:16:27,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:16:27,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=665593.3333333334, ans=0.0 2023-09-30 09:16:32,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:16:33,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:16:37,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:16:38,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:16:40,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:16:40,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 09:16:40,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:16:42,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:16:48,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:16:49,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:16:55,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:16:58,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 09:17:01,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:06,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:17:08,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:09,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=665726.6666666666, ans=0.125 2023-09-30 09:17:10,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 09:17:16,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:17:17,689 INFO [train.py:1039] (3/4) Epoch 19, batch 4250, loss[loss=0.172, simple_loss=0.2468, pruned_loss=0.04861, over 23259.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2521, pruned_loss=0.05122, over 4694426.98 frames. ], batch size: 105, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:17:19,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:17:19,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:17:19,775 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:17:22,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:27,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:17:29,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 09:17:29,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:17:32,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:17:36,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:40,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:40,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:43,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:17:43,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:17:46,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:48,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:49,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:17:51,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:17:52,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:17:54,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 09:17:58,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 09:17:58,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:17:58,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:17:59,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:18:00,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:18:00,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:01,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:18:05,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:18:07,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:18:11,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:11,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=665993.3333333334, ans=0.2 2023-09-30 09:18:13,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:13,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 09:18:13,739 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=2.05 vs. limit=12.0 2023-09-30 09:18:14,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:18:14,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 09:18:16,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:18:17,178 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=665993.3333333334, ans=0.125 2023-09-30 09:18:19,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:18:21,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:21,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:18:22,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 09:18:24,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:18:24,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:18:24,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=666060.0, ans=0.2 2023-09-30 09:18:27,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:18:30,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:31,294 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:18:33,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:18:35,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:18:35,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:37,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:18:38,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:18:38,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 09:18:41,244 INFO [train.py:1039] (3/4) Epoch 19, batch 4300, loss[loss=0.1864, simple_loss=0.2571, pruned_loss=0.05785, over 23252.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2516, pruned_loss=0.05058, over 4689799.80 frames. ], batch size: 93, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:18:41,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:44,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:18:45,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=666126.6666666666, ans=0.05 2023-09-30 09:18:46,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:18:51,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:18:52,589 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.10 vs. limit=15.0 2023-09-30 09:18:56,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=666193.3333333334, ans=0.1 2023-09-30 09:18:58,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:18:58,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 09:18:59,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:19:01,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:19:01,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:19:01,399 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 09:19:04,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:19:04,723 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=666193.3333333334, ans=0.125 2023-09-30 09:19:07,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:08,730 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.820e+02 2.101e+02 2.491e+02 4.654e+02, threshold=4.202e+02, percent-clipped=1.0 2023-09-30 09:19:09,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 09:19:09,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:19:09,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 09:19:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:19:14,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:19:17,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:19:18,098 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=666260.0, ans=0.0 2023-09-30 09:19:19,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:19:19,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:19:20,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:22,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:19:24,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 09:19:24,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 09:19:27,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:19:30,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:19:30,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:30,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:19:30,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=666326.6666666666, ans=0.0 2023-09-30 09:19:31,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 09:19:31,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 09:19:31,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 09:19:31,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:19:32,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 09:19:32,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 09:19:32,996 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.86 vs. limit=10.0 2023-09-30 09:19:36,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:38,165 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 09:19:40,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:19:42,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:42,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:19:46,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 09:19:46,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:19:46,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:47,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:19:47,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:19:49,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:19:51,668 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=666393.3333333334, ans=0.125 2023-09-30 09:19:52,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:19:55,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:19:57,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:19:57,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:20:02,593 INFO [train.py:1039] (3/4) Epoch 19, batch 4350, loss[loss=0.1555, simple_loss=0.2293, pruned_loss=0.04089, over 24324.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2527, pruned_loss=0.05092, over 4708762.76 frames. ], batch size: 56, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:20:02,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 09:20:02,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:20:02,980 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=666460.0, ans=0.0 2023-09-30 09:20:08,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=666460.0, ans=0.0 2023-09-30 09:20:10,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:13,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:16,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:20:16,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:20:20,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:20:21,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=666526.6666666666, ans=0.1 2023-09-30 09:20:24,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:20:27,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:20:27,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:20:30,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:20:33,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:20:35,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:20:39,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 09:20:41,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:20:42,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:47,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:20:50,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 09:20:55,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:20:57,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:21:02,697 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 09:21:02,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=666660.0, ans=0.0 2023-09-30 09:21:04,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:04,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:21:05,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=666660.0, ans=0.2 2023-09-30 09:21:06,167 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 09:21:06,296 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 09:21:07,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:07,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:09,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:21:09,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:10,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:21:10,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:21:13,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 09:21:13,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:13,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:13,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:14,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 09:21:16,350 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 09:21:16,358 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 09:21:16,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 09:21:20,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:21:20,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:21:22,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:23,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:21:23,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 09:21:25,467 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 09:21:25,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:26,833 INFO [train.py:1039] (3/4) Epoch 19, batch 4400, loss[loss=0.1785, simple_loss=0.2631, pruned_loss=0.04697, over 24557.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2533, pruned_loss=0.05147, over 4715217.53 frames. ], batch size: 71, lr: 5.38e-03, grad_scale: 16.0 2023-09-30 09:21:29,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:30,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:31,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:21:35,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 09:21:35,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 09:21:37,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 09:21:37,052 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 09:21:37,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:21:37,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:21:40,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 09:21:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:21:45,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:45,248 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 09:21:49,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:21:49,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 09:21:49,749 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 09:21:53,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=666860.0, ans=0.09899494936611666 2023-09-30 09:21:54,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 09:21:54,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 09:21:54,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 09:21:54,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:21:55,657 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.605e+02 1.851e+02 2.031e+02 2.280e+02 3.356e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 09:21:55,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:21:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:00,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 09:22:00,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 09:22:00,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:02,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:22:02,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:03,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:05,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:22:05,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 09:22:06,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 09:22:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:16,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:22:17,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 09:22:18,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=666993.3333333334, ans=0.2 2023-09-30 09:22:23,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:22:24,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:29,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:22:29,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 09:22:29,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:22:29,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:22:29,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:22:30,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:22:35,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 09:22:37,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 09:22:37,847 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=6.89 vs. limit=10.0 2023-09-30 09:22:38,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 09:22:38,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:22:38,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 09:22:40,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:22:45,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:22:48,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 09:22:49,474 INFO [train.py:1039] (3/4) Epoch 19, batch 4450, loss[loss=0.182, simple_loss=0.2523, pruned_loss=0.05588, over 23379.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2536, pruned_loss=0.05212, over 4709799.96 frames. ], batch size: 285, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:22:51,221 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.56 vs. limit=15.0 2023-09-30 09:22:51,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:22:53,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:22:53,595 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:22:54,240 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.65 vs. limit=22.5 2023-09-30 09:22:59,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=667126.6666666666, ans=0.125 2023-09-30 09:23:02,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:02,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:23:05,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:07,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:23:10,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:23:10,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:11,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 09:23:11,596 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:11,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:11,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:11,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:23:15,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:23:17,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.47 vs. limit=22.5 2023-09-30 09:23:21,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:22,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:24,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:23:24,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:23:26,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:23:28,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=667260.0, ans=0.0 2023-09-30 09:23:30,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:23:31,000 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 09:23:31,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 09:23:31,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:23:34,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:34,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=667260.0, ans=0.125 2023-09-30 09:23:35,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 09:23:35,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=667260.0, ans=0.125 2023-09-30 09:23:40,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:23:44,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 09:23:44,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:23:44,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:23:44,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:23:46,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:23:48,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:23:52,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:23:54,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 09:23:55,049 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=13.31 vs. limit=15.0 2023-09-30 09:23:55,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:23:59,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:23:59,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:24:01,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:01,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:24:02,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:24:04,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=667393.3333333334, ans=0.1 2023-09-30 09:24:06,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 09:24:08,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:24:10,386 INFO [train.py:1039] (3/4) Epoch 19, batch 4500, loss[loss=0.1795, simple_loss=0.2507, pruned_loss=0.05413, over 23662.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2542, pruned_loss=0.05244, over 4700914.07 frames. ], batch size: 149, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:24:12,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:13,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 09:24:13,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 09:24:16,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:21,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:24:23,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:24:23,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:24:24,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=6.55 vs. limit=15.0 2023-09-30 09:24:25,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:24:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:25,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:24:39,995 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.619e+02 1.864e+02 2.108e+02 2.361e+02 3.088e+02, threshold=4.216e+02, percent-clipped=0.0 2023-09-30 09:24:40,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:24:41,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:24:43,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:24:43,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:24:44,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:24:48,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=667593.3333333334, ans=0.125 2023-09-30 09:24:51,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:24:54,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:24:58,443 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=667660.0, ans=0.2 2023-09-30 09:24:59,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:25:01,431 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:25:02,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:25:02,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 09:25:04,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:05,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:08,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:25:11,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:25:11,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 09:25:11,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:25:11,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:15,167 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.22 vs. limit=15.0 2023-09-30 09:25:17,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:25:17,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:25:20,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:22,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:25:22,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:25:23,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 09:25:26,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 09:25:26,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 09:25:31,122 INFO [train.py:1039] (3/4) Epoch 19, batch 4550, loss[loss=0.1805, simple_loss=0.2498, pruned_loss=0.0556, over 23450.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2541, pruned_loss=0.05244, over 4704917.06 frames. ], batch size: 106, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:25:31,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 09:25:33,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 09:25:35,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:38,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:40,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:25:42,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:45,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:25:47,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:25:48,788 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:25:48,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:25:48,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:25:51,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:25:51,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:25:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:25:56,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 09:25:58,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 09:25:58,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:25:59,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 09:26:05,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 09:26:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:10,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 09:26:10,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=667926.6666666666, ans=0.0 2023-09-30 09:26:13,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:26:15,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:15,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:26:18,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 09:26:21,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:24,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:24,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:26:26,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:27,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 09:26:28,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 09:26:28,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:26:29,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 09:26:32,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 09:26:32,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:26:34,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:35,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:35,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:35,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:26:38,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:26:39,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 09:26:40,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:26:40,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:26:42,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 09:26:42,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:26:42,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 09:26:45,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:26:45,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:26:47,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:26:48,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:26:49,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:26:49,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=668060.0, ans=0.2 2023-09-30 09:26:50,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:26:53,377 INFO [train.py:1039] (3/4) Epoch 19, batch 4600, loss[loss=0.184, simple_loss=0.2515, pruned_loss=0.0583, over 23566.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2513, pruned_loss=0.0521, over 4689920.53 frames. ], batch size: 256, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:26:53,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:26:56,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:26:56,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:26:58,521 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:26:58,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:26:59,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:01,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 09:27:03,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:27:08,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:27:09,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:10,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:18,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 09:27:18,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:22,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:25,422 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.849e+02 2.100e+02 2.657e+02 4.568e+02, threshold=4.200e+02, percent-clipped=2.0 2023-09-30 09:27:27,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:27:27,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:27:27,878 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.42 vs. limit=10.0 2023-09-30 09:27:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 09:27:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:27:33,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:27:38,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:38,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:27:40,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:27:45,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 09:27:47,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:27:52,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:53,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:27:55,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:27:55,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 09:27:56,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:27:58,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 09:27:58,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:00,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:00,566 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:28:01,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:01,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:01,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:03,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 09:28:04,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 09:28:04,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 09:28:04,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:06,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:07,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:08,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:28:11,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=668393.3333333334, ans=0.125 2023-09-30 09:28:16,116 INFO [train.py:1039] (3/4) Epoch 19, batch 4650, loss[loss=0.1954, simple_loss=0.2663, pruned_loss=0.06223, over 23181.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2513, pruned_loss=0.05205, over 4690638.67 frames. ], batch size: 105, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:28:18,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=668460.0, ans=0.05 2023-09-30 09:28:19,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:28:22,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:22,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:24,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:28:24,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:28:24,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:28:24,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:28:24,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=668460.0, ans=0.1 2023-09-30 09:28:29,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 09:28:31,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:28:31,532 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:28:34,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 09:28:34,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:28:36,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 09:28:36,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:28:36,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 09:28:37,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 09:28:37,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:39,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:28:41,204 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=668526.6666666666, ans=0.1 2023-09-30 09:28:42,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:28:43,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:43,946 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 09:28:46,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:28:49,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 09:28:52,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:28:52,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:28:53,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 09:28:55,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:28:58,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:29:02,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:07,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:10,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:10,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:29:12,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:29:15,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 09:29:15,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=668660.0, ans=0.125 2023-09-30 09:29:16,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 09:29:16,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 09:29:16,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 09:29:17,081 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=668660.0, ans=0.2 2023-09-30 09:29:18,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:20,267 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:29:25,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:29:25,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:26,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 09:29:26,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:26,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:26,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:29:30,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:29:30,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=668726.6666666666, ans=0.125 2023-09-30 09:29:31,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:29:31,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:29:34,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:29:38,512 INFO [train.py:1039] (3/4) Epoch 19, batch 4700, loss[loss=0.1674, simple_loss=0.2526, pruned_loss=0.04114, over 24660.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.252, pruned_loss=0.05174, over 4701320.94 frames. ], batch size: 65, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:29:38,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:38,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:29:38,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:29:38,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 09:29:40,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:29:40,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=668793.3333333334, ans=0.125 2023-09-30 09:29:41,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 09:29:42,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=668793.3333333334, ans=0.2 2023-09-30 09:29:42,535 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.58 vs. limit=15.0 2023-09-30 09:29:51,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:29:51,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:29:52,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:29:52,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:29:55,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 09:30:00,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 09:30:02,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 09:30:03,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:05,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:30:05,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:30:07,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=668860.0, ans=0.0 2023-09-30 09:30:09,360 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.848e+02 1.982e+02 2.168e+02 3.287e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 09:30:11,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:16,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:30:17,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 09:30:19,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=668926.6666666666, ans=0.0 2023-09-30 09:30:21,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:30:27,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 09:30:28,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:30:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:35,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 09:30:37,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:30:41,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:30:42,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 09:30:44,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:44,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:47,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:30:48,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:30:49,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 09:30:49,141 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 09:30:50,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:30:52,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:52,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 09:30:52,457 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=669060.0, ans=0.125 2023-09-30 09:30:53,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:30:58,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 09:30:59,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=669126.6666666666, ans=0.125 2023-09-30 09:31:00,080 INFO [train.py:1039] (3/4) Epoch 19, batch 4750, loss[loss=0.1816, simple_loss=0.261, pruned_loss=0.05106, over 23969.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2529, pruned_loss=0.05188, over 4709797.68 frames. ], batch size: 80, lr: 5.37e-03, grad_scale: 8.0 2023-09-30 09:31:01,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:31:03,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:06,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:31:09,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 09:31:10,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:15,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 09:31:16,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:31:16,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:31:16,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:24,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 09:31:28,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:31:31,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 09:31:31,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:31:34,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:31:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:36,526 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 09:31:36,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 09:31:42,860 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.01 vs. limit=15.0 2023-09-30 09:31:43,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 09:31:45,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.76 vs. limit=22.5 2023-09-30 09:31:46,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:31:49,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:31:51,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:31:51,535 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 09:31:51,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:31:53,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:31:56,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:31:57,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=669326.6666666666, ans=0.0 2023-09-30 09:31:58,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 09:31:58,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 09:31:58,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:31:58,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:31:58,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:00,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:32:01,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 09:32:04,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 09:32:09,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:11,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:32:11,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 09:32:11,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:13,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:16,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:32:16,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:18,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 09:32:21,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:21,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 09:32:21,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 09:32:23,296 INFO [train.py:1039] (3/4) Epoch 19, batch 4800, loss[loss=0.1804, simple_loss=0.2491, pruned_loss=0.05582, over 23549.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2545, pruned_loss=0.05241, over 4708004.56 frames. ], batch size: 106, lr: 5.37e-03, grad_scale: 16.0 2023-09-30 09:32:23,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 09:32:25,683 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-09-30 09:32:26,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:32:26,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:32:27,170 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.42 vs. limit=22.5 2023-09-30 09:32:28,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 09:32:29,890 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=669460.0, ans=0.0 2023-09-30 09:32:29,904 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:32:34,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:35,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=669460.0, ans=0.125 2023-09-30 09:32:36,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:40,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:32:42,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:42,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:32:42,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 09:32:44,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:32:44,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:32:44,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:32:48,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass_mid.scale_min, batch_count=669526.6666666666, ans=0.2 2023-09-30 09:32:51,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:32:52,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:52,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:32:54,171 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.880e+02 2.098e+02 2.403e+02 3.356e+02, threshold=4.196e+02, percent-clipped=0.0 2023-09-30 09:32:54,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:55,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 09:32:55,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:32:56,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:32:58,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:32:59,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:33:01,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:33:03,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=669593.3333333334, ans=0.0 2023-09-30 09:33:04,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 09:33:06,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:07,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 09:33:07,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 09:33:09,293 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:09,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:33:09,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:33:09,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:09,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:33:12,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:33:13,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:33:17,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:20,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:24,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:27,515 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=669726.6666666666, ans=0.125 2023-09-30 09:33:28,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 09:33:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:30,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:30,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:33:30,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:33:34,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:33:35,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:33:35,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:35,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:33:37,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:33:37,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:33:39,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=669726.6666666666, ans=0.2 2023-09-30 09:33:42,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:33:43,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:43,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:33:45,379 INFO [train.py:1039] (3/4) Epoch 19, batch 4850, loss[loss=0.1682, simple_loss=0.2572, pruned_loss=0.03957, over 24054.00 frames. ], tot_loss[loss=0.1802, simple_loss=0.2552, pruned_loss=0.05264, over 4707059.40 frames. ], batch size: 80, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:33:45,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 09:33:47,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 09:33:47,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:33:47,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:33:47,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:33:50,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:34:00,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 09:34:00,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:01,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=11.92 vs. limit=15.0 2023-09-30 09:34:05,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:07,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:34:07,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:08,009 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=669860.0, ans=0.035 2023-09-30 09:34:11,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:34:13,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:34:14,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:34:14,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 09:34:19,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:34:20,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:34:20,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 09:34:22,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:34:22,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 09:34:25,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:34:25,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:26,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.18 vs. limit=22.5 2023-09-30 09:34:28,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:29,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 09:34:30,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 09:34:31,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:34:39,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:34:39,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 09:34:42,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:34:42,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:34:44,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:34:46,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 09:34:46,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:34:47,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 09:34:47,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:34:49,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:34:49,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 09:34:58,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:05,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:35:05,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:08,140 INFO [train.py:1039] (3/4) Epoch 19, batch 4900, loss[loss=0.1853, simple_loss=0.2681, pruned_loss=0.05123, over 24439.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2534, pruned_loss=0.05175, over 4709236.44 frames. ], batch size: 69, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:35:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 09:35:11,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:35:16,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:16,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=670126.6666666666, ans=0.07 2023-09-30 09:35:18,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:18,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:35:21,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 09:35:25,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 09:35:29,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 09:35:31,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 09:35:31,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:31,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:35:31,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:35:31,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:31,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:35:33,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 09:35:38,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 09:35:38,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:35:39,975 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.500e+02 2.060e+02 2.332e+02 2.793e+02 4.381e+02, threshold=4.664e+02, percent-clipped=2.0 2023-09-30 09:35:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:35:41,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:35:43,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:35:44,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:46,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:35:46,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 09:35:46,981 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=670260.0, ans=0.05 2023-09-30 09:35:49,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:35:49,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:35:49,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 09:35:49,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 09:35:55,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 09:35:56,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:35:58,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:35:58,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:35:59,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:35:59,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 09:35:59,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:36:01,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 09:36:01,696 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=670326.6666666666, ans=0.125 2023-09-30 09:36:03,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:04,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:36:04,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:36:07,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 09:36:09,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:36:11,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 09:36:11,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 09:36:20,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:21,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:36:23,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 09:36:23,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:23,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:36:25,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:28,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=670393.3333333334, ans=0.2 2023-09-30 09:36:29,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:29,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:36:31,127 INFO [train.py:1039] (3/4) Epoch 19, batch 4950, loss[loss=0.1697, simple_loss=0.2569, pruned_loss=0.0413, over 24459.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2523, pruned_loss=0.05087, over 4722871.98 frames. ], batch size: 69, lr: 5.36e-03, grad_scale: 16.0 2023-09-30 09:36:31,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:36:31,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 09:36:32,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:36:34,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:35,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 09:36:37,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 09:36:37,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 09:36:37,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:36:39,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 09:36:39,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:39,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:36:39,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:36:39,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:36:40,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:36:41,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=670460.0, ans=0.0 2023-09-30 09:36:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:36:45,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:36:46,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:36:49,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:49,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:36:54,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 09:36:58,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:36:59,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:37:00,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=670526.6666666666, ans=0.125 2023-09-30 09:37:01,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:02,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:04,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:37:06,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 09:37:07,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 09:37:10,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:11,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:37:12,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:37:12,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.hidden_balancer.prob, batch_count=670593.3333333334, ans=0.125 2023-09-30 09:37:13,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:37:13,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:37:15,050 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 09:37:16,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:18,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:37:20,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:37:22,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.04 vs. limit=15.0 2023-09-30 09:37:22,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:37:22,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:24,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 09:37:24,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:37:26,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:37:30,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:37:32,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:37:32,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:37:34,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:34,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:37:34,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:37:35,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:37:37,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:37:37,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:37:39,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 09:37:39,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=670726.6666666666, ans=0.125 2023-09-30 09:37:43,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:37:49,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 09:37:49,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:37:53,430 INFO [train.py:1039] (3/4) Epoch 19, batch 5000, loss[loss=0.1427, simple_loss=0.2185, pruned_loss=0.0335, over 24406.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2521, pruned_loss=0.05034, over 4732057.45 frames. ], batch size: 58, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:37:57,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:37:57,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:37:59,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 09:38:00,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 09:38:02,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:04,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 09:38:04,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:38:04,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:38:05,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 09:38:07,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:08,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:08,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 09:38:08,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:10,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:10,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 09:38:11,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 09:38:12,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:38:13,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 09:38:13,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:38:14,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:15,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:38:15,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 09:38:15,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 09:38:15,677 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.26 vs. limit=10.0 2023-09-30 09:38:18,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 09:38:18,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:18,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:19,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 09:38:19,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:38:22,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn2.whiten, num_groups=1, num_channels=192, metric=11.05 vs. limit=22.5 2023-09-30 09:38:22,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:22,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:38:24,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:38:26,581 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.806e+02 2.052e+02 2.286e+02 3.432e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 09:38:26,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 09:38:28,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:38:30,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:38:34,761 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 09:38:38,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:38:38,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:38:38,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:38:42,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=670993.3333333334, ans=0.1 2023-09-30 09:38:43,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 09:38:43,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:38:43,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:38:43,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:38:46,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 09:38:46,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:49,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:38:51,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:38:55,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 09:38:59,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:08,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:39:08,337 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=671060.0, ans=0.0 2023-09-30 09:39:10,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:10,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:39:10,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:10,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:39:10,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:39:11,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:14,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:39:16,092 INFO [train.py:1039] (3/4) Epoch 19, batch 5050, loss[loss=0.1675, simple_loss=0.2509, pruned_loss=0.04199, over 24654.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2528, pruned_loss=0.05079, over 4723835.24 frames. ], batch size: 65, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:39:16,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 09:39:16,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:39:17,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:39:19,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:39:19,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 09:39:20,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:20,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:39:22,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:39:24,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:39:24,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=671126.6666666666, ans=0.125 2023-09-30 09:39:25,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:39:38,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 09:39:39,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:39:39,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:39:41,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 09:39:41,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:39:44,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.05 vs. limit=15.0 2023-09-30 09:39:44,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:44,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:39:46,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:39:46,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 09:39:47,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 09:39:47,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:50,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:39:54,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:39:54,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 09:39:55,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:39:57,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=671260.0, ans=0.2 2023-09-30 09:40:00,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 09:40:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:40:01,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:40:03,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:03,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:40:06,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:06,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=671326.6666666666, ans=0.0 2023-09-30 09:40:07,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:40:09,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:09,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:40:09,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:40:09,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 09:40:11,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:40:12,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=671326.6666666666, ans=0.125 2023-09-30 09:40:13,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:40:19,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:40:19,143 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 09:40:19,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:40:20,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:22,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:22,157 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 09:40:23,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:23,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 09:40:23,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:24,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=671393.3333333334, ans=0.0 2023-09-30 09:40:28,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:28,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:40:28,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 09:40:30,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 09:40:31,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:31,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:40:33,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:40:36,317 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 09:40:37,730 INFO [train.py:1039] (3/4) Epoch 19, batch 5100, loss[loss=0.184, simple_loss=0.2519, pruned_loss=0.0581, over 23974.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2537, pruned_loss=0.05125, over 4722807.34 frames. ], batch size: 196, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:40:37,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:40:40,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 09:40:41,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 09:40:42,646 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.43 vs. limit=15.0 2023-09-30 09:40:43,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:40:48,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:40:48,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 09:40:50,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 09:40:50,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=671460.0, ans=0.125 2023-09-30 09:40:54,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:40:54,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:40:57,466 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.prob, batch_count=671526.6666666666, ans=0.125 2023-09-30 09:40:58,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:40:58,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=671526.6666666666, ans=0.125 2023-09-30 09:41:01,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 09:41:03,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:41:04,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 09:41:07,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:07,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 09:41:10,883 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.887e+02 2.088e+02 2.426e+02 5.296e+02, threshold=4.177e+02, percent-clipped=1.0 2023-09-30 09:41:11,052 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 09:41:11,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:12,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 09:41:12,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 09:41:17,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:41:26,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:41:30,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 09:41:30,220 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 09:41:30,233 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 09:41:32,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=671660.0, ans=0.04949747468305833 2023-09-30 09:41:33,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 09:41:33,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:41:34,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 09:41:39,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 09:41:41,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 09:41:42,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 09:41:45,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 09:41:47,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:41:47,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 09:41:49,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=671726.6666666666, ans=0.0 2023-09-30 09:41:51,483 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=671726.6666666666, ans=0.125 2023-09-30 09:41:53,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:41:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:41:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:41:54,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:41:54,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 09:41:55,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:41:57,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 09:41:57,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 09:41:57,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 09:41:59,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:41:59,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 09:42:00,991 INFO [train.py:1039] (3/4) Epoch 19, batch 5150, loss[loss=0.186, simple_loss=0.2626, pruned_loss=0.05468, over 23212.00 frames. ], tot_loss[loss=0.18, simple_loss=0.2553, pruned_loss=0.05234, over 4712504.05 frames. ], batch size: 93, lr: 5.36e-03, grad_scale: 8.0 2023-09-30 09:42:01,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:01,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:42:03,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:04,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:10,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:42:10,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 09:42:10,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:12,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:42:14,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 09:42:14,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:14,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:15,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:42:15,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:42:17,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 09:42:18,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:42:18,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:42:21,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 09:42:22,877 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten.whitening_limit, batch_count=671860.0, ans=15.0 2023-09-30 09:42:25,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 09:42:25,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:42:29,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:42:34,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 09:42:39,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:42:42,601 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=671926.6666666666, ans=0.2 2023-09-30 09:42:43,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:42:45,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:42:51,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:42:53,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:42:54,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 09:42:59,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:42:59,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:42:59,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 09:43:03,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:03,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:43:04,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 09:43:10,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:12,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:43:15,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:43:15,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:43:15,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:43:16,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:43:16,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:43:17,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:43:20,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:43:21,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:43:23,333 INFO [train.py:1039] (3/4) Epoch 19, batch 5200, loss[loss=0.1674, simple_loss=0.2445, pruned_loss=0.04522, over 20265.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2551, pruned_loss=0.05191, over 4715218.60 frames. ], batch size: 44, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:43:23,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:23,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672126.6666666666, ans=0.1 2023-09-30 09:43:28,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=672126.6666666666, ans=0.2 2023-09-30 09:43:29,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 09:43:30,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:43:30,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:33,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:43:36,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:43:37,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:39,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 09:43:42,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:43:44,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:48,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 09:43:48,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=672193.3333333334, ans=0.2 2023-09-30 09:43:49,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:43:51,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:43:51,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 09:43:52,980 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 09:43:54,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 09:43:56,148 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.506e+02 1.839e+02 2.019e+02 2.213e+02 3.440e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 09:43:56,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:43:56,288 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 09:43:56,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:43:57,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:43:58,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:43:59,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 09:44:00,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:02,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:05,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 09:44:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 09:44:07,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 09:44:12,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 09:44:14,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:44:21,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:44:21,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:22,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=15.0 2023-09-30 09:44:23,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 09:44:23,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:44:23,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 09:44:23,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:23,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:44:28,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:28,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:44:33,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:44:33,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:33,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:39,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 09:44:39,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:44:40,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=672393.3333333334, ans=0.0 2023-09-30 09:44:41,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:44:41,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:44:42,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 09:44:43,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:44:45,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:44:46,784 INFO [train.py:1039] (3/4) Epoch 19, batch 5250, loss[loss=0.1627, simple_loss=0.2517, pruned_loss=0.03682, over 24549.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2538, pruned_loss=0.05161, over 4710517.64 frames. ], batch size: 71, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:44:48,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:44:48,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:44:50,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:44:55,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672460.0, ans=0.1 2023-09-30 09:44:56,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:44:56,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:44:58,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:45:00,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=672460.0, ans=0.125 2023-09-30 09:45:01,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:45:02,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 09:45:02,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:45:04,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:45:06,299 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=672526.6666666666, ans=0.2 2023-09-30 09:45:06,869 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=5.50 vs. limit=12.0 2023-09-30 09:46:02,578 INFO [train.py:1039] (3/4) Epoch 19, batch 5300, loss[loss=0.1607, simple_loss=0.2313, pruned_loss=0.0451, over 24446.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2522, pruned_loss=0.05121, over 4714791.52 frames. ], batch size: 58, lr: 5.35e-03, grad_scale: 16.0 2023-09-30 09:46:04,837 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.56 vs. limit=15.0 2023-09-30 09:46:14,564 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=8.30 vs. limit=15.0 2023-09-30 09:46:16,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:46:16,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 09:46:16,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 09:46:16,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:16,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:16,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:16,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:17,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:17,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 09:46:18,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:46:18,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 09:46:18,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 09:46:18,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 09:46:18,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 09:46:18,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 09:46:18,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 09:46:18,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:19,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:19,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:19,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:19,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:46:20,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:20,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:46:20,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:20,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:46:20,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:46:20,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:46:20,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:20,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:46:21,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 09:46:21,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:46:22,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:46:22,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 09:46:22,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 09:46:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:46:22,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:22,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 09:46:22,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 09:46:23,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:23,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:46:23,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:46:24,102 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 09:46:24,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 09:46:24,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:46:24,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:46:24,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 09:46:24,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 09:46:24,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 09:46:24,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:46:34,299 INFO [train.py:1039] (3/4) Epoch 20, batch 0, loss[loss=0.1706, simple_loss=0.2394, pruned_loss=0.05096, over 24265.00 frames. ], tot_loss[loss=0.1706, simple_loss=0.2394, pruned_loss=0.05096, over 24265.00 frames. ], batch size: 56, lr: 5.21e-03, grad_scale: 32.0 2023-09-30 09:46:34,300 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 09:46:47,942 INFO [train.py:1071] (3/4) Epoch 20, validation: loss=0.2867, simple_loss=0.2695, pruned_loss=0.152, over 1125622.00 frames. 2023-09-30 09:46:47,943 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 09:46:49,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 09:46:49,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:46:52,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:46:55,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:46:57,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:46:57,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:46:57,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=672866.6666666666, ans=0.2 2023-09-30 09:46:58,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 09:47:01,570 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.839e+02 2.043e+02 2.275e+02 5.407e+02, threshold=4.087e+02, percent-clipped=3.0 2023-09-30 09:47:01,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 09:47:05,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:05,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:07,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=672933.3333333334, ans=0.1 2023-09-30 09:47:08,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:47:10,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:10,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:47:10,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:13,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 09:47:15,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:47:24,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:47:24,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:25,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 09:47:30,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:47:30,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:47:31,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:35,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:47:40,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:47:43,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=673066.6666666666, ans=0.09899494936611666 2023-09-30 09:47:46,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 09:47:50,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 09:47:50,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:47:50,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:47:51,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:47:53,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:47:56,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 09:47:59,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:01,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:48:04,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:08,966 INFO [train.py:1039] (3/4) Epoch 20, batch 50, loss[loss=0.189, simple_loss=0.2581, pruned_loss=0.05991, over 23593.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2521, pruned_loss=0.04907, over 1080509.25 frames. ], batch size: 256, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:48:09,072 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 09:48:10,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:48:13,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:16,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:48:16,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 09:48:17,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 09:48:17,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:48:19,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:23,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:48:24,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:48:29,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 09:48:29,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:36,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:48:38,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 09:48:39,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 09:48:41,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:48:43,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:48:43,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:44,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:48:45,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:48:46,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 09:48:46,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:48:54,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:48:56,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:48:57,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:48:57,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 09:49:00,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 09:49:00,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:49:00,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 09:49:00,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:00,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=673400.0, ans=0.0 2023-09-30 09:49:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 09:49:10,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:10,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:49:12,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:12,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:12,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:15,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 09:49:15,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 09:49:17,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:17,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:49:18,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:49:20,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:49:20,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 09:49:21,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 09:49:21,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 09:49:23,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:24,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:49:25,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 09:49:25,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 09:49:27,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:28,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:29,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:49:30,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:49:31,410 INFO [train.py:1039] (3/4) Epoch 20, batch 100, loss[loss=0.1891, simple_loss=0.2685, pruned_loss=0.05486, over 23358.00 frames. ], tot_loss[loss=0.1788, simple_loss=0.2554, pruned_loss=0.0511, over 1872410.94 frames. ], batch size: 93, lr: 5.21e-03, grad_scale: 16.0 2023-09-30 09:49:33,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:49:34,783 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.19 vs. limit=12.0 2023-09-30 09:49:35,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:49:39,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.36 vs. limit=10.0 2023-09-30 09:49:40,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:42,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 09:49:42,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:49:46,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 09:49:46,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:48,171 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.472e+02 1.857e+02 2.032e+02 2.240e+02 3.945e+02, threshold=4.064e+02, percent-clipped=0.0 2023-09-30 09:49:48,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:49:48,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:49:48,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:49:49,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 09:49:50,484 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.50 vs. limit=15.0 2023-09-30 09:49:51,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 09:49:51,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:49:53,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:49:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:49:57,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 09:49:59,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:00,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:02,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 09:50:05,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:50:07,521 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=673666.6666666666, ans=0.125 2023-09-30 09:50:08,837 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 09:50:08,875 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 09:50:10,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:10,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:50:14,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 09:50:15,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:50:16,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=673666.6666666666, ans=0.0 2023-09-30 09:50:19,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:24,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:25,750 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 09:50:26,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=673733.3333333334, ans=0.1 2023-09-30 09:50:27,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 09:50:30,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:50:31,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:50:33,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=673733.3333333334, ans=0.125 2023-09-30 09:50:34,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:37,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:38,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=673800.0, ans=0.125 2023-09-30 09:50:41,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:50:41,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=673800.0, ans=0.125 2023-09-30 09:50:43,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:50:46,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:46,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=673800.0, ans=0.2 2023-09-30 09:50:48,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:49,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:49,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:50:49,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:50:51,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 09:50:51,200 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 09:50:51,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:50:52,555 INFO [train.py:1039] (3/4) Epoch 20, batch 150, loss[loss=0.189, simple_loss=0.2626, pruned_loss=0.05769, over 23643.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2546, pruned_loss=0.05067, over 2518846.05 frames. ], batch size: 256, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:50:52,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:50:54,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:54,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:54,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 09:50:54,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:50:54,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 09:50:56,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:50:56,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:50:57,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=673866.6666666666, ans=0.125 2023-09-30 09:50:58,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:50:59,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:50:59,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:51:02,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:04,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:51:04,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:06,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:09,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:09,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:12,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:51:14,386 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:17,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 09:51:17,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 09:51:17,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 09:51:22,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:51:22,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:51:22,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:51:24,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:51:24,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:25,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:25,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:51:29,373 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 09:51:30,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:51:31,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=674000.0, ans=0.0 2023-09-30 09:51:36,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=674000.0, ans=0.125 2023-09-30 09:51:37,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:42,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 09:51:42,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 09:51:46,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:51:46,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:51:47,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:51:50,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:51:52,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:51:52,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:51:53,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:51:55,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 09:51:59,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:01,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:01,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:52:01,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:52:04,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:07,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 09:52:09,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 09:52:10,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:52:10,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:13,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:52:13,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 09:52:13,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:52:15,045 INFO [train.py:1039] (3/4) Epoch 20, batch 200, loss[loss=0.1829, simple_loss=0.2614, pruned_loss=0.0522, over 24500.00 frames. ], tot_loss[loss=0.1791, simple_loss=0.2554, pruned_loss=0.05145, over 3008038.02 frames. ], batch size: 63, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:52:15,138 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 09:52:18,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:21,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:52:21,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:52:24,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 09:52:26,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:52:26,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:28,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 09:52:30,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 09:52:31,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:33,172 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.908e+02 2.091e+02 2.356e+02 3.035e+02, threshold=4.181e+02, percent-clipped=0.0 2023-09-30 09:52:34,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:52:37,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:52:37,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:52:37,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:52:45,939 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.25 vs. limit=10.0 2023-09-30 09:52:46,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=6.82 vs. limit=8.0 2023-09-30 09:52:58,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:52:58,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:53:00,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 09:53:01,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:02,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 09:53:02,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:53:05,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:05,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:53:07,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:07,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:10,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 09:53:10,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 09:53:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:12,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=674400.0, ans=0.125 2023-09-30 09:53:14,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:53:21,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:53:27,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:29,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:53:34,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:36,198 INFO [train.py:1039] (3/4) Epoch 20, batch 250, loss[loss=0.2325, simple_loss=0.2957, pruned_loss=0.08464, over 19938.00 frames. ], tot_loss[loss=0.179, simple_loss=0.2549, pruned_loss=0.05154, over 3381655.84 frames. ], batch size: 388, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:53:37,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 09:53:37,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:37,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:53:37,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:53:39,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 09:53:41,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 09:53:42,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:53:42,865 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 09:53:46,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:48,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 09:53:49,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:49,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:53:51,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:53:51,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:53:54,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:53:57,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:09,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:09,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=674666.6666666666, ans=0.07 2023-09-30 09:54:10,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:54:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:54:16,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 09:54:18,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 09:54:19,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:54:19,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:21,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 09:54:21,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 09:54:21,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:54:24,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:54:28,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 09:54:28,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:54:29,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 09:54:29,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 09:54:29,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:54:29,841 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=674733.3333333334, ans=0.125 2023-09-30 09:54:31,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:54:32,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:54:32,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 09:54:34,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:37,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 09:54:37,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:41,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 09:54:42,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-09-30 09:54:45,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:54:50,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 09:54:54,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:54:56,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:54:59,541 INFO [train.py:1039] (3/4) Epoch 20, batch 300, loss[loss=0.1616, simple_loss=0.2415, pruned_loss=0.04089, over 23511.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2528, pruned_loss=0.05161, over 3669387.62 frames. ], batch size: 93, lr: 5.21e-03, grad_scale: 8.0 2023-09-30 09:54:59,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 09:55:01,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:01,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 09:55:01,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 09:55:03,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 09:55:03,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:55:04,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 09:55:09,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:55:10,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:13,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 09:55:14,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 09:55:16,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:55:16,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 09:55:17,751 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.494e+02 1.911e+02 2.106e+02 2.458e+02 4.276e+02, threshold=4.211e+02, percent-clipped=1.0 2023-09-30 09:55:17,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 09:55:17,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:21,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 09:55:25,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 09:55:25,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 09:55:29,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 09:55:29,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:32,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:55:36,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:36,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 09:55:36,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 09:55:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:55:39,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=675000.0, ans=0.2 2023-09-30 09:55:40,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:55:40,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:55:46,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 09:55:46,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 09:55:49,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 09:55:52,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:55:53,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 09:55:55,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:55:59,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:01,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:56:01,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 09:56:07,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:07,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 09:56:08,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:11,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:56:12,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 09:56:12,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 09:56:12,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:14,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 09:56:16,779 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.76 vs. limit=6.0 2023-09-30 09:56:17,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:56:17,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:19,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:19,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:20,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:22,155 INFO [train.py:1039] (3/4) Epoch 20, batch 350, loss[loss=0.1994, simple_loss=0.2748, pruned_loss=0.06206, over 23298.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2523, pruned_loss=0.05133, over 3906827.70 frames. ], batch size: 105, lr: 5.20e-03, grad_scale: 4.0 2023-09-30 09:56:24,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:24,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 09:56:27,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:34,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:56:37,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:38,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:42,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 09:56:42,327 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=675266.6666666666, ans=0.125 2023-09-30 09:56:43,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:56:44,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 09:56:47,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:56:48,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 09:56:48,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:51,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 09:56:53,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:56:55,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 09:56:56,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:56:58,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:56:58,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:56:58,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:56:58,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 09:57:01,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:01,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:09,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:09,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 09:57:09,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:57:11,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:15,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 09:57:15,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:57:17,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=675400.0, ans=0.0 2023-09-30 09:57:22,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:57:22,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:22,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:57:23,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 09:57:24,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=675400.0, ans=0.125 2023-09-30 09:57:25,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:27,023 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 09:57:28,514 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 09:57:28,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:30,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=675466.6666666666, ans=0.1 2023-09-30 09:57:31,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 09:57:31,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 09:57:35,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:36,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 09:57:38,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:40,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:40,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:42,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 09:57:43,417 INFO [train.py:1039] (3/4) Epoch 20, batch 400, loss[loss=0.1983, simple_loss=0.2466, pruned_loss=0.07505, over 19290.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2513, pruned_loss=0.05083, over 4081961.80 frames. ], batch size: 388, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:57:43,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:57:45,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 09:57:47,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 09:57:48,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:57:48,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:50,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:57:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:53,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:57:55,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:57:57,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 09:57:57,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 09:57:57,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:57:57,532 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 09:58:00,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 09:58:00,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:03,702 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.898e+02 2.066e+02 2.335e+02 3.981e+02, threshold=4.133e+02, percent-clipped=0.0 2023-09-30 09:58:04,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 09:58:04,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:04,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 09:58:05,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:58:05,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 09:58:05,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:06,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:58:09,339 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 09:58:10,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 09:58:15,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:58:16,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.83 vs. limit=6.0 2023-09-30 09:58:18,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:18,765 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=675666.6666666666, ans=0.0 2023-09-30 09:58:19,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 09:58:20,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 09:58:23,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 09:58:24,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:27,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=675666.6666666666, ans=0.2 2023-09-30 09:58:33,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 09:58:34,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 09:58:36,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 09:58:38,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 09:58:41,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 09:58:41,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 09:58:43,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=675733.3333333334, ans=0.125 2023-09-30 09:58:45,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 09:58:47,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 09:58:48,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:58:51,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:58:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 09:58:53,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 09:58:54,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 09:58:58,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 09:58:58,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 09:59:01,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 09:59:03,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 09:59:03,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 09:59:04,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 09:59:04,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 09:59:04,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 09:59:06,081 INFO [train.py:1039] (3/4) Epoch 20, batch 450, loss[loss=0.1907, simple_loss=0.2604, pruned_loss=0.06051, over 23744.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2522, pruned_loss=0.05117, over 4223227.67 frames. ], batch size: 195, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 09:59:06,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 09:59:07,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 09:59:07,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 09:59:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 09:59:11,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 09:59:13,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 09:59:25,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:25,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 09:59:26,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 09:59:26,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=675933.3333333334, ans=0.0 2023-09-30 09:59:28,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 09:59:32,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 09:59:33,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:34,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:40,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:41,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 09:59:44,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 09:59:44,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 09:59:46,727 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=676000.0, ans=0.0 2023-09-30 09:59:48,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 09:59:49,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 09:59:49,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 09:59:51,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 09:59:53,449 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 09:59:53,463 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 09:59:53,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 09:59:55,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=676066.6666666666, ans=0.125 2023-09-30 09:59:57,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 09:59:58,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:00:01,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:00:01,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:00:03,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:00:05,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 10:00:06,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:09,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:00:09,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:00:12,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 10:00:14,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=676133.3333333334, ans=0.125 2023-09-30 10:00:16,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:00:17,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 10:00:19,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 10:00:19,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:00:26,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:00:27,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:29,610 INFO [train.py:1039] (3/4) Epoch 20, batch 500, loss[loss=0.1805, simple_loss=0.2516, pruned_loss=0.05476, over 23727.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2528, pruned_loss=0.05138, over 4341810.62 frames. ], batch size: 232, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:00:29,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:00:29,747 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 10:00:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:34,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.76 vs. limit=22.5 2023-09-30 10:00:34,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:00:35,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:35,086 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 10:00:38,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 10:00:38,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:00:41,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:00:47,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:00:48,764 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.475e+02 1.813e+02 1.975e+02 2.252e+02 5.149e+02, threshold=3.950e+02, percent-clipped=1.0 2023-09-30 10:00:48,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:00:51,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:00:51,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:00:52,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:05,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:01:05,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:01:05,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:05,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 10:01:05,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:01:10,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:01:10,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:01:10,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:01:10,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:01:12,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 10:01:15,470 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 10:01:17,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:18,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:20,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:21,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:01:23,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 10:01:26,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:01:28,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:33,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:35,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:01:37,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=676466.6666666666, ans=0.1 2023-09-30 10:01:42,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:47,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 10:01:47,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:47,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:01:49,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 10:01:50,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:01:51,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=676533.3333333334, ans=0.0 2023-09-30 10:01:52,022 INFO [train.py:1039] (3/4) Epoch 20, batch 550, loss[loss=0.1685, simple_loss=0.2456, pruned_loss=0.04566, over 24433.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2532, pruned_loss=0.05116, over 4428441.34 frames. ], batch size: 63, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:01:52,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:01:58,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 10:01:59,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 10:01:59,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:01:59,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 10:02:01,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:02:01,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:02,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:04,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:02:06,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:02:08,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:02:09,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 10:02:10,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:02:13,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:13,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:17,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:17,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:23,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 10:02:23,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 10:02:24,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:02:26,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer_na.min_abs, batch_count=676666.6666666666, ans=0.02 2023-09-30 10:02:30,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:02:30,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:32,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:02:36,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:36,852 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 10:02:36,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:02:39,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:02:42,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:02:42,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:02:42,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:02:44,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:45,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 10:02:48,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 10:02:49,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:02:49,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:02:49,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:02:49,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:02:52,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:02:54,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:02:57,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:02:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:02:59,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 10:03:00,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:03:00,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:02,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:03:02,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:03:03,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:03:10,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 10:03:12,620 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:03:13,671 INFO [train.py:1039] (3/4) Epoch 20, batch 600, loss[loss=0.1791, simple_loss=0.2661, pruned_loss=0.04601, over 24683.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2539, pruned_loss=0.05143, over 4496248.50 frames. ], batch size: 73, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:03:14,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 10:03:14,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:03:14,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=676866.6666666666, ans=0.125 2023-09-30 10:03:15,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:03:15,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:03:25,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:03:25,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:03:28,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 10:03:31,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:03:33,013 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.840e+02 2.105e+02 2.465e+02 3.570e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 10:03:33,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:03:36,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:37,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 10:03:37,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:03:41,259 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=676933.3333333334, ans=0.125 2023-09-30 10:03:42,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 10:03:47,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:03:47,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:03:47,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=677000.0, ans=0.125 2023-09-30 10:03:48,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:03:54,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:03:54,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:03:54,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:02,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:04:06,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:06,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:04:06,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:04:10,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=677066.6666666666, ans=0.125 2023-09-30 10:04:12,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 10:04:18,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:04:18,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:04:22,477 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.87 vs. limit=6.0 2023-09-30 10:04:23,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 10:04:25,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:04:26,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 10:04:27,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:04:28,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:04:30,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=677133.3333333334, ans=0.0 2023-09-30 10:04:34,464 INFO [train.py:1039] (3/4) Epoch 20, batch 650, loss[loss=0.177, simple_loss=0.2626, pruned_loss=0.04574, over 24422.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.253, pruned_loss=0.05128, over 4534818.69 frames. ], batch size: 69, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:04:36,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:04:37,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:04:39,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:04:42,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:04:43,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:04:44,312 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=677200.0, ans=0.125 2023-09-30 10:04:46,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 10:04:47,186 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=677200.0, ans=0.125 2023-09-30 10:04:48,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:04:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:04:53,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:04:57,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:01,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 10:05:04,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:04,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:06,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.84 vs. limit=6.0 2023-09-30 10:05:09,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:09,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:05:11,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:11,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:12,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:05:14,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:15,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:05:17,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:05:17,352 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 10:05:17,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:17,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:20,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:20,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:22,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:22,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:05:23,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 10:05:23,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:05:23,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:05:28,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:05:28,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:05:29,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:05:31,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 10:05:33,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 10:05:33,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:33,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:05:33,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:05:34,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:05:35,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:05:41,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:05:41,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:05:43,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:05:46,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:46,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:05:46,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:05:46,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=677466.6666666666, ans=0.125 2023-09-30 10:05:52,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=677466.6666666666, ans=0.125 2023-09-30 10:05:53,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:05:53,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:53,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:05:54,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=677466.6666666666, ans=0.125 2023-09-30 10:05:55,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:05:56,757 INFO [train.py:1039] (3/4) Epoch 20, batch 700, loss[loss=0.1793, simple_loss=0.2414, pruned_loss=0.05864, over 22773.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2524, pruned_loss=0.05081, over 4576183.97 frames. ], batch size: 322, lr: 5.20e-03, grad_scale: 8.0 2023-09-30 10:06:00,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 10:06:02,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 10:06:04,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 10:06:04,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:06:08,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 10:06:12,066 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.25 vs. limit=10.0 2023-09-30 10:06:14,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:15,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:06:17,043 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.862e+02 2.095e+02 2.460e+02 3.900e+02, threshold=4.189e+02, percent-clipped=0.0 2023-09-30 10:06:18,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:18,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:06:20,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:06:23,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:06:24,271 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=6.11 vs. limit=12.0 2023-09-30 10:06:24,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:06:25,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:06:26,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 10:06:29,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 10:06:35,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:06:36,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:06:37,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:06:41,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:06:41,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 10:06:45,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:47,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:06:47,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 10:06:50,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:06:52,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:06:52,801 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.62 vs. limit=15.0 2023-09-30 10:06:55,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:00,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:07:00,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 10:07:06,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 10:07:07,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 10:07:10,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:10,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:12,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:13,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:13,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 10:07:18,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 10:07:18,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 10:07:18,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 10:07:19,754 INFO [train.py:1039] (3/4) Epoch 20, batch 750, loss[loss=0.1686, simple_loss=0.2389, pruned_loss=0.04921, over 24345.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2521, pruned_loss=0.05074, over 4610155.51 frames. ], batch size: 56, lr: 5.19e-03, grad_scale: 8.0 2023-09-30 10:07:21,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 10:07:21,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 10:07:21,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:07:23,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 10:07:24,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:07:24,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:27,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:30,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:30,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:07:32,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:07:33,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:07:35,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:07:36,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:07:40,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:07:40,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:07:40,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 10:07:43,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:07:43,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:45,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:07:46,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:07:47,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 10:07:47,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:07:49,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 10:07:49,957 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 10:07:50,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 10:07:50,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:07:50,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:07:53,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:07:59,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:07:59,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:07:59,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:07:59,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=678000.0, ans=0.125 2023-09-30 10:08:02,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:08:04,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:04,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 10:08:04,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=678000.0, ans=0.0 2023-09-30 10:08:05,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:08:05,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 10:08:07,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:08:08,062 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=12.66 vs. limit=15.0 2023-09-30 10:08:11,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:08:12,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 10:08:12,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:19,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:22,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:08:22,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:24,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:08:28,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 10:08:28,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:32,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:08:33,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:36,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:37,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:08:40,984 INFO [train.py:1039] (3/4) Epoch 20, batch 800, loss[loss=0.1785, simple_loss=0.2482, pruned_loss=0.05441, over 23903.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05044, over 4638391.88 frames. ], batch size: 179, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:08:44,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:08:44,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:46,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=678200.0, ans=0.07 2023-09-30 10:08:47,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:08:47,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:08:48,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:48,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:08:50,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:08:54,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:08:55,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:08:58,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 10:09:00,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:01,441 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.889e+02 2.125e+02 2.539e+02 3.349e+02, threshold=4.249e+02, percent-clipped=0.0 2023-09-30 10:09:01,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:09:01,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:03,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:03,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 10:09:03,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:03,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 10:09:07,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:10,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:09:12,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:09:12,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:09:14,246 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=678333.3333333334, ans=0.2 2023-09-30 10:09:14,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=678333.3333333334, ans=0.0 2023-09-30 10:09:16,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:16,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:23,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:09:23,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:09:23,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=678333.3333333334, ans=0.125 2023-09-30 10:09:24,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 10:09:27,137 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 10:09:27,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 10:09:27,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:09:27,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:09:29,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:09:31,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:09:36,376 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 10:09:37,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 10:09:39,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:09:39,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:09:42,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:09:45,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:09:46,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 10:09:47,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:09:51,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 10:09:58,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:09:58,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer_na.min_abs, batch_count=678466.6666666666, ans=0.02 2023-09-30 10:10:00,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:10:02,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 10:10:04,509 INFO [train.py:1039] (3/4) Epoch 20, batch 850, loss[loss=0.1777, simple_loss=0.2565, pruned_loss=0.04944, over 20097.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2529, pruned_loss=0.05018, over 4659859.35 frames. ], batch size: 44, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:10:04,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:10:04,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:07,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 10:10:07,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:07,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:10:08,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:10,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:10:11,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:10:13,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 10:10:13,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 10:10:13,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 10:10:14,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:10:14,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:10:15,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=678533.3333333334, ans=0.125 2023-09-30 10:10:17,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:17,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:10:17,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:10:19,824 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=678600.0, ans=0.0 2023-09-30 10:10:23,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:24,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:25,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 10:10:27,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 10:10:27,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=678600.0, ans=0.1 2023-09-30 10:10:30,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:10:32,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 10:10:38,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 10:10:39,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 10:10:39,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=678666.6666666666, ans=0.2 2023-09-30 10:10:43,208 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 10:10:43,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:43,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:10:43,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:10:44,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=678666.6666666666, ans=0.125 2023-09-30 10:10:46,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:10:47,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 10:10:49,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:10:51,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:10:51,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:10:51,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:10:52,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:10:55,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:10:55,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 10:11:00,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:11:00,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:01,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer_na.min_abs, batch_count=678733.3333333334, ans=0.02 2023-09-30 10:11:02,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:11:02,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:03,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:08,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:11:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:11:12,028 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=17.96 vs. limit=22.5 2023-09-30 10:11:13,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:11:13,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:14,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:11:23,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:11:24,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:11:24,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 10:11:25,464 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.21 vs. limit=15.0 2023-09-30 10:11:26,245 INFO [train.py:1039] (3/4) Epoch 20, batch 900, loss[loss=0.1756, simple_loss=0.2506, pruned_loss=0.05027, over 23412.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2538, pruned_loss=0.05067, over 4672030.78 frames. ], batch size: 93, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:11:26,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:26,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:11:27,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 10:11:36,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:11:37,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:39,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 10:11:40,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:11:40,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 10:11:44,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 10:11:45,392 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.625e+02 1.863e+02 2.352e+02 2.808e+02 3.950e+02, threshold=4.705e+02, percent-clipped=0.0 2023-09-30 10:11:45,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:11:45,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:11:45,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:11:45,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:11:54,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:11:54,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:11:55,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:11:59,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:05,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 10:12:05,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:12:13,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:12:15,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:12:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 10:12:16,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 10:12:22,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:12:22,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:12:22,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:12:28,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:29,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=679066.6666666666, ans=0.0 2023-09-30 10:12:30,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:12:31,198 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.78 vs. limit=15.0 2023-09-30 10:12:31,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 10:12:31,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:12:34,762 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 10:12:35,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=679133.3333333334, ans=0.125 2023-09-30 10:12:36,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:12:37,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:39,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:12:39,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:12:43,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 10:12:43,121 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 10:12:43,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:12:43,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 10:12:47,498 INFO [train.py:1039] (3/4) Epoch 20, batch 950, loss[loss=0.1625, simple_loss=0.2408, pruned_loss=0.04211, over 24450.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2543, pruned_loss=0.05111, over 4688747.03 frames. ], batch size: 58, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:12:47,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:12:50,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 10:12:56,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:00,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:00,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:13:03,205 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 10:13:03,614 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=679266.6666666666, ans=0.125 2023-09-30 10:13:06,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:07,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:08,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:08,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:13:09,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 10:13:11,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:13:13,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:13,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 10:13:14,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:18,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:13:18,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:13:19,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 10:13:22,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:13:24,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:13:26,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:13:32,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:13:32,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:13:36,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 10:13:37,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:13:37,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:13:39,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:40,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:40,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:13:44,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 10:13:46,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:13:47,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:13:47,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:13:47,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 10:13:49,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:49,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:13:49,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 10:13:53,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:13:57,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:13:59,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=679466.6666666666, ans=0.0 2023-09-30 10:14:00,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:02,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 10:14:02,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 10:14:07,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:14:08,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=679466.6666666666, ans=0.0 2023-09-30 10:14:10,686 INFO [train.py:1039] (3/4) Epoch 20, batch 1000, loss[loss=0.1644, simple_loss=0.2483, pruned_loss=0.04022, over 24344.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2528, pruned_loss=0.05086, over 4683481.49 frames. ], batch size: 74, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:14:13,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 10:14:15,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:17,620 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.22 vs. limit=15.0 2023-09-30 10:14:20,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:14:20,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 10:14:20,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 10:14:20,945 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=679533.3333333334, ans=0.04949747468305833 2023-09-30 10:14:25,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:25,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:14:28,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:29,653 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.877e+02 1.981e+02 2.279e+02 3.470e+02, threshold=3.963e+02, percent-clipped=0.0 2023-09-30 10:14:29,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 10:14:30,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=679600.0, ans=0.0 2023-09-30 10:14:33,069 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=679600.0, ans=0.0 2023-09-30 10:14:36,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 10:14:37,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 10:14:37,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:38,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 10:14:40,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 10:14:41,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 10:14:43,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:45,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:55,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:55,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:14:56,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:14:56,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:14:56,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 10:14:57,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:14:57,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:14:58,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:14:58,587 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 10:15:01,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 10:15:01,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=679733.3333333334, ans=0.125 2023-09-30 10:15:04,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 10:15:04,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 10:15:06,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:15:06,653 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=679733.3333333334, ans=0.0 2023-09-30 10:15:15,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:15,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:15:15,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:17,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:15:19,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 10:15:21,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:15:21,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 10:15:22,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 10:15:24,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:24,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:15:26,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:15:30,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:15:31,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:15:34,043 INFO [train.py:1039] (3/4) Epoch 20, batch 1050, loss[loss=0.1838, simple_loss=0.2638, pruned_loss=0.05192, over 24675.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.252, pruned_loss=0.05073, over 4679577.61 frames. ], batch size: 65, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:15:35,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:15:37,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:15:40,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:15:41,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:15:43,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:15:46,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:15:48,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:15:50,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:15:52,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:15:52,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:15:53,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:15:55,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 10:15:55,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:15:55,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 10:15:59,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:15:59,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 10:15:59,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:16:04,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:16:04,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:16:04,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:16:07,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 10:16:07,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 10:16:09,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:16:11,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 10:16:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 10:16:14,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:16,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=680000.0, ans=0.0 2023-09-30 10:16:19,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:16:19,543 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=680000.0, ans=0.125 2023-09-30 10:16:22,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:16:23,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:16:24,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:16:29,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:16:32,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 10:16:34,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 10:16:34,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 10:16:34,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:34,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:16:38,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 10:16:41,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:16:42,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:16:42,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:16:43,242 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=680133.3333333334, ans=0.2 2023-09-30 10:16:44,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:44,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:16:48,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 10:16:49,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:16:49,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 10:16:49,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 10:16:51,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:16:56,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:16:58,158 INFO [train.py:1039] (3/4) Epoch 20, batch 1100, loss[loss=0.1819, simple_loss=0.2629, pruned_loss=0.05044, over 24645.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2516, pruned_loss=0.0505, over 4684470.57 frames. ], batch size: 68, lr: 5.19e-03, grad_scale: 16.0 2023-09-30 10:16:58,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=680200.0, ans=0.125 2023-09-30 10:17:01,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:17:06,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:17:08,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:17:10,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:10,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 10:17:11,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.29 vs. limit=15.0 2023-09-30 10:17:12,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:17:15,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:17:16,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:17:17,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.546e+02 1.800e+02 1.958e+02 2.202e+02 3.142e+02, threshold=3.917e+02, percent-clipped=0.0 2023-09-30 10:17:19,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:17:19,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 10:17:21,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:17:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:17:23,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:17:26,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:17:27,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:17:33,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:17:36,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 10:17:36,508 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 10:17:37,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:40,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:41,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:17:41,599 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:17:43,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 10:17:43,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:17:43,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:17:45,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:17:45,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:17:45,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 10:17:50,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:17:50,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 10:17:53,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:17:59,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:18:02,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 10:18:04,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 10:18:04,632 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=680466.6666666666, ans=0.1 2023-09-30 10:18:05,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:08,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:09,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:11,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 10:18:12,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:18:12,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:18:14,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 10:18:14,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:18:16,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 10:18:16,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:18:16,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:18:18,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:18:18,891 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.62 vs. limit=12.0 2023-09-30 10:18:21,502 INFO [train.py:1039] (3/4) Epoch 20, batch 1150, loss[loss=0.2204, simple_loss=0.278, pruned_loss=0.08137, over 19547.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2524, pruned_loss=0.05034, over 4689179.25 frames. ], batch size: 388, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:18:24,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:28,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:18:30,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:18:30,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:18:30,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 10:18:30,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:34,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 10:18:36,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:36,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:18:40,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=6.82 vs. limit=15.0 2023-09-30 10:18:40,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 10:18:43,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:47,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:18:48,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:18:48,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 10:18:49,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:18:50,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:18:53,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 10:18:55,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:18:57,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:19:06,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:07,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=680666.6666666666, ans=0.125 2023-09-30 10:19:15,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:19:17,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 10:19:17,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:17,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:26,274 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 10:19:27,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:19:32,833 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 10:19:38,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:38,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:19:39,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:19:39,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:19:43,296 INFO [train.py:1039] (3/4) Epoch 20, batch 1200, loss[loss=0.1562, simple_loss=0.2345, pruned_loss=0.039, over 24496.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.253, pruned_loss=0.05061, over 4699459.07 frames. ], batch size: 63, lr: 5.18e-03, grad_scale: 32.0 2023-09-30 10:19:43,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=680866.6666666666, ans=0.125 2023-09-30 10:19:44,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:49,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:19:49,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:19:51,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:19:51,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:19:52,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:19:55,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:19:57,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:19:59,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:19:59,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:02,633 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.580e+02 1.868e+02 2.067e+02 2.397e+02 3.713e+02, threshold=4.134e+02, percent-clipped=0.0 2023-09-30 10:20:02,862 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 10:20:04,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 10:20:06,490 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=680933.3333333334, ans=0.0 2023-09-30 10:20:08,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:20:11,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:20:14,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:15,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:20:15,822 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 10:20:16,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=681000.0, ans=10.0 2023-09-30 10:20:17,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:25,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:20:25,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:20:25,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 10:20:27,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:20:30,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 10:20:34,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 10:20:34,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:20:36,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:20:36,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=681066.6666666666, ans=0.125 2023-09-30 10:20:38,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:38,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:20:39,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=681066.6666666666, ans=0.0 2023-09-30 10:20:41,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:20:41,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:20:43,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:20:43,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 10:20:44,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:20:44,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:20:44,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:20:48,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:20:48,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:20:52,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:20:52,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=681133.3333333334, ans=0.1 2023-09-30 10:20:53,438 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.96 vs. limit=15.0 2023-09-30 10:20:55,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:20:57,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 10:21:01,166 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 10:21:03,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:05,358 INFO [train.py:1039] (3/4) Epoch 20, batch 1250, loss[loss=0.169, simple_loss=0.2398, pruned_loss=0.04909, over 21021.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2547, pruned_loss=0.05139, over 4705332.96 frames. ], batch size: 46, lr: 5.18e-03, grad_scale: 16.0 2023-09-30 10:21:06,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:21:09,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:21:10,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:21:14,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 10:21:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:21:19,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:20,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 10:21:22,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=681266.6666666666, ans=0.1 2023-09-30 10:21:23,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:21:25,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:21:29,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:21:30,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:31,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:21:31,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:33,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:21:35,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=681266.6666666666, ans=0.1 2023-09-30 10:21:38,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:21:38,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:21:38,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:21:40,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:21:41,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:43,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:45,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:21:51,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 10:21:51,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:21:53,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:21:54,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 10:21:54,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:21:54,695 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 10:21:56,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:56,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:21:58,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:21:58,192 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=681400.0, ans=0.2 2023-09-30 10:22:02,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:22:02,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:22:04,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 10:22:05,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 10:22:05,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 10:22:08,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:09,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 10:22:11,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:15,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:22:15,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:22:16,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 10:22:16,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:22:17,168 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=681466.6666666666, ans=0.1 2023-09-30 10:22:18,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:22:18,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:22:18,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:22:19,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 10:22:23,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:25,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:22:27,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:22:28,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:22:29,439 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.61 vs. limit=10.0 2023-09-30 10:22:30,057 INFO [train.py:1039] (3/4) Epoch 20, batch 1300, loss[loss=0.1679, simple_loss=0.2538, pruned_loss=0.04102, over 24586.00 frames. ], tot_loss[loss=0.1793, simple_loss=0.2545, pruned_loss=0.05202, over 4698651.58 frames. ], batch size: 71, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:22:31,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:22:32,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=681533.3333333334, ans=0.0 2023-09-30 10:22:33,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 10:22:33,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=681533.3333333334, ans=0.125 2023-09-30 10:22:37,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:22:38,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=681533.3333333334, ans=0.1 2023-09-30 10:22:41,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:22:41,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:22:44,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:22:44,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:22:46,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 10:22:47,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=681600.0, ans=0.125 2023-09-30 10:22:52,547 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.925e+02 2.165e+02 2.491e+02 3.486e+02, threshold=4.330e+02, percent-clipped=0.0 2023-09-30 10:22:52,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:22:54,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:22:56,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 10:22:57,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:23:03,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:04,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:06,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:23:06,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:07,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:23:07,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 10:23:07,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 10:23:14,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:23:16,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:23:17,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 10:23:17,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 10:23:20,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:23:23,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:23:23,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 10:23:25,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:25,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 10:23:26,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=681733.3333333334, ans=0.5 2023-09-30 10:23:27,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:23:31,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:23:31,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:23:34,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 10:23:34,414 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 10:23:36,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 10:23:38,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=681800.0, ans=0.1 2023-09-30 10:23:38,322 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=681800.0, ans=0.1 2023-09-30 10:23:38,613 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.63 vs. limit=6.0 2023-09-30 10:23:40,944 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:23:43,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 10:23:45,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:23:52,108 INFO [train.py:1039] (3/4) Epoch 20, batch 1350, loss[loss=0.1981, simple_loss=0.2621, pruned_loss=0.06706, over 23729.00 frames. ], tot_loss[loss=0.1789, simple_loss=0.2541, pruned_loss=0.05183, over 4708957.25 frames. ], batch size: 164, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:23:52,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 10:23:55,997 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.07 vs. limit=15.0 2023-09-30 10:23:56,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:00,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:24:04,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:07,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:24:07,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:07,547 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:24:10,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:24:12,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 10:24:15,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:15,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:24:18,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 10:24:18,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:24:19,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:24:19,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 10:24:21,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 10:24:23,661 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:24:24,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 10:24:27,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:27,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 10:24:40,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:24:50,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:50,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 10:24:53,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:24:54,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 10:24:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:24:54,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:24:58,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:25:01,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 10:25:01,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:25:07,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=682133.3333333334, ans=0.125 2023-09-30 10:25:08,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 10:25:09,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 10:25:14,391 INFO [train.py:1039] (3/4) Epoch 20, batch 1400, loss[loss=0.1754, simple_loss=0.2517, pruned_loss=0.04955, over 23318.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2524, pruned_loss=0.05117, over 4687394.20 frames. ], batch size: 93, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:25:16,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 10:25:18,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:25:20,127 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=682200.0, ans=0.1 2023-09-30 10:25:21,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:25:23,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:25:29,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 10:25:32,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 10:25:37,267 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.836e+02 1.987e+02 2.257e+02 3.606e+02, threshold=3.975e+02, percent-clipped=0.0 2023-09-30 10:25:42,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:25:44,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:25:47,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:25:47,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:25:52,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:25:52,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 10:25:59,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=682333.3333333334, ans=0.125 2023-09-30 10:26:03,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:03,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:05,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=682400.0, ans=0.2 2023-09-30 10:26:09,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 10:26:09,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:26:09,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:26:10,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:26:10,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:12,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:26:12,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:26:13,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:26:14,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 10:26:14,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:26:14,776 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=16.90 vs. limit=22.5 2023-09-30 10:26:22,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:25,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:26:29,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=682466.6666666666, ans=0.125 2023-09-30 10:26:31,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 10:26:32,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 10:26:34,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:26:35,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 10:26:35,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:36,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=682533.3333333334, ans=0.0 2023-09-30 10:26:37,955 INFO [train.py:1039] (3/4) Epoch 20, batch 1450, loss[loss=0.1754, simple_loss=0.2443, pruned_loss=0.05328, over 23750.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2525, pruned_loss=0.05041, over 4701935.54 frames. ], batch size: 164, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:26:38,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:26:38,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=682533.3333333334, ans=0.1 2023-09-30 10:26:41,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:26:44,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:26:44,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:44,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 10:26:44,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=682533.3333333334, ans=0.0 2023-09-30 10:26:46,128 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=682533.3333333334, ans=0.125 2023-09-30 10:26:50,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:26:51,055 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:26:52,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=682533.3333333334, ans=0.0 2023-09-30 10:26:53,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:26:53,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 10:26:54,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:26:56,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 10:26:56,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:26:56,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:26:56,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 10:26:59,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:26:59,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:27:00,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 10:27:00,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:03,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:27:04,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:04,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=682600.0, ans=0.125 2023-09-30 10:27:06,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer1.prob, batch_count=682600.0, ans=0.125 2023-09-30 10:27:07,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:11,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:27:11,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:27:16,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:27:16,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:16,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:27:17,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:27:17,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:27:17,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:21,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=682666.6666666666, ans=0.1 2023-09-30 10:27:22,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 10:27:26,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:27:27,412 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.19 vs. limit=22.5 2023-09-30 10:27:31,147 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 10:27:32,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:32,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:27:34,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:36,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 10:27:37,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=682733.3333333334, ans=0.035 2023-09-30 10:27:39,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:41,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 10:27:42,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 10:27:43,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:27:44,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=682800.0, ans=0.125 2023-09-30 10:27:46,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:27:47,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:27:51,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 10:27:51,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 10:27:52,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 10:27:54,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:27:54,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:28:01,525 INFO [train.py:1039] (3/4) Epoch 20, batch 1500, loss[loss=0.1609, simple_loss=0.2418, pruned_loss=0.04001, over 24615.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2523, pruned_loss=0.05099, over 4684868.42 frames. ], batch size: 60, lr: 5.18e-03, grad_scale: 8.0 2023-09-30 10:28:05,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=682866.6666666666, ans=0.125 2023-09-30 10:28:06,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 10:28:07,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:28:07,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:28:09,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:09,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:11,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:28:12,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 10:28:14,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:28:14,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:28:14,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:28:16,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:28:16,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:28:17,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,677 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.891e+02 2.112e+02 2.423e+02 4.358e+02, threshold=4.223e+02, percent-clipped=4.0 2023-09-30 10:28:24,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:28:24,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 10:28:24,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:28:25,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:28:26,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:29,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 10:28:33,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 10:28:35,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:28:36,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 10:28:36,952 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=683000.0, ans=0.2 2023-09-30 10:28:39,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:28:41,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:28:41,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=683000.0, ans=0.0 2023-09-30 10:28:42,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:28:42,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:28:44,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 10:28:45,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:28:45,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:47,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 10:28:47,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:28:54,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:28:54,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 10:28:54,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer_ff2.min_abs, batch_count=683066.6666666666, ans=0.1 2023-09-30 10:28:59,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:29:02,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:29:03,909 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683066.6666666666, ans=0.1 2023-09-30 10:29:06,576 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 10:29:06,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:06,671 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 10:29:06,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=683133.3333333334, ans=0.125 2023-09-30 10:29:09,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:11,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:12,113 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 10:29:12,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:29:13,889 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=683133.3333333334, ans=0.125 2023-09-30 10:29:16,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 10:29:18,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:21,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:21,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:29:22,796 INFO [train.py:1039] (3/4) Epoch 20, batch 1550, loss[loss=0.165, simple_loss=0.2429, pruned_loss=0.04358, over 16790.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2528, pruned_loss=0.05088, over 4693715.38 frames. ], batch size: 36, lr: 5.17e-03, grad_scale: 8.0 2023-09-30 10:29:22,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:29:23,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:29:26,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 10:29:26,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 10:29:26,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:29:28,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 10:29:28,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 10:29:31,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:33,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:33,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:29:33,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:29:35,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:35,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:29:37,019 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 10:29:38,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:38,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:29:39,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:29:42,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:29:43,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 10:29:43,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:29:43,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 10:29:45,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 10:29:45,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 10:29:47,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:29:49,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:29:52,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:29:55,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 10:29:55,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 10:29:58,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=683333.3333333334, ans=0.125 2023-09-30 10:30:05,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:08,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:30:08,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:30:08,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:30:10,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 10:30:15,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:30:18,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:20,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:30:23,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:30:24,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:30:24,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 10:30:25,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:26,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:30:28,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:28,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:30:29,829 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 10:30:31,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:38,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 10:30:44,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:46,008 INFO [train.py:1039] (3/4) Epoch 20, batch 1600, loss[loss=0.1796, simple_loss=0.2433, pruned_loss=0.0579, over 23766.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2539, pruned_loss=0.05143, over 4708808.79 frames. ], batch size: 212, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:30:46,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:30:46,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 10:30:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:30:49,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:30:49,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:30:49,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:30:50,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:30:54,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:30:54,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 10:30:54,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 10:30:56,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 10:30:58,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:01,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 10:31:03,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:31:04,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:31:09,445 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.487e+02 1.873e+02 2.054e+02 2.261e+02 3.957e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 10:31:11,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:31:13,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=683600.0, ans=0.125 2023-09-30 10:31:14,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 10:31:16,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:31:17,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 10:31:19,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:19,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 10:31:25,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 10:31:34,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:35,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 10:31:37,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:31:37,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:31:37,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:31:40,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 10:31:42,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=683733.3333333334, ans=0.125 2023-09-30 10:31:45,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 10:31:47,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:31:48,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:31:48,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:31:52,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:31:53,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:31:55,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:32:00,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:02,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:04,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 10:32:04,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:32:05,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 10:32:08,690 INFO [train.py:1039] (3/4) Epoch 20, batch 1650, loss[loss=0.1743, simple_loss=0.2464, pruned_loss=0.05106, over 23326.00 frames. ], tot_loss[loss=0.1785, simple_loss=0.2541, pruned_loss=0.05149, over 4716352.29 frames. ], batch size: 119, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:32:09,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:09,217 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=683866.6666666666, ans=0.1 2023-09-30 10:32:11,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:32:13,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:32:13,421 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 10:32:13,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 10:32:13,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 10:32:13,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 10:32:13,810 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:32:18,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:32:20,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:20,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:32:20,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:32:20,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=683866.6666666666, ans=0.125 2023-09-30 10:32:21,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:32:24,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 10:32:26,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=683933.3333333334, ans=0.0 2023-09-30 10:32:27,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:32:27,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:32:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:32:27,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:32:28,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 10:32:28,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 10:32:35,804 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:32:38,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:32:39,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=683933.3333333334, ans=0.125 2023-09-30 10:32:44,982 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.skip_rate, batch_count=684000.0, ans=0.09899494936611666 2023-09-30 10:32:47,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 10:32:48,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:32:52,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 10:32:55,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:32:58,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:32:58,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:32:58,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:00,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:33:00,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:04,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:04,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:04,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=684066.6666666666, ans=0.125 2023-09-30 10:33:06,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:06,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:07,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:10,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:33:12,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:33:13,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 10:33:15,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:33:15,706 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=684133.3333333334, ans=0.0 2023-09-30 10:33:16,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 10:33:19,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 10:33:19,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 10:33:19,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:33:20,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:33:20,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:20,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:33:20,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 10:33:24,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:33:25,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:33:25,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:30,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 10:33:32,086 INFO [train.py:1039] (3/4) Epoch 20, batch 1700, loss[loss=0.1715, simple_loss=0.2453, pruned_loss=0.04891, over 23265.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2529, pruned_loss=0.05164, over 4716318.20 frames. ], batch size: 93, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:33:32,485 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=684200.0, ans=0.125 2023-09-30 10:33:33,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:33:33,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:33:33,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 10:33:36,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:33:36,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:33:36,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:38,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:33:38,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:33:39,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 10:33:41,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:33:43,433 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:33:50,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.95 vs. limit=22.5 2023-09-30 10:33:51,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:33:54,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:33:55,785 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.477e+02 1.837e+02 2.036e+02 2.305e+02 3.271e+02, threshold=4.073e+02, percent-clipped=0.0 2023-09-30 10:33:56,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=684266.6666666666, ans=0.125 2023-09-30 10:33:59,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:33:59,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:00,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:34:00,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:05,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 10:34:05,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:34:05,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:09,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:34:11,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:34:12,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 10:34:12,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 10:34:14,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:15,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 10:34:17,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:34:24,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.67 vs. limit=22.5 2023-09-30 10:34:27,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:27,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:28,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:34:30,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:34:30,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 10:34:31,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:34:33,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:33,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 10:34:34,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:34:34,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:36,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:34:36,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:34:39,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:34:39,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:34:41,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:34:43,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:34:43,486 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:47,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:48,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 10:34:50,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:34:51,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:34:53,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 10:34:54,998 INFO [train.py:1039] (3/4) Epoch 20, batch 1750, loss[loss=0.1696, simple_loss=0.2205, pruned_loss=0.05938, over 19161.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2508, pruned_loss=0.05123, over 4704875.79 frames. ], batch size: 388, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:35:00,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:01,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:03,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:35:03,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 10:35:03,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:35:03,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=684533.3333333334, ans=0.125 2023-09-30 10:35:06,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:35:06,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:11,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 10:35:13,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:14,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=684600.0, ans=0.125 2023-09-30 10:35:14,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=684600.0, ans=0.0 2023-09-30 10:35:16,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.20 vs. limit=22.5 2023-09-30 10:35:17,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 10:35:17,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:17,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=684600.0, ans=0.04949747468305833 2023-09-30 10:35:19,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:35:20,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:35:22,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 10:35:24,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:35:25,487 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 10:35:32,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:35:35,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:35:35,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:38,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:38,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:35:40,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:35:40,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=684666.6666666666, ans=0.09899494936611666 2023-09-30 10:35:43,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:35:43,708 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684733.3333333334, ans=0.1 2023-09-30 10:35:46,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:46,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:35:48,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 10:35:50,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:35:52,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 10:35:54,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:35:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:35:55,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:35:58,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:36:01,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 10:36:01,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:01,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:36:05,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:08,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:10,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:36:10,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=684800.0, ans=0.125 2023-09-30 10:36:11,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 10:36:11,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:13,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:36:13,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:13,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:36:14,213 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.96 vs. limit=22.5 2023-09-30 10:36:14,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:36:16,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:36:18,213 INFO [train.py:1039] (3/4) Epoch 20, batch 1800, loss[loss=0.1796, simple_loss=0.2646, pruned_loss=0.04732, over 24396.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2513, pruned_loss=0.05099, over 4715617.00 frames. ], batch size: 77, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:36:18,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:36:20,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:36:22,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:36:24,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:36:24,721 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=684866.6666666666, ans=0.1 2023-09-30 10:36:27,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:36:27,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=684866.6666666666, ans=0.125 2023-09-30 10:36:30,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:36:32,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:34,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:35,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:37,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:36:37,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:36:38,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 10:36:38,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:41,810 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.858e+02 2.042e+02 2.295e+02 4.168e+02, threshold=4.085e+02, percent-clipped=1.0 2023-09-30 10:36:42,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:36:46,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 10:36:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 10:36:48,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 10:36:49,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:36:51,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:36:51,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:36:53,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:37:02,102 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 10:37:02,432 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=685000.0, ans=0.1 2023-09-30 10:37:03,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:37:05,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:07,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 10:37:07,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 10:37:08,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:37:10,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:37:11,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:37:15,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 10:37:17,101 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.23 vs. limit=12.0 2023-09-30 10:37:18,936 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.05 vs. limit=6.0 2023-09-30 10:37:23,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:37:24,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 10:37:24,601 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.21 vs. limit=15.0 2023-09-30 10:37:25,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:37:25,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:25,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:37:27,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 10:37:29,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=685133.3333333334, ans=0.125 2023-09-30 10:37:30,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:37:30,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:37:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 10:37:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:37:36,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:36,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:37:37,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:38,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:37:39,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:37:41,852 INFO [train.py:1039] (3/4) Epoch 20, batch 1850, loss[loss=0.1654, simple_loss=0.251, pruned_loss=0.03986, over 24491.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.252, pruned_loss=0.05057, over 4720486.54 frames. ], batch size: 66, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:37:42,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:37:42,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:37:45,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:37:45,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:37:47,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=685200.0, ans=0.0 2023-09-30 10:37:51,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:37:51,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 10:37:54,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 10:37:59,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 10:38:01,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=685266.6666666666, ans=0.125 2023-09-30 10:38:01,084 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=685266.6666666666, ans=0.1 2023-09-30 10:38:04,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:04,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 10:38:04,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 10:38:08,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=685266.6666666666, ans=0.0 2023-09-30 10:38:14,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:38:17,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 10:38:19,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:38:20,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:25,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 10:38:25,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:25,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:38:26,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=685333.3333333334, ans=0.125 2023-09-30 10:38:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:38:28,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:38:30,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:38:32,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:38:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:33,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:38:33,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:38:37,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:39,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:38:41,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 10:38:43,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:38:47,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:38:48,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.31 vs. limit=15.0 2023-09-30 10:38:49,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:38:49,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 10:38:49,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 10:38:50,860 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 10:38:52,881 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 10:38:54,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:38:54,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:38:54,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:38:55,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,456 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 10:38:57,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:38:57,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:38:57,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=685466.6666666666, ans=0.0 2023-09-30 10:38:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:39:00,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:39:02,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:02,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 10:39:02,472 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=685533.3333333334, ans=0.2 2023-09-30 10:39:03,548 INFO [train.py:1039] (3/4) Epoch 20, batch 1900, loss[loss=0.1652, simple_loss=0.2451, pruned_loss=0.04267, over 24477.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2529, pruned_loss=0.05077, over 4728490.22 frames. ], batch size: 63, lr: 5.17e-03, grad_scale: 16.0 2023-09-30 10:39:05,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:05,099 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 10:39:05,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:39:06,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:08,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=685533.3333333334, ans=0.125 2023-09-30 10:39:13,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:39:14,220 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.22 vs. limit=15.0 2023-09-30 10:39:16,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:39:17,101 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 10:39:18,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 10:39:18,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:39:20,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:39:20,367 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 10:39:20,423 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 10:39:24,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 10:39:25,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:39:27,714 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.492e+02 1.870e+02 2.135e+02 2.444e+02 3.596e+02, threshold=4.270e+02, percent-clipped=0.0 2023-09-30 10:39:29,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 10:39:32,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 10:39:36,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=4.98 vs. limit=12.0 2023-09-30 10:39:41,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 10:39:45,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 10:39:45,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:39:45,829 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 10:39:45,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 10:39:47,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 10:39:47,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 10:39:47,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:39:51,288 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=685666.6666666666, ans=0.1 2023-09-30 10:39:51,403 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:39:52,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 10:39:56,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:39:59,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:39:59,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 10:40:01,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:40:03,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 10:40:03,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:03,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=685733.3333333334, ans=0.1 2023-09-30 10:40:10,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:40:10,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:40:10,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:40:12,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:40:13,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 10:40:15,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:40:15,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:40:18,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:18,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:21,140 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.28 vs. limit=15.0 2023-09-30 10:40:22,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:40:22,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:40:22,664 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=685800.0, ans=0.125 2023-09-30 10:40:23,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 10:40:23,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:40:24,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=685800.0, ans=0.0 2023-09-30 10:40:28,013 INFO [train.py:1039] (3/4) Epoch 20, batch 1950, loss[loss=0.1799, simple_loss=0.2622, pruned_loss=0.04884, over 24636.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2536, pruned_loss=0.05075, over 4735959.44 frames. ], batch size: 68, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:40:28,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:29,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:40:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:29,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:40:32,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 10:40:34,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:40:34,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:36,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:39,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:40:39,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:39,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:42,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:40:45,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:40:45,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:40:45,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 10:40:45,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:47,455 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:40:48,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:40:50,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.65 vs. limit=22.5 2023-09-30 10:40:51,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:40:51,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:40:51,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:40:51,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 10:40:53,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:40:54,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:40:55,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:40:58,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:02,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:41:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:41:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:41:07,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 10:41:09,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:09,986 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=686000.0, ans=0.2 2023-09-30 10:41:14,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:41:14,527 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=686000.0, ans=0.1 2023-09-30 10:41:15,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:41:15,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:25,147 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:25,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:29,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:30,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=686066.6666666666, ans=0.0 2023-09-30 10:41:32,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:36,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:41:36,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:41:37,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 10:41:37,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:41:38,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:41:39,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 10:41:41,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:41:46,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:41:47,136 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=8.85 vs. limit=22.5 2023-09-30 10:41:47,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:41:47,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:41:50,521 INFO [train.py:1039] (3/4) Epoch 20, batch 2000, loss[loss=0.1717, simple_loss=0.2574, pruned_loss=0.04297, over 24438.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2534, pruned_loss=0.05074, over 4720487.22 frames. ], batch size: 69, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:41:50,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:41:53,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:41:55,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 10:41:56,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 10:41:59,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:42:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 10:42:04,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 10:42:04,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:42:06,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:42:06,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=686266.6666666666, ans=0.1 2023-09-30 10:42:08,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 10:42:09,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:11,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:13,099 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.603e+02 1.882e+02 2.052e+02 2.299e+02 3.277e+02, threshold=4.104e+02, percent-clipped=0.0 2023-09-30 10:42:13,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 10:42:13,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:42:16,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 10:42:16,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:20,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:42:21,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 10:42:21,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:23,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:24,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:26,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 10:42:27,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.90 vs. limit=15.0 2023-09-30 10:42:29,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 10:42:29,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:42:29,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:42:34,400 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.99 vs. limit=15.0 2023-09-30 10:42:35,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:36,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:42:36,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:38,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:42:39,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:42:39,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:42,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:42:42,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:42:44,342 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:42:48,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:42:48,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 10:42:56,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:42:56,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:01,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:43:04,413 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=686466.6666666666, ans=0.125 2023-09-30 10:43:05,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:08,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:08,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:43:08,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:43:10,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:10,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:11,751 INFO [train.py:1039] (3/4) Epoch 20, batch 2050, loss[loss=0.1719, simple_loss=0.2576, pruned_loss=0.04313, over 24544.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2531, pruned_loss=0.0507, over 4720838.36 frames. ], batch size: 71, lr: 5.16e-03, grad_scale: 32.0 2023-09-30 10:43:13,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:43:15,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:17,735 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=686533.3333333334, ans=0.025 2023-09-30 10:43:20,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:43:23,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:43:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:43:25,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:43:27,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 10:43:27,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:43:31,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:43:31,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:43:39,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:40,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:43,078 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=4.17 vs. limit=15.0 2023-09-30 10:43:43,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 10:43:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:43:45,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=686666.6666666666, ans=0.125 2023-09-30 10:43:47,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 10:43:48,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:43:51,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:55,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:43:56,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.50 vs. limit=15.0 2023-09-30 10:43:57,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:43:57,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:43:59,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:44:00,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:44:00,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:44:05,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:06,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:44:10,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:44:10,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:44:13,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:18,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:44:19,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 10:44:25,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:27,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:44:29,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:44:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 10:44:34,067 INFO [train.py:1039] (3/4) Epoch 20, batch 2100, loss[loss=0.1642, simple_loss=0.2363, pruned_loss=0.04601, over 23642.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2516, pruned_loss=0.05014, over 4708470.25 frames. ], batch size: 149, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:44:35,767 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 10:44:35,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:35,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:44:37,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:44:37,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:44:37,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 10:44:37,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 10:44:39,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:44:43,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:44:44,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:44:47,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:44:47,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:44:47,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 10:44:49,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:44:49,152 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 10:44:49,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 10:44:52,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:44:52,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:44:52,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 10:44:52,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 10:44:58,445 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.615e+02 2.064e+02 2.437e+02 3.000e+02 4.850e+02, threshold=4.873e+02, percent-clipped=5.0 2023-09-30 10:44:58,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 10:44:58,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:45:02,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:03,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:45:07,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:45:07,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 10:45:09,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:09,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 10:45:12,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 10:45:13,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:13,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 10:45:13,782 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 10:45:13,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 10:45:17,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:45:19,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:45:19,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=687000.0, ans=0.125 2023-09-30 10:45:22,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 10:45:25,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:26,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:26,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 10:45:26,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:26,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:28,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:28,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 10:45:30,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 10:45:30,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 10:45:34,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:45:37,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:45:39,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 10:45:44,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:47,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:45:49,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:45:49,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:45:49,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 10:45:50,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:45:52,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:45:52,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:45:52,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:45:52,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:45:55,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 10:45:57,367 INFO [train.py:1039] (3/4) Epoch 20, batch 2150, loss[loss=0.1969, simple_loss=0.2693, pruned_loss=0.06226, over 23339.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2512, pruned_loss=0.05008, over 4707968.46 frames. ], batch size: 93, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:45:57,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 10:45:57,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:45:59,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:45:59,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:45:59,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:45:59,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:46:05,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 10:46:06,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:07,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:09,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:46:09,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:09,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:46:13,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:15,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:46:15,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:46:18,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=687266.6666666666, ans=0.1 2023-09-30 10:46:21,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:21,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 10:46:26,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:28,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:46:28,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:28,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:29,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:29,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:46:31,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:46:31,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:46:31,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:46:32,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 10:46:35,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:46:35,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:35,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:37,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:46:38,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:46:42,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:46:42,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:46:43,133 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=687333.3333333334, ans=12.0 2023-09-30 10:46:45,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:46:45,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 10:46:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 10:46:49,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:51,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:52,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:46:54,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:46:54,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:46:55,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:46:56,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 10:46:58,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 10:46:58,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:47:00,924 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 10:47:01,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:01,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:02,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 10:47:02,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:47:02,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 10:47:02,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 10:47:02,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 10:47:02,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 10:47:04,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:04,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:47:04,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:47:05,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:07,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 10:47:08,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:08,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:12,527 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.45 vs. limit=12.0 2023-09-30 10:47:18,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:47:18,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 10:47:19,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=6.83 vs. limit=15.0 2023-09-30 10:47:20,063 INFO [train.py:1039] (3/4) Epoch 20, batch 2200, loss[loss=0.1886, simple_loss=0.2782, pruned_loss=0.04951, over 24652.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2518, pruned_loss=0.05037, over 4706334.37 frames. ], batch size: 73, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:47:24,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:47:28,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:30,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:47:30,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:47:32,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 10:47:35,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:47:35,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:47:35,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 10:47:40,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 10:47:42,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 10:47:45,302 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.463e+02 1.904e+02 2.106e+02 2.500e+02 4.276e+02, threshold=4.212e+02, percent-clipped=0.0 2023-09-30 10:47:45,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 10:47:48,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:47:50,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:47:50,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 10:47:53,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:47:53,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 10:47:59,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:48:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:01,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 10:48:04,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:48:06,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:09,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:48:11,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:12,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 10:48:14,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:15,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 10:48:18,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:18,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 10:48:18,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:48:21,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:48:23,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:48:23,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:23,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:48:25,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:48:25,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:48:28,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:48:30,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.min_positive, batch_count=687800.0, ans=0.05 2023-09-30 10:48:31,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 10:48:31,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:48:35,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:48:35,664 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 10:48:38,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:48:38,891 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 10:48:40,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 10:48:41,852 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 10:48:43,249 INFO [train.py:1039] (3/4) Epoch 20, batch 2250, loss[loss=0.182, simple_loss=0.2475, pruned_loss=0.0582, over 23770.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2522, pruned_loss=0.0507, over 4699065.63 frames. ], batch size: 212, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:48:43,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:43,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:48:45,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:48:47,154 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 10:48:48,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:48:50,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:48:56,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:48:57,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:49:00,883 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=3.22 vs. limit=15.0 2023-09-30 10:49:01,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:01,775 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=687933.3333333334, ans=0.125 2023-09-30 10:49:03,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:03,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 10:49:07,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 10:49:07,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:07,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:49:08,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 10:49:09,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:49:10,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:11,975 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 10:49:17,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:18,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 10:49:18,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 10:49:21,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 10:49:21,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:49:23,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:49:28,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:28,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:49:31,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:49:31,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:49:35,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:49:36,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:49:38,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:49:40,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=688066.6666666666, ans=0.09899494936611666 2023-09-30 10:49:42,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 10:49:42,593 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=688066.6666666666, ans=0.125 2023-09-30 10:49:46,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:49:47,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 10:49:47,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:49:47,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=688066.6666666666, ans=0.0 2023-09-30 10:49:52,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:49:55,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:49:55,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 10:49:55,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:49:56,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:49:59,245 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.62 vs. limit=15.0 2023-09-30 10:49:59,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 10:50:02,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:50:03,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:06,371 INFO [train.py:1039] (3/4) Epoch 20, batch 2300, loss[loss=0.1705, simple_loss=0.2632, pruned_loss=0.03896, over 24425.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2538, pruned_loss=0.05154, over 4684482.60 frames. ], batch size: 69, lr: 5.16e-03, grad_scale: 16.0 2023-09-30 10:50:10,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:10,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:50:11,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 10:50:12,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=688200.0, ans=0.0 2023-09-30 10:50:14,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:21,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:50:21,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:50:23,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:50:23,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:23,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 10:50:25,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:50:26,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:28,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:50:30,979 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.587e+02 1.830e+02 1.981e+02 2.237e+02 3.602e+02, threshold=3.962e+02, percent-clipped=0.0 2023-09-30 10:50:32,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 10:50:34,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:50:38,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:50:42,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:50:43,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:50:46,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:50:47,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:50:52,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:50:54,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:50:54,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:50:54,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 10:50:57,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 10:50:57,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:00,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:00,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:51:01,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:04,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 10:51:04,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 10:51:04,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 10:51:04,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:51:04,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:06,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 10:51:14,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:51:18,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:51:22,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:51:22,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:51:22,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 10:51:25,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:51:25,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:51:27,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 10:51:28,898 INFO [train.py:1039] (3/4) Epoch 20, batch 2350, loss[loss=0.1861, simple_loss=0.2709, pruned_loss=0.05065, over 24359.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2541, pruned_loss=0.05168, over 4701210.88 frames. ], batch size: 77, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:51:28,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 10:51:35,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:51:35,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 10:51:43,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 10:51:44,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=688600.0, ans=0.125 2023-09-30 10:51:45,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=688600.0, ans=0.1 2023-09-30 10:51:46,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:51:49,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:49,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:51:50,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:51:50,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:51:53,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 10:51:55,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=688600.0, ans=0.0 2023-09-30 10:51:56,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:51:58,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=688600.0, ans=0.0 2023-09-30 10:52:01,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 10:52:02,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:52:04,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:52:04,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:52:08,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 10:52:10,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 10:52:10,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 10:52:13,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:52:13,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:14,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:52:15,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=688666.6666666666, ans=0.04949747468305833 2023-09-30 10:52:17,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 10:52:19,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 10:52:20,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:52:23,854 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.83 vs. limit=15.0 2023-09-30 10:52:24,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:52:24,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:52:26,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 10:52:26,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:52:28,434 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=688733.3333333334, ans=0.0 2023-09-30 10:52:29,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 10:52:29,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 10:52:34,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 10:52:38,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 10:52:38,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:52:38,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 10:52:38,970 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 10:52:41,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 10:52:43,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 10:52:46,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:52:49,360 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:52:50,530 INFO [train.py:1039] (3/4) Epoch 20, batch 2400, loss[loss=0.1671, simple_loss=0.2162, pruned_loss=0.05895, over 19521.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2531, pruned_loss=0.05154, over 4701623.00 frames. ], batch size: 388, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:52:52,118 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:52:55,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:52:55,695 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=688866.6666666666, ans=0.0 2023-09-30 10:52:57,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:52:58,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 10:52:59,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 10:53:08,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 10:53:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:10,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 10:53:10,565 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 10:53:11,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:53:11,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:12,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=688933.3333333334, ans=0.2 2023-09-30 10:53:13,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 10:53:14,623 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.600e+02 1.889e+02 2.089e+02 2.400e+02 4.035e+02, threshold=4.178e+02, percent-clipped=1.0 2023-09-30 10:53:15,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=688933.3333333334, ans=0.2 2023-09-30 10:53:18,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:22,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 10:53:27,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:53:31,776 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 10:53:35,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:53:35,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:53:40,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:53:42,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 10:53:42,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 10:53:50,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:52,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:53:56,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:53:56,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 10:53:56,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 10:53:57,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:53:57,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:53:57,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:53:57,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 10:54:01,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:02,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 10:54:02,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 10:54:04,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 10:54:07,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:54:07,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:54:09,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 10:54:09,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 10:54:09,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 10:54:09,163 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 10:54:10,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 10:54:10,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:54:12,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:12,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:14,369 INFO [train.py:1039] (3/4) Epoch 20, batch 2450, loss[loss=0.1642, simple_loss=0.248, pruned_loss=0.04021, over 24571.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2517, pruned_loss=0.05112, over 4702074.71 frames. ], batch size: 71, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:54:14,529 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 10:54:15,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:16,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 10:54:20,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 10:54:20,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:54:24,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:24,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:54:26,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 10:54:32,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:54:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:35,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:54:35,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:54:35,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 10:54:35,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 10:54:42,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:54:45,196 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 10:54:45,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:54:47,848 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.84 vs. limit=15.0 2023-09-30 10:54:50,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 10:54:50,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:54:52,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:54:53,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 10:54:55,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 10:54:58,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=689333.3333333334, ans=0.0 2023-09-30 10:55:02,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:05,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:55:05,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:06,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 10:55:06,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:08,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:55:08,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 10:55:08,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=689400.0, ans=0.1 2023-09-30 10:55:13,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 10:55:13,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:55:16,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:55:16,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:22,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 10:55:22,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 10:55:25,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:55:25,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:55:25,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 10:55:26,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:55:28,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 10:55:31,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 10:55:33,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:55:33,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 10:55:35,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=689533.3333333334, ans=0.125 2023-09-30 10:55:36,685 INFO [train.py:1039] (3/4) Epoch 20, batch 2500, loss[loss=0.1615, simple_loss=0.2101, pruned_loss=0.05647, over 19230.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2507, pruned_loss=0.05102, over 4684401.25 frames. ], batch size: 388, lr: 5.15e-03, grad_scale: 32.0 2023-09-30 10:55:36,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 10:55:38,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 10:55:40,778 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=689533.3333333334, ans=0.2 2023-09-30 10:55:44,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:44,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=689533.3333333334, ans=0.07 2023-09-30 10:55:46,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten.whitening_limit, batch_count=689533.3333333334, ans=15.0 2023-09-30 10:55:50,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=689533.3333333334, ans=0.1 2023-09-30 10:55:55,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 10:55:55,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:55:56,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:55:56,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 10:56:00,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=689600.0, ans=0.1 2023-09-30 10:56:01,700 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.618e+02 1.945e+02 2.213e+02 2.493e+02 3.500e+02, threshold=4.426e+02, percent-clipped=0.0 2023-09-30 10:56:03,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 10:56:03,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:05,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 10:56:05,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 10:56:05,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 10:56:07,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:07,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:08,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 10:56:08,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:10,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 10:56:10,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:14,800 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.21 vs. limit=15.0 2023-09-30 10:56:15,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:56:16,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:56:20,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 10:56:20,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 10:56:22,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:56:22,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:26,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:32,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:56:34,471 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.11 vs. limit=10.0 2023-09-30 10:56:35,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:40,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 10:56:43,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 10:56:43,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:56:43,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:56:46,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 10:56:46,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 10:56:48,477 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 10:56:48,478 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 10:56:48,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 10:56:53,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:56:55,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 10:56:55,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 10:56:56,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 10:56:56,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 10:56:59,585 INFO [train.py:1039] (3/4) Epoch 20, batch 2550, loss[loss=0.1997, simple_loss=0.2599, pruned_loss=0.06978, over 23434.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2513, pruned_loss=0.05081, over 4697696.14 frames. ], batch size: 285, lr: 5.15e-03, grad_scale: 16.0 2023-09-30 10:56:59,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 10:57:04,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:05,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:57:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:57:10,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:57:12,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 10:57:12,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:57:15,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 10:57:15,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 10:57:19,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:19,430 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=689933.3333333334, ans=0.125 2023-09-30 10:57:22,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:57:22,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 10:57:24,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:24,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:24,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:24,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=689933.3333333334, ans=0.2 2023-09-30 10:57:27,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 10:57:27,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 10:57:29,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 10:57:29,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:29,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 10:57:43,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 10:57:46,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:57:48,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:57:48,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 10:57:49,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 10:57:55,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 10:57:57,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=690066.6666666666, ans=0.125 2023-09-30 10:57:58,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 10:57:58,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 10:57:58,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 10:58:00,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 10:58:00,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 10:58:04,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=690133.3333333334, ans=0.0 2023-09-30 10:58:05,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:05,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:11,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:58:11,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 10:58:11,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 10:58:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:58:13,161 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 10:58:14,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 10:58:15,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:20,513 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=690200.0, ans=0.0 2023-09-30 10:58:21,659 INFO [train.py:1039] (3/4) Epoch 20, batch 2600, loss[loss=0.182, simple_loss=0.2619, pruned_loss=0.05102, over 24395.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2522, pruned_loss=0.05107, over 4703396.96 frames. ], batch size: 77, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:58:23,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:58:26,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:28,808 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 10:58:31,808 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 10:58:31,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 10:58:31,900 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 10:58:33,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 10:58:33,877 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 10:58:38,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:58:38,248 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 10:58:40,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 10:58:42,018 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 10:58:43,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 10:58:44,515 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=17.13 vs. limit=22.5 2023-09-30 10:58:45,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 10:58:46,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 10:58:46,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 10:58:48,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 10:58:49,609 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.499e+02 1.870e+02 2.130e+02 2.382e+02 3.027e+02, threshold=4.260e+02, percent-clipped=0.0 2023-09-30 10:58:51,229 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 10:58:51,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 10:58:51,617 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=690266.6666666666, ans=0.125 2023-09-30 10:58:55,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=690333.3333333334, ans=0.0 2023-09-30 10:58:58,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:58:59,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:58:59,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:58:59,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 10:59:03,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 10:59:03,902 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.29 vs. limit=10.0 2023-09-30 10:59:05,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=690333.3333333334, ans=0.125 2023-09-30 10:59:08,040 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=690333.3333333334, ans=0.0 2023-09-30 10:59:09,234 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 10:59:16,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:16,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:16,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 10:59:16,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:16,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 10:59:18,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 10:59:18,923 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.12 vs. limit=22.5 2023-09-30 10:59:21,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 10:59:22,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 10:59:23,024 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=690400.0, ans=0.0 2023-09-30 10:59:24,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:27,918 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 10:59:29,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 10:59:29,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 10:59:32,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=690466.6666666666, ans=0.125 2023-09-30 10:59:34,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 10:59:34,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 10:59:34,296 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 10:59:36,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 10:59:38,166 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=690466.6666666666, ans=0.09899494936611666 2023-09-30 10:59:39,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 10:59:39,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 10:59:43,299 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.28 vs. limit=6.0 2023-09-30 10:59:43,949 INFO [train.py:1039] (3/4) Epoch 20, batch 2650, loss[loss=0.1541, simple_loss=0.2331, pruned_loss=0.03755, over 24584.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2525, pruned_loss=0.05082, over 4710340.05 frames. ], batch size: 60, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 10:59:44,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 10:59:45,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:45,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=690533.3333333334, ans=0.125 2023-09-30 10:59:46,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=690533.3333333334, ans=0.125 2023-09-30 10:59:49,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 10:59:54,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 10:59:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 10:59:55,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 10:59:57,162 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 10:59:57,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:00,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:01,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:00:04,689 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:00:06,277 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:00:07,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 11:00:07,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:00:07,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:00:09,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 11:00:11,768 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 11:00:16,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:19,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 11:00:19,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:21,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 11:00:23,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:23,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:00:23,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:25,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:29,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 11:00:29,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 11:00:33,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:00:36,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 11:00:38,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:00:38,197 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:38,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:00:39,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:39,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:00:40,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=690733.3333333334, ans=0.1 2023-09-30 11:00:41,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:00:43,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:44,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:00:46,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:00:47,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:00:49,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:49,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:00:52,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:52,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:00:52,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:00:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:00:57,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:00:57,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:00:57,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 11:01:01,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:01,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=690800.0, ans=0.0 2023-09-30 11:01:03,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:05,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:06,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:06,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:01:08,174 INFO [train.py:1039] (3/4) Epoch 20, batch 2700, loss[loss=0.1832, simple_loss=0.2524, pruned_loss=0.05696, over 23639.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.253, pruned_loss=0.05097, over 4708524.91 frames. ], batch size: 232, lr: 5.15e-03, grad_scale: 8.0 2023-09-30 11:01:08,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:11,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:11,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 11:01:12,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:01:14,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:01:16,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:01:16,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:16,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:19,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:01:19,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:01:19,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:01:19,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:01:19,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 11:01:21,168 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:01:22,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:01:24,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:01:24,281 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:01:24,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=690933.3333333334, ans=0.125 2023-09-30 11:01:25,962 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:01:28,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:01:30,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 11:01:30,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:01:35,387 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.15 vs. limit=22.5 2023-09-30 11:01:35,957 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.889e+02 2.106e+02 2.437e+02 3.198e+02, threshold=4.211e+02, percent-clipped=0.0 2023-09-30 11:01:37,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:01:37,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:01:44,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=691000.0, ans=0.2 2023-09-30 11:01:45,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:01:45,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:01:45,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:01:46,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:01:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:01:51,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:01:51,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:01:51,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:01:56,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:01:56,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:02:04,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:02:04,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:08,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:02:08,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:10,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:12,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:12,332 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=691066.6666666666, ans=0.05 2023-09-30 11:02:13,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:02:15,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:16,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:02:16,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:19,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:02:20,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=691133.3333333334, ans=0.1 2023-09-30 11:02:21,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:21,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:02:24,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 11:02:26,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:28,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:02:28,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 11:02:29,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 11:02:29,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:30,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.49 vs. limit=12.0 2023-09-30 11:02:30,957 INFO [train.py:1039] (3/4) Epoch 20, batch 2750, loss[loss=0.1774, simple_loss=0.2587, pruned_loss=0.04811, over 24063.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2529, pruned_loss=0.05114, over 4715403.99 frames. ], batch size: 80, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:02:31,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=691200.0, ans=0.1 2023-09-30 11:02:31,465 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=691200.0, ans=0.02 2023-09-30 11:02:34,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:02:34,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:37,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:39,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:02:39,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:42,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:02:42,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:02:44,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:02:44,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:44,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 11:02:44,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:02:44,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:02:48,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=691266.6666666666, ans=0.0 2023-09-30 11:02:50,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 11:02:53,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:02:55,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:02:55,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:02:55,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:02:56,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:02:58,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:03:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:00,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:03,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:03:03,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:03:04,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:03:06,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:06,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:03:13,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:03:14,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:03:14,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:20,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:03:20,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:03:20,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:03:29,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:03:29,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:03:29,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 11:03:35,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:37,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 11:03:41,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=9.41 vs. limit=15.0 2023-09-30 11:03:42,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:03:45,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:03:45,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 11:03:46,050 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=691466.6666666666, ans=0.0 2023-09-30 11:03:47,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:03:48,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:03:48,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 11:03:50,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:03:52,299 INFO [train.py:1039] (3/4) Epoch 20, batch 2800, loss[loss=0.1717, simple_loss=0.2394, pruned_loss=0.052, over 23805.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2509, pruned_loss=0.05104, over 4704745.90 frames. ], batch size: 164, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:03:52,790 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=691533.3333333334, ans=0.0 2023-09-30 11:03:53,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:03:53,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:03:53,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:03:55,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 11:03:55,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:57,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:03:59,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:03:59,320 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 11:03:59,322 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 11:04:01,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:04,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:04:04,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:04:07,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:04:10,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 11:04:12,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:04:13,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 11:04:13,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:15,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:04:15,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:20,060 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.934e+02 2.198e+02 2.522e+02 3.773e+02, threshold=4.395e+02, percent-clipped=0.0 2023-09-30 11:04:20,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:21,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:04:21,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:04:21,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:04:27,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=691666.6666666666, ans=0.2 2023-09-30 11:04:27,600 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=691666.6666666666, ans=0.04949747468305833 2023-09-30 11:04:29,053 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=691666.6666666666, ans=0.125 2023-09-30 11:04:29,213 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=691666.6666666666, ans=0.125 2023-09-30 11:04:30,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:04:32,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:04:34,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:35,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:04:36,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:04:43,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:43,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 11:04:45,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:45,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:04:45,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:04:49,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:04:49,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer1.prob, batch_count=691733.3333333334, ans=0.125 2023-09-30 11:04:51,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:55,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:04:57,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:04:57,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:04:57,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:04:57,914 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.19 vs. limit=15.0 2023-09-30 11:04:58,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:05:00,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:05:00,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:05:00,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 11:05:00,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:05:02,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:02,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 11:05:04,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:04,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:05:06,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:05:07,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 11:05:08,323 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.12 vs. limit=15.0 2023-09-30 11:05:09,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=691800.0, ans=0.125 2023-09-30 11:05:10,829 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:05:15,441 INFO [train.py:1039] (3/4) Epoch 20, batch 2850, loss[loss=0.1702, simple_loss=0.2374, pruned_loss=0.05151, over 23561.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.25, pruned_loss=0.05052, over 4700285.62 frames. ], batch size: 256, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:05:15,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:05:15,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:05:16,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:05:20,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:22,652 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=691866.6666666666, ans=0.04949747468305833 2023-09-30 11:05:23,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:05:23,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:05:25,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:05:28,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:05:29,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:05:31,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:05:31,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 11:05:36,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=691933.3333333334, ans=0.125 2023-09-30 11:05:38,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 11:05:38,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:05:40,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 11:05:40,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:43,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 11:05:45,117 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 11:05:46,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:05:49,061 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=692000.0, ans=0.04949747468305833 2023-09-30 11:05:55,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.06 vs. limit=10.0 2023-09-30 11:06:00,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:01,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:01,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:06:01,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:06:01,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=692000.0, ans=0.2 2023-09-30 11:06:02,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=692000.0, ans=0.125 2023-09-30 11:06:03,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:06:03,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:06:06,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:06:06,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 11:06:09,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:06:09,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:09,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:11,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:12,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:12,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:16,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:18,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:06:19,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:06:21,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:21,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:25,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:06:28,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:06:28,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=692133.3333333334, ans=0.1 2023-09-30 11:06:29,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 11:06:30,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 11:06:32,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=692133.3333333334, ans=0.125 2023-09-30 11:06:33,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:06:34,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:34,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 11:06:35,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:06:35,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:06:36,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:36,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:06:36,523 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 11:06:36,613 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 11:06:36,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:37,966 INFO [train.py:1039] (3/4) Epoch 20, batch 2900, loss[loss=0.1658, simple_loss=0.2364, pruned_loss=0.0476, over 20366.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2498, pruned_loss=0.05018, over 4693474.84 frames. ], batch size: 44, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:06:38,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:06:44,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:06:44,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:06:44,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:06:45,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 11:06:49,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:06:50,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 11:06:50,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 11:06:51,013 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=692200.0, ans=0.125 2023-09-30 11:06:52,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:06:53,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:06:55,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:06:55,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:06:58,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:06:59,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:07:00,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=692266.6666666666, ans=0.125 2023-09-30 11:07:02,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:07:04,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 11:07:05,491 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.493e+02 1.792e+02 1.963e+02 2.139e+02 2.917e+02, threshold=3.926e+02, percent-clipped=0.0 2023-09-30 11:07:05,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:07:07,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:09,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 11:07:11,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 11:07:14,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:07:14,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 11:07:14,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:07:16,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:07:16,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 11:07:19,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:07:20,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:24,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:07:29,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:07:31,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 11:07:33,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 11:07:33,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:07:36,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:07:39,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 11:07:40,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:07:43,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=692466.6666666666, ans=0.0 2023-09-30 11:07:43,595 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=692466.6666666666, ans=0.125 2023-09-30 11:07:46,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:07:55,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:07:55,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:07:57,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 11:08:00,652 INFO [train.py:1039] (3/4) Epoch 20, batch 2950, loss[loss=0.1796, simple_loss=0.2469, pruned_loss=0.05617, over 23568.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.251, pruned_loss=0.05033, over 4694650.65 frames. ], batch size: 256, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:08:00,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:00,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 11:08:02,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:02,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:08:08,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:08:09,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 11:08:11,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:11,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:14,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:14,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:08:15,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 11:08:15,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 11:08:17,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:08:17,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:08:22,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:24,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:27,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:08:28,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:33,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:08:33,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:08:35,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:08:35,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:08:38,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 11:08:43,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 11:08:43,905 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 11:08:45,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:08:46,895 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 11:08:48,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 11:08:48,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:08:50,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:08:50,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 11:08:50,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:08:53,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 11:08:55,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:08:55,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:08:58,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:08:58,480 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=692733.3333333334, ans=0.125 2023-09-30 11:08:59,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:09:01,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:01,247 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 11:09:01,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:09:01,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 11:09:04,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=692800.0, ans=0.125 2023-09-30 11:09:08,787 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:10,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:10,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 11:09:11,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:09:13,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 11:09:16,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:16,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:09:17,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:09:18,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:09:19,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:09:20,169 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=692800.0, ans=0.125 2023-09-30 11:09:21,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:09:21,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=692866.6666666666, ans=0.125 2023-09-30 11:09:22,916 INFO [train.py:1039] (3/4) Epoch 20, batch 3000, loss[loss=0.2578, simple_loss=0.3099, pruned_loss=0.1028, over 19147.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2522, pruned_loss=0.05037, over 4705180.50 frames. ], batch size: 388, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:09:22,917 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 11:09:37,405 INFO [train.py:1071] (3/4) Epoch 20, validation: loss=0.3156, simple_loss=0.2725, pruned_loss=0.1794, over 1125622.00 frames. 2023-09-30 11:09:37,406 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 11:09:37,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:37,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:09:37,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:09:39,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:09:40,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:09:40,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:40,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 11:09:41,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:09:43,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:09:44,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:09:48,473 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 11:09:49,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 11:09:51,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:09:52,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:09:54,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 11:09:54,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:00,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:10:05,486 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.484e+02 1.867e+02 2.114e+02 2.609e+02 3.839e+02, threshold=4.228e+02, percent-clipped=0.0 2023-09-30 11:10:10,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:10:13,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=693000.0, ans=0.125 2023-09-30 11:10:15,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 11:10:17,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:10:21,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:10:22,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:10:22,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:10:25,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 11:10:26,634 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.85 vs. limit=10.0 2023-09-30 11:10:27,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 11:10:29,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:10:30,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:10:32,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:10:32,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:33,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:33,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:10:38,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:10:39,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:10:39,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:10:40,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:10:42,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 11:10:42,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:10:42,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:10:42,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:10:47,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:47,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:10:49,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 11:10:49,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 11:10:49,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:10:51,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 11:10:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:10:54,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 11:10:54,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=693133.3333333334, ans=0.0 2023-09-30 11:10:57,356 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:10:58,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:10:58,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 11:11:00,424 INFO [train.py:1039] (3/4) Epoch 20, batch 3050, loss[loss=0.1797, simple_loss=0.2515, pruned_loss=0.0539, over 23290.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.253, pruned_loss=0.05095, over 4692660.54 frames. ], batch size: 105, lr: 5.14e-03, grad_scale: 16.0 2023-09-30 11:11:00,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 11:11:00,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:11:01,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:11:03,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:11:03,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:11:03,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:03,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:11:06,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 11:11:08,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:10,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:11:15,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:17,123 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=10.18 vs. limit=22.5 2023-09-30 11:11:18,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 11:11:23,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 11:11:23,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 11:11:25,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:28,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:11:31,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:31,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:31,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:38,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:11:38,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:11:40,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:40,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:11:40,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:11:42,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:45,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:11:46,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:11:48,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 11:11:48,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:11:48,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:11:53,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:11:53,909 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.34 vs. limit=15.0 2023-09-30 11:11:54,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:11:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:11:56,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:00,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:12:00,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:01,117 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=693400.0, ans=0.1 2023-09-30 11:12:06,386 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=693400.0, ans=0.1 2023-09-30 11:12:09,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:10,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:10,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:12,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:12,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:12:12,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:12:13,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 11:12:15,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:12:16,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:16,856 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=693466.6666666666, ans=15.0 2023-09-30 11:12:17,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 11:12:19,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:23,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:12:25,039 INFO [train.py:1039] (3/4) Epoch 20, batch 3100, loss[loss=0.1697, simple_loss=0.2344, pruned_loss=0.0525, over 23853.00 frames. ], tot_loss[loss=0.178, simple_loss=0.2532, pruned_loss=0.05136, over 4688582.63 frames. ], batch size: 212, lr: 5.14e-03, grad_scale: 8.0 2023-09-30 11:12:26,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:12:28,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:12:30,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 11:12:31,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 11:12:33,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 11:12:35,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:12:39,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:12:39,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:42,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:12:47,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:12:52,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 11:12:55,876 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.538e+02 1.886e+02 2.147e+02 2.525e+02 3.564e+02, threshold=4.295e+02, percent-clipped=0.0 2023-09-30 11:12:57,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:12:57,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:12:57,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:12:59,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:12:59,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:13:00,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:13:02,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 11:13:02,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:13:03,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:05,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 11:13:06,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:13:12,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:13:13,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 11:13:14,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 11:13:15,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:18,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:13:20,665 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:20,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:20,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:13:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:13:22,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:13:24,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:13:24,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:13:24,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:24,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:13:28,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=693733.3333333334, ans=0.0 2023-09-30 11:13:29,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:13:30,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 11:13:32,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:13:33,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 11:13:35,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:13:35,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 11:13:38,741 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=693800.0, ans=0.125 2023-09-30 11:13:47,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 11:13:49,313 INFO [train.py:1039] (3/4) Epoch 20, batch 3150, loss[loss=0.1734, simple_loss=0.2553, pruned_loss=0.04574, over 24612.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2526, pruned_loss=0.05102, over 4694783.58 frames. ], batch size: 68, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:13:51,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:51,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:13:52,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:13:52,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:13:52,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 11:13:53,350 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=693866.6666666666, ans=0.1 2023-09-30 11:13:54,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:13:55,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:13:58,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 11:13:58,525 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=693866.6666666666, ans=0.0 2023-09-30 11:13:59,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:01,638 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 11:14:04,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 11:14:04,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=693933.3333333334, ans=0.125 2023-09-30 11:14:06,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:06,267 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 11:14:07,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 11:14:09,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 11:14:09,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 11:14:09,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 11:14:09,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:09,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:09,639 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=693933.3333333334, ans=0.0 2023-09-30 11:14:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:14:12,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.min_positive, batch_count=693933.3333333334, ans=0.025 2023-09-30 11:14:13,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 11:14:15,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:15,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:14:16,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=693933.3333333334, ans=0.0 2023-09-30 11:14:17,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:20,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:14:25,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 11:14:25,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:14:29,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:14:30,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:14:30,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 11:14:33,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 11:14:34,147 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=694000.0, ans=0.125 2023-09-30 11:14:35,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:14:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:14:36,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:14:36,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:36,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:14:37,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:14:37,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:14:39,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 11:14:40,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:14:40,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:42,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:14:42,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:14:43,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 11:14:43,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:45,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 11:14:46,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:47,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 11:14:47,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 11:14:50,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:14:50,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:14:51,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 11:14:52,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 11:14:52,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:14:56,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:14:57,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:14:57,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:14:58,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=694133.3333333334, ans=0.95 2023-09-30 11:15:05,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.12 vs. limit=6.0 2023-09-30 11:15:05,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:15:05,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:08,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 11:15:12,655 INFO [train.py:1039] (3/4) Epoch 20, batch 3200, loss[loss=0.1927, simple_loss=0.2512, pruned_loss=0.06704, over 23831.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2508, pruned_loss=0.05044, over 4692491.12 frames. ], batch size: 179, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:15:12,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:15:12,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 11:15:18,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:20,315 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:15:20,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 11:15:22,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:15:25,790 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:15:30,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:15:39,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:15:42,126 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.511e+02 1.858e+02 2.103e+02 2.505e+02 4.292e+02, threshold=4.206e+02, percent-clipped=0.0 2023-09-30 11:15:47,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=694333.3333333334, ans=0.125 2023-09-30 11:15:50,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 11:15:51,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:15:54,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 11:15:54,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:15:58,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:15:58,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:16:00,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:16:03,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 11:16:06,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 11:16:08,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 11:16:13,422 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 11:16:15,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:16:15,891 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=694400.0, ans=0.125 2023-09-30 11:16:21,784 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.prob, batch_count=694466.6666666666, ans=0.125 2023-09-30 11:16:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:22,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:16:23,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:23,092 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 11:16:23,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:16:25,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=694466.6666666666, ans=0.125 2023-09-30 11:16:26,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:27,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 11:16:29,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 11:16:30,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 11:16:30,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 11:16:33,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:16:35,055 INFO [train.py:1039] (3/4) Epoch 20, batch 3250, loss[loss=0.1773, simple_loss=0.2615, pruned_loss=0.04651, over 24569.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2507, pruned_loss=0.05053, over 4692132.02 frames. ], batch size: 71, lr: 5.13e-03, grad_scale: 16.0 2023-09-30 11:16:36,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:16:36,608 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 11:16:36,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:16:36,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:16:39,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 11:16:41,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=694533.3333333334, ans=0.125 2023-09-30 11:16:44,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:16:49,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:16:51,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=694600.0, ans=0.1 2023-09-30 11:16:56,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:16:56,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 11:16:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:16:57,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:16:57,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:16:59,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:16:59,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:17:02,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:02,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:17:02,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:04,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:04,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:10,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:11,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:17:14,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:14,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:17:15,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:17:15,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:17:15,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:19,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 11:17:21,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:17:21,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:17:21,514 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=694666.6666666666, ans=0.125 2023-09-30 11:17:22,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:23,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=694733.3333333334, ans=0.0 2023-09-30 11:17:24,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:17:31,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:17:31,929 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=694733.3333333334, ans=0.125 2023-09-30 11:17:33,435 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=694733.3333333334, ans=0.0 2023-09-30 11:17:36,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.56 vs. limit=15.0 2023-09-30 11:17:39,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:17:39,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:39,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 11:17:39,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:17:39,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:17:41,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:17:43,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 11:17:44,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 11:17:45,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:17:47,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:17:48,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:48,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 11:17:48,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:17:50,424 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=694800.0, ans=0.125 2023-09-30 11:17:53,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:17:53,360 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=694800.0, ans=0.0 2023-09-30 11:17:53,907 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.63 vs. limit=15.0 2023-09-30 11:17:54,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:17:56,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 11:17:56,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:17:57,610 INFO [train.py:1039] (3/4) Epoch 20, batch 3300, loss[loss=0.1732, simple_loss=0.2573, pruned_loss=0.04455, over 24431.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2512, pruned_loss=0.05047, over 4711515.34 frames. ], batch size: 69, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:17:57,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:17:57,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 11:18:01,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:18:01,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 11:18:04,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 11:18:04,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 11:18:06,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:11,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:18:12,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:18:12,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:12,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:18:13,637 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.52 vs. limit=22.5 2023-09-30 11:18:14,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:18:16,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:18,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:18:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 11:18:24,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:24,566 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:26,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:27,583 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 11:18:27,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:18:29,026 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.513e+02 1.888e+02 2.046e+02 2.323e+02 3.227e+02, threshold=4.091e+02, percent-clipped=0.0 2023-09-30 11:18:29,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:18:30,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:18:30,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:18:30,738 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 11:18:37,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:37,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:18:39,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:39,034 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 11:18:40,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 11:18:41,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:18:42,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:18:44,049 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 11:18:45,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 11:18:45,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:18:47,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 11:18:50,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:18:53,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:18:53,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:18:56,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:18:56,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:18:56,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:18:56,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:18:59,638 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=10.08 vs. limit=15.0 2023-09-30 11:19:00,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:19:01,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:01,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:19:03,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 11:19:03,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 11:19:08,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:19:08,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:08,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:08,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=695133.3333333334, ans=0.0 2023-09-30 11:19:09,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:19:09,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:11,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:19:11,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:11,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:19:14,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:19:15,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:19:18,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 11:19:20,113 INFO [train.py:1039] (3/4) Epoch 20, batch 3350, loss[loss=0.1875, simple_loss=0.2622, pruned_loss=0.05638, over 23360.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2518, pruned_loss=0.05051, over 4708367.69 frames. ], batch size: 105, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:19:20,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:21,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:23,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:19:23,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:19:25,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:27,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:19:27,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:30,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:19:31,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:33,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:19:35,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:37,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:19:38,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:40,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:19:40,873 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=12.42 vs. limit=22.5 2023-09-30 11:19:41,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 11:19:42,383 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=15.0 2023-09-30 11:19:43,310 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 11:19:45,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:19:48,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 11:19:48,467 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 11:19:48,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:19:48,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:19:48,978 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=695266.6666666666, ans=0.2 2023-09-30 11:19:50,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:19:52,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 11:19:52,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:52,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:19:55,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:55,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=695333.3333333334, ans=0.1 2023-09-30 11:19:56,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:19:57,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:19:59,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:20:02,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:05,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:06,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:10,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:20:10,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:20:10,850 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=695400.0, ans=0.125 2023-09-30 11:20:12,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:12,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:15,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:16,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 11:20:16,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:20:16,728 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 11:20:16,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:20:19,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 11:20:20,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:21,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:20:24,153 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=6.30 vs. limit=15.0 2023-09-30 11:20:28,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=695466.6666666666, ans=0.125 2023-09-30 11:20:30,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:32,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 11:20:32,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:20:33,656 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:20:33,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:20:39,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:20:41,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=695533.3333333334, ans=0.1 2023-09-30 11:20:43,110 INFO [train.py:1039] (3/4) Epoch 20, batch 3400, loss[loss=0.1566, simple_loss=0.2298, pruned_loss=0.04171, over 18667.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.2529, pruned_loss=0.051, over 4702326.46 frames. ], batch size: 40, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:20:43,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 11:20:43,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:20:43,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:20:44,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:20:44,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 11:20:46,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:20:46,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 11:20:47,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:47,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:20:49,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:20:49,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:20:50,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 11:20:55,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 11:20:55,846 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 11:20:55,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:00,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:21:00,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:21:02,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:04,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:21:07,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:10,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 11:21:14,107 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.519e+02 1.831e+02 2.000e+02 2.171e+02 2.770e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 11:21:15,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:21:18,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:18,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:20,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 11:21:26,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:21:30,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 11:21:37,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:21:37,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 11:21:37,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:21:38,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:21:40,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:21:40,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:21:41,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:21:43,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=695733.3333333334, ans=0.125 2023-09-30 11:21:48,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:21:48,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:21:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:21:54,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 11:21:56,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=695800.0, ans=0.125 2023-09-30 11:21:59,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:22:04,523 INFO [train.py:1039] (3/4) Epoch 20, batch 3450, loss[loss=0.1788, simple_loss=0.2623, pruned_loss=0.0476, over 24558.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2529, pruned_loss=0.05087, over 4690133.45 frames. ], batch size: 71, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:22:04,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 11:22:11,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 11:22:11,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:13,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:22:13,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 11:22:15,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:22:18,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:22:23,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=695933.3333333334, ans=0.1 2023-09-30 11:22:24,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:22:25,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:26,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:22:26,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:28,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:22:30,141 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=695933.3333333334, ans=0.125 2023-09-30 11:22:33,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 11:22:39,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 11:22:41,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:22:41,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:22:43,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:22:47,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=696000.0, ans=0.0 2023-09-30 11:22:50,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 11:22:51,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:22:52,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=696000.0, ans=0.2 2023-09-30 11:22:55,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:22:55,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:22:57,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:22:58,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:23:00,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 11:23:00,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:00,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=696066.6666666666, ans=0.1 2023-09-30 11:23:00,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=696066.6666666666, ans=0.0 2023-09-30 11:23:03,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:23:04,860 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=696066.6666666666, ans=0.0 2023-09-30 11:23:06,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:06,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=696066.6666666666, ans=0.125 2023-09-30 11:23:08,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 11:23:11,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:23:15,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:23:19,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:20,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:25,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:26,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:23:26,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:23:26,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:23:26,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer1.prob, batch_count=696200.0, ans=0.125 2023-09-30 11:23:27,419 INFO [train.py:1039] (3/4) Epoch 20, batch 3500, loss[loss=0.1737, simple_loss=0.2351, pruned_loss=0.05613, over 23875.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2519, pruned_loss=0.05085, over 4681176.27 frames. ], batch size: 195, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:23:31,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:34,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:23:34,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 11:23:37,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:23:40,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:23:43,216 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.79 vs. limit=22.5 2023-09-30 11:23:44,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:23:44,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 11:23:48,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:23:48,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:23:52,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:23:52,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:23:52,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:23:53,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:53,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:23:53,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=696266.6666666666, ans=10.0 2023-09-30 11:23:55,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 11:23:58,756 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.890e+02 2.132e+02 2.452e+02 4.334e+02, threshold=4.264e+02, percent-clipped=1.0 2023-09-30 11:23:58,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:23:58,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:23:59,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=696333.3333333334, ans=0.2 2023-09-30 11:23:59,719 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.74 vs. limit=15.0 2023-09-30 11:24:00,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:04,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 11:24:05,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:24:07,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:24:08,029 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.53 vs. limit=22.5 2023-09-30 11:24:10,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:24:10,569 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:11,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=696333.3333333334, ans=0.1 2023-09-30 11:24:12,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:24:12,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:14,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 11:24:14,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 11:24:15,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 11:24:15,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:24:17,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:18,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:18,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:24:23,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:24:23,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:24:30,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:24:30,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 11:24:32,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 11:24:32,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:24:35,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:35,372 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:36,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:41,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 11:24:43,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:24:45,257 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:24:45,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 11:24:46,941 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 11:24:48,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:24:49,919 INFO [train.py:1039] (3/4) Epoch 20, batch 3550, loss[loss=0.1981, simple_loss=0.2605, pruned_loss=0.06782, over 22871.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2502, pruned_loss=0.05079, over 4682022.35 frames. ], batch size: 322, lr: 5.13e-03, grad_scale: 8.0 2023-09-30 11:24:50,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:24:51,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:24:51,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:24:56,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:24:58,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=696533.3333333334, ans=0.0 2023-09-30 11:25:03,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.out_combiner.scale_min, batch_count=696533.3333333334, ans=0.2 2023-09-30 11:25:05,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:07,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 11:25:10,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:12,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:25:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:14,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:25:14,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:25:18,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:19,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:25:19,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:19,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:25:21,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:25:27,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:25:27,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:25:28,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:28,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:25:29,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:25:29,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 11:25:29,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:29,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=696666.6666666666, ans=0.125 2023-09-30 11:25:32,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:25:32,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 11:25:40,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:40,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:25:41,278 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.94 vs. limit=15.0 2023-09-30 11:25:42,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:25:44,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 11:25:44,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:25:45,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 11:25:47,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:25:48,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:25:48,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:25:54,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 11:25:56,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:25:59,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:01,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 11:26:02,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:05,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:26:07,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 11:26:12,329 INFO [train.py:1039] (3/4) Epoch 20, batch 3600, loss[loss=0.1795, simple_loss=0.2514, pruned_loss=0.05379, over 23657.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2501, pruned_loss=0.05052, over 4691370.14 frames. ], batch size: 232, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:26:14,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 11:26:14,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:26:16,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:26:17,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:17,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:26:19,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:26:23,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:24,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:26,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:26:26,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:26:27,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:27,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 11:26:28,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.scale_min, batch_count=696933.3333333334, ans=0.2 2023-09-30 11:26:30,860 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:26:32,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:34,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:38,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:40,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:26:40,556 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:26:41,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 11:26:42,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:26:43,317 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.766e+02 1.915e+02 2.241e+02 3.227e+02, threshold=3.831e+02, percent-clipped=0.0 2023-09-30 11:26:43,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:26:45,202 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:26:45,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:26:47,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:26:47,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=697000.0, ans=0.125 2023-09-30 11:26:48,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:26:49,162 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=697000.0, ans=0.2 2023-09-30 11:26:50,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 11:26:58,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:00,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:27:00,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 11:27:06,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:27:12,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:16,481 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:16,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=697133.3333333334, ans=0.1 2023-09-30 11:27:23,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:27:23,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:27:23,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 11:27:25,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 11:27:26,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 11:27:28,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:27:28,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:27:30,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer1.prob, batch_count=697133.3333333334, ans=0.125 2023-09-30 11:27:31,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 11:27:31,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:27:31,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:27:31,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:27:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 11:27:34,880 INFO [train.py:1039] (3/4) Epoch 20, batch 3650, loss[loss=0.1691, simple_loss=0.2555, pruned_loss=0.04131, over 24685.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2508, pruned_loss=0.0508, over 4690300.63 frames. ], batch size: 73, lr: 5.12e-03, grad_scale: 16.0 2023-09-30 11:27:34,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 11:27:36,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:27:38,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 11:27:44,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 11:27:45,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=697200.0, ans=0.0 2023-09-30 11:27:47,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:27:51,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 11:27:53,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 11:27:58,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:27:58,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:27:58,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:28:01,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 11:28:01,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=697266.6666666666, ans=0.0 2023-09-30 11:28:02,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:28:02,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 11:28:03,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=697266.6666666666, ans=0.0 2023-09-30 11:28:04,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:28:04,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:05,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 11:28:07,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:28:07,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:07,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:09,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:28:12,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 11:28:13,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 11:28:13,189 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=697333.3333333334, ans=0.1 2023-09-30 11:28:15,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:28:18,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 11:28:21,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:21,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:28:26,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:28:26,648 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=697400.0, ans=0.125 2023-09-30 11:28:29,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:29,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:28:29,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:28:31,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:28:33,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:28:35,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:28:36,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:36,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:28:38,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:28:38,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=697400.0, ans=0.0 2023-09-30 11:28:39,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:28:39,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:40,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.13 vs. limit=10.0 2023-09-30 11:28:43,049 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=697466.6666666666, ans=0.125 2023-09-30 11:28:46,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 11:28:50,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:28:50,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:28:52,020 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:28:53,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:28:54,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:28:56,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:28:58,366 INFO [train.py:1039] (3/4) Epoch 20, batch 3700, loss[loss=0.1466, simple_loss=0.2238, pruned_loss=0.03471, over 22251.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2515, pruned_loss=0.05039, over 4707171.30 frames. ], batch size: 49, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:28:58,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 11:28:58,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:00,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:29:03,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:29:03,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:29:06,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:06,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 11:29:06,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:29:08,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:29:08,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:29:11,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:29:16,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:29:17,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:19,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:29:19,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:29:20,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:29:22,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:24,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 11:29:31,978 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.879e+02 2.205e+02 2.626e+02 3.954e+02, threshold=4.410e+02, percent-clipped=1.0 2023-09-30 11:29:32,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:29:32,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:29:33,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:29:33,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 11:29:33,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:35,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:37,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 11:29:38,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:40,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:29:44,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:29:44,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:29:45,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:29:48,905 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:29:48,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 11:29:50,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:29:50,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 11:29:55,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:29:55,852 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=697733.3333333334, ans=0.2 2023-09-30 11:29:56,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:30:00,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:01,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 11:30:03,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:30:03,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:30:03,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:03,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:08,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:30:09,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 11:30:11,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 11:30:11,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:30:11,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:13,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:30:14,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:30:18,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:30:21,105 INFO [train.py:1039] (3/4) Epoch 20, batch 3750, loss[loss=0.1461, simple_loss=0.2288, pruned_loss=0.03169, over 24451.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.2521, pruned_loss=0.05067, over 4708105.26 frames. ], batch size: 66, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:30:21,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:30:21,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:30:24,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 11:30:25,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:30:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:30:29,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 11:30:29,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:30:31,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:33,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:30:34,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:30:35,755 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.70 vs. limit=15.0 2023-09-30 11:30:40,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:43,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:30:43,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:30:46,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:30:49,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:30:51,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 11:30:51,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:30:52,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:30:54,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:30:54,423 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=698000.0, ans=0.0 2023-09-30 11:30:57,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 11:31:02,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 11:31:02,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:31:05,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:31:05,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:11,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:13,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 11:31:16,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 11:31:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:23,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:31:23,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:31:26,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:31:30,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 11:31:30,229 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=698133.3333333334, ans=0.0 2023-09-30 11:31:33,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:31:35,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:31:36,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:31:38,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:31:40,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=698133.3333333334, ans=0.1 2023-09-30 11:31:45,025 INFO [train.py:1039] (3/4) Epoch 20, batch 3800, loss[loss=0.1843, simple_loss=0.2705, pruned_loss=0.04903, over 24456.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2526, pruned_loss=0.051, over 4708995.96 frames. ], batch size: 77, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:31:47,138 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:31:49,119 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=698200.0, ans=0.0 2023-09-30 11:31:50,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:31:52,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 11:31:53,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 11:31:56,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:31:57,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:31:59,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:32:00,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 11:32:00,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:01,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:32:03,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:32:03,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:32:03,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=698266.6666666666, ans=0.0 2023-09-30 11:32:04,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:06,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 11:32:09,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 11:32:09,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:32:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:15,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:32:15,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:32:18,366 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.534e+02 1.841e+02 1.992e+02 2.236e+02 3.615e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 11:32:18,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:32:18,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:20,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:20,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:32:27,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:32:27,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 11:32:28,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:36,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:32:42,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:32:45,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 11:32:48,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 11:32:49,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:32:51,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:32:53,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:32:53,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=698466.6666666666, ans=0.125 2023-09-30 11:32:55,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 11:32:58,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 11:32:58,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 11:33:00,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:00,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:33:06,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:33:07,654 INFO [train.py:1039] (3/4) Epoch 20, batch 3850, loss[loss=0.1761, simple_loss=0.2632, pruned_loss=0.0445, over 24348.00 frames. ], tot_loss[loss=0.1767, simple_loss=0.252, pruned_loss=0.05071, over 4712750.92 frames. ], batch size: 74, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:33:07,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:33:12,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:33:14,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 11:33:16,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:33:17,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:21,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:33:22,431 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.04 vs. limit=8.0 2023-09-30 11:33:23,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:26,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 11:33:28,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 11:33:34,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:36,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:33:39,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:39,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:33:41,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:41,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:33:43,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:33:43,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:33:43,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:45,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:33:46,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:46,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:33:48,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 11:33:48,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 11:33:49,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:33:49,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:53,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:33:55,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:33:55,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 11:33:58,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 11:34:00,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:00,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=698733.3333333334, ans=0.025 2023-09-30 11:34:02,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 11:34:04,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 11:34:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:34:13,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:13,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 11:34:16,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 11:34:18,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:18,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:21,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:34:21,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:34:23,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:23,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:34:23,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 11:34:23,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:34:27,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 11:34:27,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:27,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:28,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:34:29,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:30,649 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:34:32,490 INFO [train.py:1039] (3/4) Epoch 20, batch 3900, loss[loss=0.1809, simple_loss=0.2537, pruned_loss=0.05403, over 23308.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2502, pruned_loss=0.05023, over 4698473.41 frames. ], batch size: 105, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:34:32,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:34:32,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:34:32,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:32,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 11:34:32,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:37,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:38,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:38,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:34:40,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:34:43,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:34:43,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:46,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:34:46,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 11:34:46,489 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:34:47,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:34:49,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 11:34:49,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:34:51,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 11:34:52,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 11:34:58,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:34:59,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:34:59,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:35:01,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:04,754 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.560e+02 1.817e+02 2.055e+02 2.286e+02 3.490e+02, threshold=4.109e+02, percent-clipped=0.0 2023-09-30 11:35:05,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=699000.0, ans=0.125 2023-09-30 11:35:06,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:35:08,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:35:10,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:35:10,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:11,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:35:17,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:17,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:35:24,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 11:35:25,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:35:32,881 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=699066.6666666666, ans=0.0 2023-09-30 11:35:36,579 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.64 vs. limit=22.5 2023-09-30 11:35:37,911 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:35:41,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:43,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 11:35:43,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 11:35:43,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:35:45,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 11:35:46,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:35:48,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 11:35:53,934 INFO [train.py:1039] (3/4) Epoch 20, batch 3950, loss[loss=0.191, simple_loss=0.27, pruned_loss=0.05604, over 24387.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.25, pruned_loss=0.05049, over 4691924.59 frames. ], batch size: 77, lr: 5.12e-03, grad_scale: 8.0 2023-09-30 11:35:57,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:35:57,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 11:35:58,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:36:03,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:36:04,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:36:13,268 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 11:36:14,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:14,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 11:36:14,858 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 11:36:16,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:18,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:18,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:36:18,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:36:22,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 11:36:25,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:36:25,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:36:25,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:36:25,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:36:26,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.84 vs. limit=15.0 2023-09-30 11:36:26,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:36:36,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:36:36,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:36:41,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 11:36:48,203 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 11:36:48,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 11:36:48,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:36:50,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:36:50,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=699400.0, ans=0.125 2023-09-30 11:36:58,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:36:58,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:36:59,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:36:59,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:37:00,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 11:37:01,882 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=4.39 vs. limit=6.0 2023-09-30 11:37:04,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:37:05,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:37:08,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 11:37:15,289 INFO [train.py:1039] (3/4) Epoch 20, batch 4000, loss[loss=0.1964, simple_loss=0.2642, pruned_loss=0.06431, over 23550.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2507, pruned_loss=0.05062, over 4703659.00 frames. ], batch size: 135, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:37:15,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=699533.3333333334, ans=0.0 2023-09-30 11:37:20,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:26,202 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699533.3333333334, ans=0.1 2023-09-30 11:37:27,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:33,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:34,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:37:35,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:37:35,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 11:37:35,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:37:35,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 11:37:35,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:37:35,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 11:37:38,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:37:43,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:37:43,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:37:43,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:37:43,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:37:43,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:37:44,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:37:45,031 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 11:37:45,321 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=699600.0, ans=0.0 2023-09-30 11:37:46,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:37:46,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:37:48,529 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.791e+02 1.984e+02 2.297e+02 3.398e+02, threshold=3.968e+02, percent-clipped=0.0 2023-09-30 11:37:50,248 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 11:37:50,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:37:50,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:37:50,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=699666.6666666666, ans=0.1 2023-09-30 11:37:52,225 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=699666.6666666666, ans=0.125 2023-09-30 11:37:59,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 11:37:59,305 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:38:00,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:38:02,425 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 11:38:03,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:38:04,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 11:38:04,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:04,667 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.27 vs. limit=12.0 2023-09-30 11:38:05,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:05,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:38:07,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:38:08,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:38:08,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:38:10,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 11:38:11,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:38:13,196 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 11:38:19,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:38:22,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 11:38:25,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:38:26,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:27,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:38:28,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:38:31,606 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=699800.0, ans=0.0 2023-09-30 11:38:34,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:38:36,112 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.05 vs. limit=8.0 2023-09-30 11:38:36,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:38:36,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 11:38:37,995 INFO [train.py:1039] (3/4) Epoch 20, batch 4050, loss[loss=0.1493, simple_loss=0.23, pruned_loss=0.03432, over 24356.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2511, pruned_loss=0.05103, over 4699821.85 frames. ], batch size: 61, lr: 5.11e-03, grad_scale: 16.0 2023-09-30 11:38:39,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:38:39,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:38:41,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:38:41,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:38:42,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:47,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:38:50,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:38:50,731 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=699866.6666666666, ans=0.1 2023-09-30 11:38:52,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 11:38:55,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:38:55,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:38:59,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:02,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:39:03,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 11:39:07,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 11:39:07,451 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 11:39:09,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:39:14,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 11:39:15,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:19,237 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=700000.0, ans=0.1 2023-09-30 11:39:20,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:21,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:39:23,399 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:39:23,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:39:23,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=700000.0, ans=0.2 2023-09-30 11:39:26,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:39:28,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 11:39:28,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:39:31,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:33,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 11:39:39,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:39:39,574 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=700066.6666666666, ans=0.0 2023-09-30 11:39:44,631 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=700133.3333333334, ans=0.015 2023-09-30 11:39:46,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 11:39:47,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:39:47,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:39:49,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 11:39:49,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 11:39:49,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:39:52,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:39:53,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:39:53,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:40:00,017 INFO [train.py:1039] (3/4) Epoch 20, batch 4100, loss[loss=0.1653, simple_loss=0.2485, pruned_loss=0.04109, over 24473.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.252, pruned_loss=0.05126, over 4717885.83 frames. ], batch size: 66, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:40:01,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 11:40:02,360 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=12.40 vs. limit=15.0 2023-09-30 11:40:03,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 11:40:05,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 11:40:05,815 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=700200.0, ans=0.125 2023-09-30 11:40:07,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 11:40:07,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:09,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:09,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:40:10,785 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 11:40:14,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:14,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:40:14,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:40:16,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:40:18,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=700266.6666666666, ans=0.1 2023-09-30 11:40:21,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:40:22,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:40:22,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:40:22,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 11:40:24,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:24,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:40:25,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:25,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:40:25,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 11:40:28,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:30,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 11:40:31,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:40:34,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:40:34,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 11:40:36,130 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.811e+02 2.046e+02 2.303e+02 3.809e+02, threshold=4.092e+02, percent-clipped=0.0 2023-09-30 11:40:36,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:40:37,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:40:37,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:40:41,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 11:40:43,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:40:43,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:40:46,664 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 11:40:48,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:40:48,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:40:51,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:40:58,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:02,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:04,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:41:13,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:13,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:41:18,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:41:21,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:41:22,555 INFO [train.py:1039] (3/4) Epoch 20, batch 4150, loss[loss=0.1757, simple_loss=0.2486, pruned_loss=0.05144, over 23432.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2519, pruned_loss=0.05062, over 4722662.03 frames. ], batch size: 93, lr: 5.11e-03, grad_scale: 4.0 2023-09-30 11:41:25,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:41:27,611 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:41:27,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:41:27,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:28,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=700533.3333333334, ans=0.125 2023-09-30 11:41:30,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 11:41:30,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:32,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 11:41:32,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 11:41:32,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 11:41:34,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:41:40,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:41:40,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:44,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:41:45,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:41:46,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:41:48,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:41:48,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:41:50,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 11:41:55,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:41:59,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:00,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=700666.6666666666, ans=0.1 2023-09-30 11:42:01,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 11:42:02,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 11:42:03,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:42:04,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 11:42:04,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:42:04,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:06,287 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=700666.6666666666, ans=0.0 2023-09-30 11:42:09,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:11,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:15,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 11:42:18,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:20,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:42:20,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 11:42:20,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:42:23,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 11:42:25,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:42:26,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:42:26,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:28,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 11:42:28,258 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:42:29,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 11:42:31,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=21.16 vs. limit=22.5 2023-09-30 11:42:31,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 11:42:34,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 11:42:34,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:34,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:42:35,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:42:36,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 11:42:36,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:42:36,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 11:42:36,693 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:42:38,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=700800.0, ans=0.1 2023-09-30 11:42:39,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:42:39,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 11:42:41,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 11:42:41,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=700800.0, ans=0.0 2023-09-30 11:42:44,264 INFO [train.py:1039] (3/4) Epoch 20, batch 4200, loss[loss=0.1523, simple_loss=0.2326, pruned_loss=0.03601, over 24573.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2511, pruned_loss=0.05002, over 4727160.94 frames. ], batch size: 60, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:42:45,677 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.85 vs. limit=5.0 2023-09-30 11:42:46,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:42:46,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 11:42:47,185 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=700866.6666666666, ans=0.125 2023-09-30 11:42:49,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:42:50,161 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=700866.6666666666, ans=0.0 2023-09-30 11:42:50,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=700866.6666666666, ans=0.1 2023-09-30 11:42:51,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:42:54,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:42:54,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:54,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:42:57,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 11:42:59,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 11:43:00,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:03,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:03,883 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=700933.3333333334, ans=0.0 2023-09-30 11:43:06,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:43:09,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 11:43:11,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:12,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:12,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 11:43:12,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:43:12,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:14,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:43:14,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:43:16,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:43:17,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 11:43:19,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:43:20,930 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.891e+02 2.083e+02 2.448e+02 3.727e+02, threshold=4.165e+02, percent-clipped=0.0 2023-09-30 11:43:24,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=701000.0, ans=0.0 2023-09-30 11:43:25,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 11:43:27,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:43:28,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:43:30,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:43:32,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:43:32,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 11:43:32,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:33,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:43:33,971 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=701066.6666666666, ans=0.0 2023-09-30 11:43:40,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 11:43:40,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:43:46,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:43:48,917 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.60 vs. limit=15.0 2023-09-30 11:43:49,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 11:43:49,829 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=701133.3333333334, ans=0.125 2023-09-30 11:43:51,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:43:52,974 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=701133.3333333334, ans=0.1 2023-09-30 11:43:56,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:43:58,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:00,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 11:44:01,056 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=701133.3333333334, ans=0.2 2023-09-30 11:44:05,345 INFO [train.py:1039] (3/4) Epoch 20, batch 4250, loss[loss=0.1661, simple_loss=0.2457, pruned_loss=0.04326, over 19429.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2503, pruned_loss=0.0498, over 4709061.55 frames. ], batch size: 42, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:44:06,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 11:44:11,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 11:44:11,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 11:44:13,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:13,828 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=5.10 vs. limit=12.0 2023-09-30 11:44:15,358 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.34 vs. limit=15.0 2023-09-30 11:44:17,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:44:19,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 11:44:19,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:44:22,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=701266.6666666666, ans=0.125 2023-09-30 11:44:23,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:27,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:29,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:30,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:31,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.96 vs. limit=10.0 2023-09-30 11:44:32,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:44:32,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:44:34,625 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.73 vs. limit=15.0 2023-09-30 11:44:35,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:36,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:36,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:40,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:44:42,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:44:43,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 11:44:48,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 11:44:48,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:50,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:44:50,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:44:51,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:44:51,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:44:51,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:44:52,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=701333.3333333334, ans=0.125 2023-09-30 11:44:54,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 11:44:54,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 11:44:56,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=701400.0, ans=0.0 2023-09-30 11:44:59,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:01,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:03,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 11:45:03,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:45:03,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 11:45:04,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:45:06,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:45:07,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:07,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:45:09,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 11:45:11,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 11:45:12,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:45:18,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:45:21,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:23,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:45:23,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:45:23,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=701466.6666666666, ans=0.125 2023-09-30 11:45:25,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:26,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:45:27,999 INFO [train.py:1039] (3/4) Epoch 20, batch 4300, loss[loss=0.1821, simple_loss=0.2532, pruned_loss=0.05549, over 23764.00 frames. ], tot_loss[loss=0.175, simple_loss=0.25, pruned_loss=0.04998, over 4699545.06 frames. ], batch size: 179, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:45:28,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:45:28,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 11:45:29,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:34,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:45:34,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:45:38,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=701533.3333333334, ans=0.125 2023-09-30 11:45:39,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:45:40,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass_mid.scale_min, batch_count=701533.3333333334, ans=0.2 2023-09-30 11:45:44,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:45:44,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 11:45:45,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:45:45,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=701600.0, ans=0.125 2023-09-30 11:45:48,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:45:48,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 11:45:48,601 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 11:45:51,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 11:45:54,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:45:57,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 11:45:57,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:45:57,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 11:46:00,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:46:03,843 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.784e+02 1.943e+02 2.161e+02 2.799e+02, threshold=3.885e+02, percent-clipped=0.0 2023-09-30 11:46:03,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:46:05,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:46:05,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:46:07,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:46:08,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:10,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:46:10,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 11:46:12,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 11:46:15,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:46:15,737 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=701733.3333333334, ans=0.09899494936611666 2023-09-30 11:46:19,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:19,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:46:20,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:20,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:46:20,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 11:46:20,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 11:46:20,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 11:46:22,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:46:22,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 11:46:24,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 11:46:27,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:29,463 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 11:46:29,693 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=701733.3333333334, ans=0.125 2023-09-30 11:46:30,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:46:32,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:32,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:46:35,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 11:46:35,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 11:46:35,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:37,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:46:37,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:37,294 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:46:40,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:46:42,415 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.49 vs. limit=6.0 2023-09-30 11:46:43,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:46:44,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:46:44,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:46:49,810 INFO [train.py:1039] (3/4) Epoch 20, batch 4350, loss[loss=0.1577, simple_loss=0.2337, pruned_loss=0.04085, over 24544.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2502, pruned_loss=0.05022, over 4707876.18 frames. ], batch size: 60, lr: 5.11e-03, grad_scale: 8.0 2023-09-30 11:46:51,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 11:46:51,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 11:46:57,210 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:00,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:03,315 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.95 vs. limit=10.0 2023-09-30 11:47:04,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 11:47:04,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:47:08,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:47:13,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:47:13,718 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=701933.3333333334, ans=0.05 2023-09-30 11:47:15,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:47:15,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:18,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=701933.3333333334, ans=0.04949747468305833 2023-09-30 11:47:19,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:47:21,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:47:22,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:47:24,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=702000.0, ans=0.1 2023-09-30 11:47:30,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 11:47:30,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:30,968 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=2.50 vs. limit=15.0 2023-09-30 11:47:31,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:36,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:39,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 11:47:42,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:44,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:47:47,377 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 11:47:48,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:48,949 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:47:50,451 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 11:47:51,867 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 11:47:51,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:51,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:47:53,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:47:54,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:47:56,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:47:56,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:47:58,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 11:47:58,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:58,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:47:59,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:47:59,867 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.00 vs. limit=12.0 2023-09-30 11:48:00,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 11:48:01,957 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 11:48:01,966 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 11:48:01,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 11:48:05,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=702133.3333333334, ans=0.125 2023-09-30 11:48:07,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:48:09,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:48:09,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:09,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:48:10,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 11:48:13,802 INFO [train.py:1039] (3/4) Epoch 20, batch 4400, loss[loss=0.1809, simple_loss=0.2663, pruned_loss=0.04773, over 24580.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2507, pruned_loss=0.05008, over 4688635.52 frames. ], batch size: 71, lr: 5.10e-03, grad_scale: 16.0 2023-09-30 11:48:13,970 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 11:48:13,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:18,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:18,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:20,245 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:48:23,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 11:48:23,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 11:48:23,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 11:48:23,373 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 11:48:24,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 11:48:24,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:48:25,092 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=702200.0, ans=0.125 2023-09-30 11:48:27,890 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 11:48:29,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:31,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:31,086 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 11:48:34,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:35,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 11:48:35,074 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 11:48:36,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 11:48:38,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 11:48:38,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 11:48:38,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:41,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:48:42,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:48:45,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 11:48:45,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 11:48:47,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:49,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 11:48:49,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:48:49,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:48:50,496 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.844e+02 2.019e+02 2.293e+02 3.220e+02, threshold=4.037e+02, percent-clipped=0.0 2023-09-30 11:48:50,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:48:50,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 11:48:52,220 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 11:48:55,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:01,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:49:04,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 11:49:09,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:49:12,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:14,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:49:15,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 11:49:16,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:49:16,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:49:16,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:49:18,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:49:23,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 11:49:27,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 11:49:29,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 11:49:29,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:49:29,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 11:49:30,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:49:33,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:49:34,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.prob, batch_count=702533.3333333334, ans=0.125 2023-09-30 11:49:35,351 INFO [train.py:1039] (3/4) Epoch 20, batch 4450, loss[loss=0.1771, simple_loss=0.2504, pruned_loss=0.05193, over 24317.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2515, pruned_loss=0.04995, over 4708453.09 frames. ], batch size: 61, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:49:35,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 11:49:40,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:49:43,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:43,564 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 11:49:43,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer_na.min_abs, batch_count=702533.3333333334, ans=0.02 2023-09-30 11:49:50,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:49:50,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:49:54,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:49:57,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:50:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:50:01,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:01,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 11:50:01,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:03,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:03,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:03,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 11:50:06,517 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 11:50:08,912 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=702666.6666666666, ans=0.2 2023-09-30 11:50:10,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:10,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:11,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:50:11,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:50:13,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:50:16,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 11:50:18,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 11:50:18,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 11:50:18,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:50:22,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:22,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 11:50:27,296 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.90 vs. limit=15.0 2023-09-30 11:50:29,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 11:50:32,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:33,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 11:50:33,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:33,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:33,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:50:33,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:50:36,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:50:37,179 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:50:41,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 11:50:41,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 11:50:43,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 11:50:43,604 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=702800.0, ans=0.0 2023-09-30 11:50:44,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:50:46,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:50:47,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:50:47,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:50:50,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:50:54,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 11:50:55,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:50:57,870 INFO [train.py:1039] (3/4) Epoch 20, batch 4500, loss[loss=0.1763, simple_loss=0.2619, pruned_loss=0.04533, over 24657.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2515, pruned_loss=0.05039, over 4704810.87 frames. ], batch size: 73, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:51:01,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=702866.6666666666, ans=0.125 2023-09-30 11:51:03,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:04,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 11:51:04,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 11:51:04,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=702866.6666666666, ans=0.09899494936611666 2023-09-30 11:51:05,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:10,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:51:10,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:51:11,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 11:51:11,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:51:11,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:13,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:24,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=702933.3333333334, ans=0.125 2023-09-30 11:51:25,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:51:25,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:51:31,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:31,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 11:51:33,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:51:33,821 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=703000.0, ans=0.0 2023-09-30 11:51:36,925 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.958e+02 2.176e+02 2.653e+02 3.969e+02, threshold=4.352e+02, percent-clipped=0.0 2023-09-30 11:51:38,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:51:42,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:51:45,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:51:46,276 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=7.92 vs. limit=15.0 2023-09-30 11:51:48,374 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 11:51:48,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 11:51:48,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:51:49,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:50,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:51:51,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:51:55,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:51:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 11:51:55,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 11:51:55,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:00,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:52:00,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 11:52:06,629 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:06,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 11:52:08,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:52:09,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 11:52:12,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 11:52:12,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 11:52:15,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 11:52:16,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 11:52:17,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:20,728 INFO [train.py:1039] (3/4) Epoch 20, batch 4550, loss[loss=0.1661, simple_loss=0.2251, pruned_loss=0.05354, over 23450.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2508, pruned_loss=0.05021, over 4712984.06 frames. ], batch size: 285, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:52:22,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:22,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:52:24,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:28,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:52:30,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:52:34,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:52:34,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:52:34,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:52:38,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:52:38,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:52:41,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:52:44,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 11:52:45,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 11:52:47,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 11:52:48,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 11:52:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 11:52:53,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:52:56,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 11:52:58,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:53:02,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:03,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 11:53:05,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 11:53:07,532 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=703400.0, ans=0.0 2023-09-30 11:53:09,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:11,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:11,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:53:13,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:14,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 11:53:14,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 11:53:14,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:53:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 11:53:19,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 11:53:19,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:53:20,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:20,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:21,262 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.07 vs. limit=15.0 2023-09-30 11:53:23,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:23,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:53:25,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 11:53:25,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 11:53:26,224 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.41 vs. limit=15.0 2023-09-30 11:53:26,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:53:26,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:53:26,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 11:53:28,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 11:53:28,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 11:53:31,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 11:53:31,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:53:32,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.83 vs. limit=12.0 2023-09-30 11:53:35,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:53:35,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:53:35,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 11:53:36,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=703466.6666666666, ans=0.2 2023-09-30 11:53:38,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:53:38,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 11:53:40,331 INFO [train.py:1039] (3/4) Epoch 20, batch 4600, loss[loss=0.1809, simple_loss=0.2526, pruned_loss=0.05456, over 23709.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2488, pruned_loss=0.04996, over 4692771.42 frames. ], batch size: 149, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:53:40,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:53:42,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:53:45,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=703533.3333333334, ans=0.125 2023-09-30 11:53:46,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:53:47,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 11:53:48,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:49,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 11:53:51,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:53:56,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:53:56,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:53:57,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:01,340 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:02,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:04,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 11:54:05,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:07,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer1.prob, batch_count=703600.0, ans=0.125 2023-09-30 11:54:08,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:10,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=703600.0, ans=0.1 2023-09-30 11:54:11,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:54:11,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:18,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 11:54:18,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 11:54:19,794 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.438e+02 1.811e+02 2.166e+02 2.771e+02 4.574e+02, threshold=4.333e+02, percent-clipped=1.0 2023-09-30 11:54:19,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:54:20,910 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=19.37 vs. limit=22.5 2023-09-30 11:54:27,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:27,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:54:29,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 11:54:33,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 11:54:35,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 11:54:35,310 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=703733.3333333334, ans=0.125 2023-09-30 11:54:39,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:41,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:54:42,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:42,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 11:54:42,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:54:44,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 11:54:44,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:45,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:46,205 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=703800.0, ans=10.0 2023-09-30 11:54:47,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:54:47,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:54:48,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:54:48,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 11:54:51,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 11:54:51,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 11:54:51,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:53,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:54:53,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:54:55,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:55:00,877 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.57 vs. limit=15.0 2023-09-30 11:55:02,989 INFO [train.py:1039] (3/4) Epoch 20, batch 4650, loss[loss=0.1511, simple_loss=0.2306, pruned_loss=0.03578, over 24535.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2484, pruned_loss=0.0497, over 4705061.79 frames. ], batch size: 60, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:55:06,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:55:09,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:09,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:09,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:55:09,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:55:09,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:12,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:55:12,896 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=703866.6666666666, ans=0.2 2023-09-30 11:55:15,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 11:55:18,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:55:20,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 11:55:21,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:55:21,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 11:55:23,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:55:23,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 11:55:23,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 11:55:25,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:26,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 11:55:28,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 11:55:28,950 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=703933.3333333334, ans=0.0 2023-09-30 11:55:30,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:30,301 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 11:55:34,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:35,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 11:55:39,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:39,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 11:55:39,228 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 11:55:40,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:55:43,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 11:55:44,753 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.31 vs. limit=15.0 2023-09-30 11:55:45,616 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=704000.0, ans=0.1 2023-09-30 11:55:46,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:55:47,061 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:55:47,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=704000.0, ans=0.2 2023-09-30 11:55:52,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:54,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:55:56,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:55:56,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 11:55:56,414 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:56:00,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=704066.6666666666, ans=0.1 2023-09-30 11:56:01,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 11:56:01,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 11:56:01,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 11:56:01,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 11:56:03,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:07,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=704133.3333333334, ans=0.2 2023-09-30 11:56:10,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:56:10,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:10,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 11:56:10,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:12,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:12,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 11:56:14,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 11:56:17,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 11:56:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:56:19,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:56:22,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:22,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:56:22,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 11:56:22,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 11:56:23,863 INFO [train.py:1039] (3/4) Epoch 20, batch 4700, loss[loss=0.1824, simple_loss=0.2766, pruned_loss=0.04412, over 24299.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2496, pruned_loss=0.04982, over 4708424.07 frames. ], batch size: 74, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:56:24,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 11:56:27,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 11:56:28,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=704200.0, ans=0.1 2023-09-30 11:56:36,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:36,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:56:37,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:56:39,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:56:41,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 11:56:44,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 11:56:44,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 11:56:48,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:48,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:56:50,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:56:54,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:56:56,979 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.95 vs. limit=15.0 2023-09-30 11:56:59,819 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=9.16 vs. limit=15.0 2023-09-30 11:57:00,543 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.486e+02 1.959e+02 2.502e+02 2.880e+02 4.077e+02, threshold=5.005e+02, percent-clipped=0.0 2023-09-30 11:57:00,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:57:01,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=704333.3333333334, ans=0.125 2023-09-30 11:57:02,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 11:57:04,556 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=11.10 vs. limit=15.0 2023-09-30 11:57:05,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:57:11,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 11:57:12,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 11:57:16,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:19,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 11:57:21,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:57:24,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:57:24,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 11:57:27,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:27,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:30,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:57:30,412 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:57:30,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 11:57:31,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 11:57:33,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:57:35,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:35,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:36,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 11:57:36,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 11:57:38,873 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=704466.6666666666, ans=0.1 2023-09-30 11:57:42,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 11:57:44,863 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=704533.3333333334, ans=0.125 2023-09-30 11:57:45,948 INFO [train.py:1039] (3/4) Epoch 20, batch 4750, loss[loss=0.1819, simple_loss=0.2662, pruned_loss=0.04878, over 24355.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.251, pruned_loss=0.0501, over 4719742.74 frames. ], batch size: 77, lr: 5.10e-03, grad_scale: 8.0 2023-09-30 11:57:46,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:57:46,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:57:51,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:57:55,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 11:57:55,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:00,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 11:58:00,587 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 11:58:01,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 11:58:01,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:03,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:09,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 11:58:14,097 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 11:58:16,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 11:58:16,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:19,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:58:19,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:21,475 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 11:58:21,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 11:58:23,822 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=704666.6666666666, ans=0.0 2023-09-30 11:58:24,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 11:58:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:28,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:31,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 11:58:31,572 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 11:58:32,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:34,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 11:58:37,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 11:58:39,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 11:58:39,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 11:58:39,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:58:40,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 11:58:40,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:58:42,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 11:58:43,331 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.24 vs. limit=15.0 2023-09-30 11:58:43,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 11:58:48,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 11:58:50,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:58:53,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 11:58:53,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 11:58:53,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:58:55,643 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=704800.0, ans=0.125 2023-09-30 11:58:56,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:58:58,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 11:58:58,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:58:59,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 11:59:03,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:03,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 11:59:03,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 11:59:05,251 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 11:59:08,051 INFO [train.py:1039] (3/4) Epoch 20, batch 4800, loss[loss=0.1509, simple_loss=0.2282, pruned_loss=0.03685, over 24305.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2522, pruned_loss=0.05082, over 4716311.61 frames. ], batch size: 56, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 11:59:08,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 11:59:09,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 11:59:09,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 11:59:11,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=704866.6666666666, ans=0.0 2023-09-30 11:59:15,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:17,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:23,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 11:59:23,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:24,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:27,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 11:59:27,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 11:59:27,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 11:59:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 11:59:33,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 11:59:34,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:34,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 11:59:36,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:36,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 11:59:36,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:38,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 11:59:41,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 11:59:42,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 11:59:44,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 11:59:45,725 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.503e+02 1.894e+02 2.052e+02 2.398e+02 3.146e+02, threshold=4.103e+02, percent-clipped=0.0 2023-09-30 11:59:45,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 11:59:46,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:48,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 11:59:48,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 11:59:49,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=705000.0, ans=0.1 2023-09-30 11:59:50,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 11:59:50,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 11:59:51,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 11:59:51,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:51,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 11:59:53,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 11:59:55,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 11:59:59,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:04,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:04,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=705066.6666666666, ans=0.125 2023-09-30 12:00:05,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:12,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 12:00:12,447 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:12,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:12,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:00:14,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:18,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:00:20,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:00:20,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:21,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:00:21,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:00:23,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:00:23,804 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705133.3333333334, ans=0.1 2023-09-30 12:00:26,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:26,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:26,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:00:27,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 12:00:28,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705200.0, ans=0.1 2023-09-30 12:00:29,404 INFO [train.py:1039] (3/4) Epoch 20, batch 4850, loss[loss=0.1825, simple_loss=0.2511, pruned_loss=0.05701, over 23961.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2526, pruned_loss=0.05094, over 4716500.90 frames. ], batch size: 196, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:00:29,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 12:00:29,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:29,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:00:30,263 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.32 vs. limit=22.5 2023-09-30 12:00:33,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:00:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:35,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:00:45,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 12:00:46,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=705266.6666666666, ans=0.0 2023-09-30 12:00:47,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:50,977 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.whiten.whitening_limit, batch_count=705266.6666666666, ans=12.0 2023-09-30 12:00:51,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:00:52,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:00:52,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:00:55,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:00:57,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:00:59,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:00:59,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 12:00:59,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=705266.6666666666, ans=0.0 2023-09-30 12:01:01,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:01:05,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:01:06,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:01:06,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:01:06,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 12:01:10,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:01:10,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:14,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:14,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 12:01:15,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 12:01:16,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:01:23,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:01:23,652 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 12:01:24,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:01:25,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:01:26,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:01:26,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 12:01:26,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:28,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 12:01:28,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:29,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:30,155 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:01:31,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 12:01:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:01:45,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:01:45,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:01:47,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.21 vs. limit=22.5 2023-09-30 12:01:48,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705466.6666666666, ans=0.1 2023-09-30 12:01:49,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 12:01:49,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:01:53,043 INFO [train.py:1039] (3/4) Epoch 20, batch 4900, loss[loss=0.1626, simple_loss=0.2485, pruned_loss=0.03838, over 24461.00 frames. ], tot_loss[loss=0.1768, simple_loss=0.2521, pruned_loss=0.05076, over 4724577.24 frames. ], batch size: 66, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:01:56,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:01:58,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:01:58,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:02:02,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 12:02:02,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=705533.3333333334, ans=0.0 2023-09-30 12:02:08,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 12:02:12,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 12:02:13,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 12:02:13,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:13,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:02:15,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:02:15,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:15,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:02:15,557 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 12:02:20,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 12:02:20,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:02:23,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:02:25,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:02:28,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:02:28,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:29,517 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.474e+02 1.910e+02 2.109e+02 2.496e+02 4.455e+02, threshold=4.218e+02, percent-clipped=1.0 2023-09-30 12:02:31,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:31,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 12:02:33,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:02:33,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:02:33,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 12:02:33,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 12:02:39,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 12:02:41,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:02:42,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:02:42,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:02:44,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:02:44,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:02:44,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:02:44,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 12:02:49,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:02:49,832 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=705733.3333333334, ans=0.0 2023-09-30 12:02:51,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:02:52,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:02:52,918 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=705733.3333333334, ans=0.1 2023-09-30 12:02:56,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 12:02:56,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:02:56,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:02:56,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 12:02:56,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=705800.0, ans=0.2 2023-09-30 12:03:03,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:05,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:06,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 12:03:06,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:06,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:03:09,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:14,435 INFO [train.py:1039] (3/4) Epoch 20, batch 4950, loss[loss=0.1504, simple_loss=0.2361, pruned_loss=0.03241, over 24418.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2508, pruned_loss=0.04986, over 4716882.45 frames. ], batch size: 63, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:03:14,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:14,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:03:15,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:03:15,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 12:03:17,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:03:21,297 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:21,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:03:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 12:03:24,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 12:03:24,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:03:24,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 12:03:26,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:26,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:03:26,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:03:26,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:29,990 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:03:30,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:03:31,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:03:33,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:03:34,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:34,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:03:38,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:03:43,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:46,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:03:47,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:03:49,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:50,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:03:52,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 12:03:52,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 12:03:56,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:03:58,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:03:58,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:03:59,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:03:59,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:04:01,194 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:04:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:06,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:04:09,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:04:10,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:10,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:12,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 12:04:12,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:04:12,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:04:17,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:04:19,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:04:19,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:04:20,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:20,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:04:22,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:04:22,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=706133.3333333334, ans=0.125 2023-09-30 12:04:23,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:04:23,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:04:23,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:04:26,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 12:04:28,188 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=706133.3333333334, ans=0.125 2023-09-30 12:04:34,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:37,586 INFO [train.py:1039] (3/4) Epoch 20, batch 5000, loss[loss=0.1819, simple_loss=0.2605, pruned_loss=0.05164, over 23373.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2503, pruned_loss=0.04986, over 4702039.58 frames. ], batch size: 93, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:04:39,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 12:04:39,345 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:04:44,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=706200.0, ans=0.125 2023-09-30 12:04:47,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:04:47,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:04:47,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 12:04:48,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 12:04:50,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:04:50,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=706200.0, ans=0.125 2023-09-30 12:04:52,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 12:04:54,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:04:54,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:04:54,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 12:04:55,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:04:55,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:04:57,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 12:04:57,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:04:57,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:04:58,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 12:05:00,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 12:05:01,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:05:02,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 12:05:02,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:05:02,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:03,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:05:03,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 12:05:03,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 12:05:03,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 12:05:05,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:06,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:07,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 12:05:09,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:05:11,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:11,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:05:12,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:05:14,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 12:05:14,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:05:15,915 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.770e+02 2.000e+02 2.327e+02 3.746e+02, threshold=4.000e+02, percent-clipped=0.0 2023-09-30 12:05:16,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:05:16,583 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer2.prob, batch_count=706333.3333333334, ans=0.125 2023-09-30 12:05:19,371 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 12:05:22,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:05:24,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:05:24,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:26,428 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=706400.0, ans=0.125 2023-09-30 12:05:29,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 12:05:29,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:05:29,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:30,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:05:32,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 12:05:32,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:33,699 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.75 vs. limit=15.0 2023-09-30 12:05:36,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:05:36,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=706400.0, ans=0.1 2023-09-30 12:05:37,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:05:38,491 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=706400.0, ans=0.0 2023-09-30 12:05:43,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 12:05:47,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:05:55,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=706466.6666666666, ans=0.0 2023-09-30 12:05:57,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:05:59,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:00,965 INFO [train.py:1039] (3/4) Epoch 20, batch 5050, loss[loss=0.1716, simple_loss=0.2439, pruned_loss=0.0496, over 23250.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2515, pruned_loss=0.05065, over 4698092.40 frames. ], batch size: 105, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:06:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:06:01,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:01,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:06:01,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:06:01,240 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:06,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 12:06:07,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:06:09,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:09,980 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.whiten, num_groups=1, num_channels=512, metric=5.26 vs. limit=12.0 2023-09-30 12:06:11,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:06:12,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 12:06:15,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:15,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:06:16,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:06:18,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:06:19,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:06:20,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=706600.0, ans=0.0 2023-09-30 12:06:23,341 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.72 vs. limit=15.0 2023-09-30 12:06:29,123 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=706600.0, ans=0.2 2023-09-30 12:06:30,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 12:06:30,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:06:31,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:31,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 12:06:34,385 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:06:37,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:37,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:06:37,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:06:37,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 12:06:38,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 12:06:39,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:40,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:06:42,488 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=706666.6666666666, ans=0.125 2023-09-30 12:06:42,871 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.14 vs. limit=15.0 2023-09-30 12:06:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:06:45,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 12:06:47,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:06:49,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 12:06:51,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:06:53,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:06:53,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:06:53,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:06:55,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:06:58,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:06:58,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:06:59,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:06:59,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:06:59,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 12:06:59,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:07:01,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:07:05,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:07:05,958 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 12:07:05,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:07:07,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:09,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:09,552 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 12:07:13,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:13,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 12:07:13,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:17,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:18,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:18,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 12:07:20,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 12:07:24,222 INFO [train.py:1039] (3/4) Epoch 20, batch 5100, loss[loss=0.1803, simple_loss=0.2704, pruned_loss=0.0451, over 24320.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2522, pruned_loss=0.05081, over 4703891.79 frames. ], batch size: 74, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:07:24,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:24,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:24,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:07:27,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 12:07:30,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:07:31,005 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=706866.6666666666, ans=0.0 2023-09-30 12:07:33,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 12:07:33,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 12:07:35,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:37,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:07:40,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:07:40,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 12:07:40,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 12:07:46,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:07:46,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:07:51,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:07:54,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 12:07:54,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:07:58,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:07:58,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 12:08:00,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,551 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.521e+02 1.819e+02 2.005e+02 2.241e+02 3.147e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 12:08:01,681 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:01,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 12:08:03,941 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 12:08:05,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:05,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 12:08:05,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 12:08:09,322 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.91 vs. limit=6.0 2023-09-30 12:08:10,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:08:18,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:20,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 12:08:21,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 12:08:21,583 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 12:08:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 12:08:23,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:08:26,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 12:08:29,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 12:08:31,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=707133.3333333334, ans=0.125 2023-09-30 12:08:33,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 12:08:35,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:08:37,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 12:08:40,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:08:40,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 12:08:46,125 INFO [train.py:1039] (3/4) Epoch 20, batch 5150, loss[loss=0.1917, simple_loss=0.2685, pruned_loss=0.05743, over 23213.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.253, pruned_loss=0.05063, over 4712447.01 frames. ], batch size: 105, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:08:46,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:08:47,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:08:47,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:08:47,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:08:47,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:08:49,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:08:49,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=707200.0, ans=0.0 2023-09-30 12:08:51,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 12:08:51,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 12:08:51,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 12:08:51,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:08:51,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 12:08:53,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:08:53,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:08:56,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:08:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:02,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:09:02,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 12:09:03,017 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=707266.6666666666, ans=0.95 2023-09-30 12:09:04,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:04,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:09:04,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=707266.6666666666, ans=0.0 2023-09-30 12:09:07,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:09:07,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:07,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:07,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:09:07,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:09:10,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 12:09:11,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:09:11,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:15,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:09:16,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 12:09:18,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:09:21,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=707333.3333333334, ans=0.125 2023-09-30 12:09:23,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:09:24,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 12:09:29,303 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:09:30,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-09-30 12:09:33,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=707333.3333333334, ans=0.125 2023-09-30 12:09:36,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:09:38,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:09:42,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:42,810 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer2.prob, batch_count=707400.0, ans=0.125 2023-09-30 12:09:44,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:47,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 12:09:49,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:09:51,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:09:51,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:09:53,861 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.14 vs. limit=10.0 2023-09-30 12:09:56,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:09:56,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:09:59,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 12:10:03,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:04,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:10:07,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:10:07,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:10:09,502 INFO [train.py:1039] (3/4) Epoch 20, batch 5200, loss[loss=0.1652, simple_loss=0.252, pruned_loss=0.03914, over 24568.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2526, pruned_loss=0.04988, over 4729499.91 frames. ], batch size: 71, lr: 5.09e-03, grad_scale: 16.0 2023-09-30 12:10:09,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:10:11,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:10:11,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:10:11,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:10:14,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:10:15,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=707533.3333333334, ans=0.125 2023-09-30 12:10:17,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:10:18,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:23,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 12:10:24,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:10:24,311 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=707600.0, ans=0.125 2023-09-30 12:10:24,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=707600.0, ans=0.125 2023-09-30 12:10:26,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:28,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:10:30,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:31,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 12:10:33,723 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:10:33,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:36,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 12:10:38,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:10:38,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:10:40,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 12:10:40,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 12:10:41,153 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:10:44,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 12:10:46,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:10:46,097 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 12:10:46,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:10:46,418 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=707666.6666666666, ans=0.1 2023-09-30 12:10:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:10:47,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:10:48,962 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.576e+02 1.872e+02 2.079e+02 2.395e+02 3.722e+02, threshold=4.157e+02, percent-clipped=0.0 2023-09-30 12:10:49,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 12:10:50,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:10:52,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:10:54,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 12:10:55,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 12:10:55,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 12:10:57,100 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=707666.6666666666, ans=0.0 2023-09-30 12:11:01,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 12:11:01,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:11:06,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:11:06,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:07,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 12:11:08,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=707733.3333333334, ans=0.1 2023-09-30 12:11:09,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:11:09,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:11:09,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:10,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:12,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=707733.3333333334, ans=0.0 2023-09-30 12:11:13,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:13,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:11:18,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:11:18,565 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=707800.0, ans=0.0 2023-09-30 12:11:19,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:19,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:20,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=707800.0, ans=0.125 2023-09-30 12:11:25,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:25,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 12:11:26,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:11:26,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:11:30,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:11:30,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:11:31,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:11:33,427 INFO [train.py:1039] (3/4) Epoch 20, batch 5250, loss[loss=0.1836, simple_loss=0.2461, pruned_loss=0.06054, over 23736.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2516, pruned_loss=0.05057, over 4718786.80 frames. ], batch size: 232, lr: 5.08e-03, grad_scale: 16.0 2023-09-30 12:11:34,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:11:35,758 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.26 vs. limit=15.0 2023-09-30 12:11:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:38,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:11:39,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:11:47,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:11:49,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:11:50,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:11:52,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:11:54,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 12:11:54,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:11:56,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:12:10,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=708000.0, ans=0.125 2023-09-30 12:12:23,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=708066.6666666666, ans=0.2 2023-09-30 12:12:24,511 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=11.54 vs. limit=15.0 2023-09-30 12:12:25,412 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=708066.6666666666, ans=0.0 2023-09-30 12:12:28,313 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=708066.6666666666, ans=0.07 2023-09-30 12:12:47,657 INFO [train.py:1039] (3/4) Epoch 20, batch 5300, loss[loss=0.1918, simple_loss=0.2638, pruned_loss=0.05985, over 23281.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2496, pruned_loss=0.05004, over 4703291.13 frames. ], batch size: 119, lr: 5.08e-03, grad_scale: 8.0 2023-09-30 12:13:02,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:13:02,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 12:13:02,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 12:13:02,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:03,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:03,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:03,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:03,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:13:04,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:13:04,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 12:13:04,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 12:13:04,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 12:13:04,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:13:04,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 12:13:05,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 12:13:05,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:05,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:05,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:05,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:06,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:13:06,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:06,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:13:06,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:07,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:13:07,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:13:07,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:13:07,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:07,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:13:08,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 12:13:08,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:13:08,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:13:08,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 12:13:08,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 12:13:08,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:13:08,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:08,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 12:13:09,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 12:13:09,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:09,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:13:10,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:13:10,816 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 12:13:10,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 12:13:10,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:13:11,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:13:11,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 12:13:11,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 12:13:11,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 12:13:11,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:13:17,107 INFO [train.py:1039] (3/4) Epoch 21, batch 0, loss[loss=0.1669, simple_loss=0.2475, pruned_loss=0.04314, over 24288.00 frames. ], tot_loss[loss=0.1669, simple_loss=0.2475, pruned_loss=0.04314, over 24288.00 frames. ], batch size: 61, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:13:17,108 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 12:13:30,291 INFO [train.py:1071] (3/4) Epoch 21, validation: loss=0.2775, simple_loss=0.2715, pruned_loss=0.1418, over 1125622.00 frames. 2023-09-30 12:13:30,293 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 12:13:34,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 12:13:34,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:13:37,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:13:40,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:42,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:13:42,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:42,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 12:13:43,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 12:13:47,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:47,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:13:50,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:13:50,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:13:52,056 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.491e+02 1.851e+02 2.011e+02 2.315e+02 3.678e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:13:52,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:13:53,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 12:13:56,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:14:05,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:14:05,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:07,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 12:14:12,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:14:12,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:14:15,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:19,206 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:14:22,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:14:27,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 12:14:27,755 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=708480.0, ans=0.5 2023-09-30 12:14:29,295 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=708480.0, ans=0.125 2023-09-30 12:14:30,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 12:14:30,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:30,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:32,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:14:32,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:14:33,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 12:14:37,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:37,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:14:40,393 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.08 vs. limit=6.0 2023-09-30 12:14:42,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:14:43,099 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=708546.6666666666, ans=10.0 2023-09-30 12:14:45,947 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 12:14:47,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:14:51,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:14:52,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:14:53,950 INFO [train.py:1039] (3/4) Epoch 21, batch 50, loss[loss=0.1683, simple_loss=0.2403, pruned_loss=0.04814, over 23216.00 frames. ], tot_loss[loss=0.1787, simple_loss=0.2539, pruned_loss=0.0517, over 1062979.82 frames. ], batch size: 105, lr: 4.96e-03, grad_scale: 16.0 2023-09-30 12:14:54,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 12:14:54,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:14:54,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:14:57,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:14:57,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:00,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:15:00,613 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=708613.3333333334, ans=0.125 2023-09-30 12:15:00,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=708613.3333333334, ans=0.125 2023-09-30 12:15:04,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 12:15:04,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:06,861 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=708613.3333333334, ans=0.2 2023-09-30 12:15:11,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:15:13,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 12:15:15,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 12:15:17,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:15:18,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:18,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:20,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:15:20,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:15:21,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:15:21,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:15:29,079 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=708746.6666666666, ans=0.125 2023-09-30 12:15:33,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:34,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:15:34,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:15:36,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 12:15:37,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:15:39,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:15:39,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 12:15:40,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:15:42,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 12:15:44,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer_na.min_abs, batch_count=708813.3333333334, ans=0.02 2023-09-30 12:15:46,010 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=708813.3333333334, ans=0.0 2023-09-30 12:15:49,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:15:49,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:15:50,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:51,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:15:51,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:15:56,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 12:15:56,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 12:15:59,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:15:59,349 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:16:00,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:16:02,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:16:02,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 12:16:04,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 12:16:04,450 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 12:16:05,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:07,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:16:08,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 12:16:08,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 12:16:10,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:11,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=23.26 vs. limit=22.5 2023-09-30 12:16:11,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:13,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:16:13,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:16:14,846 INFO [train.py:1039] (3/4) Epoch 21, batch 100, loss[loss=0.1749, simple_loss=0.2605, pruned_loss=0.04466, over 24637.00 frames. ], tot_loss[loss=0.1784, simple_loss=0.2546, pruned_loss=0.05112, over 1864815.94 frames. ], batch size: 68, lr: 4.96e-03, grad_scale: 8.0 2023-09-30 12:16:16,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:16:18,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:16:22,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:24,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 12:16:24,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:16:30,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:16:30,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:30,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:16:30,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:16:30,204 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:16:33,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 12:16:35,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:16:36,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:36,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:36,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:16:38,920 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.524e+02 1.766e+02 1.906e+02 2.268e+02 3.553e+02, threshold=3.812e+02, percent-clipped=0.0 2023-09-30 12:16:40,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 12:16:42,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:42,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:16:42,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:16:45,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:16:48,666 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 12:16:48,693 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 12:16:50,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:16:50,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:16:54,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:16:56,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:16:58,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:03,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:04,124 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=709146.6666666666, ans=0.1 2023-09-30 12:17:05,301 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 12:17:06,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:17:09,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:12,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:12,479 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=709146.6666666666, ans=0.1 2023-09-30 12:17:13,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:15,981 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.78 vs. limit=12.0 2023-09-30 12:17:16,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:18,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:19,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:17:20,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=709213.3333333334, ans=0.125 2023-09-30 12:17:21,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:22,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:24,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:24,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:17:24,268 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:24,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 12:17:24,390 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 12:17:24,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:25,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:17:26,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:26,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:26,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 12:17:26,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:17:27,555 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:17:27,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:29,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:30,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:32,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:17:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:17:33,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=709213.3333333334, ans=0.125 2023-09-30 12:17:34,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:17:35,823 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=256, metric=11.07 vs. limit=15.0 2023-09-30 12:17:36,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=709280.0, ans=0.125 2023-09-30 12:17:37,850 INFO [train.py:1039] (3/4) Epoch 21, batch 150, loss[loss=0.169, simple_loss=0.2505, pruned_loss=0.04373, over 24466.00 frames. ], tot_loss[loss=0.1795, simple_loss=0.2553, pruned_loss=0.05189, over 2491471.52 frames. ], batch size: 66, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:17:37,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:17:37,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:17:38,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:41,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:17:41,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:43,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:17:44,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:49,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 12:17:49,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 12:17:49,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 12:17:54,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:17:54,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:17:54,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:17:55,943 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:17:55,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:17:56,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:17:57,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:00,474 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 12:18:02,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:09,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:11,274 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=709413.3333333334, ans=0.125 2023-09-30 12:18:12,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:18:14,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 12:18:15,192 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.80 vs. limit=15.0 2023-09-30 12:18:17,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:18:17,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:18:17,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:20,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:18:21,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:18:22,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:18:24,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=709413.3333333334, ans=0.125 2023-09-30 12:18:25,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:25,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 12:18:31,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:31,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:18:33,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:18:33,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:18:34,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:18:35,179 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer1.prob, batch_count=709480.0, ans=0.125 2023-09-30 12:18:36,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 12:18:37,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:18:40,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:18:41,082 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=709480.0, ans=0.125 2023-09-30 12:18:42,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:43,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:18:43,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 12:18:44,045 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.min_positive, batch_count=709546.6666666666, ans=0.05 2023-09-30 12:18:45,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:18:45,803 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 12:18:50,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:18:53,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:18:53,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:18:56,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 12:18:56,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:18:58,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:00,185 INFO [train.py:1039] (3/4) Epoch 21, batch 200, loss[loss=0.1929, simple_loss=0.2716, pruned_loss=0.05709, over 24376.00 frames. ], tot_loss[loss=0.1804, simple_loss=0.256, pruned_loss=0.05235, over 2993550.46 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:19:00,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 12:19:00,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:19:02,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:03,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:09,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:19:09,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:19:09,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:13,121 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=709613.3333333334, ans=0.2 2023-09-30 12:19:22,967 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.816e+02 2.113e+02 2.431e+02 3.187e+02, threshold=4.227e+02, percent-clipped=0.0 2023-09-30 12:19:26,982 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=9.80 vs. limit=15.0 2023-09-30 12:19:30,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:19:30,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:19:33,330 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=709746.6666666666, ans=0.1 2023-09-30 12:19:34,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:19:34,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:19:35,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:19:35,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:19:37,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:19:39,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:19:39,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:19:39,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:19:40,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 12:19:40,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 12:19:42,244 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:19:45,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:19:53,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:19:57,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=2.93 vs. limit=10.0 2023-09-30 12:20:02,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:02,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:20:10,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:14,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 12:20:14,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:14,798 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=709880.0, ans=0.0 2023-09-30 12:20:15,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:20:15,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:17,370 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:20:20,279 INFO [train.py:1039] (3/4) Epoch 21, batch 250, loss[loss=0.1728, simple_loss=0.2501, pruned_loss=0.04778, over 23283.00 frames. ], tot_loss[loss=0.1796, simple_loss=0.2552, pruned_loss=0.05197, over 3375753.44 frames. ], batch size: 119, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:20:20,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 12:20:20,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:20:20,507 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 12:20:23,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:23,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:20:25,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:26,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:20:30,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:20:30,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:20:32,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:20:36,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:20:38,990 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=710013.3333333334, ans=0.1 2023-09-30 12:20:46,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:20:48,170 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:20:49,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:20:50,387 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=710013.3333333334, ans=0.125 2023-09-30 12:20:56,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:20:57,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:20:57,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:20:57,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:20:59,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:20:59,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:21:00,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:21:02,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:21:05,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 12:21:06,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:21:08,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:21:09,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:21:09,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:21:09,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:12,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:21:12,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:21:14,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:15,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:21:17,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:21,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:21:26,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:29,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:21:34,947 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:21:36,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:21:38,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:21:40,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 12:21:42,806 INFO [train.py:1039] (3/4) Epoch 21, batch 300, loss[loss=0.1676, simple_loss=0.2431, pruned_loss=0.04601, over 16779.00 frames. ], tot_loss[loss=0.1778, simple_loss=0.2541, pruned_loss=0.05078, over 3680401.11 frames. ], batch size: 36, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:21:43,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:21:43,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:21:44,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 12:21:44,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:21:45,830 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.00 vs. limit=6.0 2023-09-30 12:21:46,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:21:46,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 12:21:52,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:21:53,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:21:56,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:21:58,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 12:21:58,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:22:00,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:22:00,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 12:22:00,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:05,040 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.476e+02 1.836e+02 2.048e+02 2.213e+02 3.686e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:22:05,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:22:08,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:22:08,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 12:22:09,312 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=7.91 vs. limit=15.0 2023-09-30 12:22:14,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 12:22:14,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:17,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:18,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:18,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 12:22:18,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:22:19,208 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=710413.3333333334, ans=0.0 2023-09-30 12:22:20,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=710413.3333333334, ans=0.125 2023-09-30 12:22:21,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:22:24,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:22:24,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:29,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:22:29,571 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 12:22:31,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:22:32,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:32,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=710480.0, ans=0.0 2023-09-30 12:22:34,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 12:22:34,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:38,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:22:41,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:22:41,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 12:22:46,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:46,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:22:50,526 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:52,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:22:53,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 12:22:53,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:22:53,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=710546.6666666666, ans=0.125 2023-09-30 12:22:54,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:22:56,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 12:22:56,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:22:56,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:22:57,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.scale_min, batch_count=710546.6666666666, ans=0.2 2023-09-30 12:22:58,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:22:59,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:22:59,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:04,736 INFO [train.py:1039] (3/4) Epoch 21, batch 350, loss[loss=0.187, simple_loss=0.2514, pruned_loss=0.06129, over 23595.00 frames. ], tot_loss[loss=0.1769, simple_loss=0.2527, pruned_loss=0.05053, over 3915839.22 frames. ], batch size: 256, lr: 4.95e-03, grad_scale: 8.0 2023-09-30 12:23:04,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:04,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 12:23:05,153 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=710613.3333333334, ans=0.125 2023-09-30 12:23:09,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:14,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:23:17,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:17,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:21,910 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 12:23:22,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:23:23,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 12:23:26,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:26,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 12:23:26,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:31,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 12:23:32,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:23:32,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:23:34,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:23:37,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:37,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:23:38,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:23:38,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:38,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:23:41,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:23:41,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:23:44,971 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-30 12:23:47,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:23:47,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:23:48,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:23:50,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:23:56,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 12:23:56,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:24:01,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:01,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:01,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:24:03,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 12:24:03,545 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=710813.3333333334, ans=0.0 2023-09-30 12:24:07,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:08,933 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 12:24:10,458 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 12:24:10,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:15,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:24:15,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 12:24:17,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:18,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:24:19,207 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=710880.0, ans=0.0 2023-09-30 12:24:22,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:22,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:22,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:25,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:24:26,559 INFO [train.py:1039] (3/4) Epoch 21, batch 400, loss[loss=0.1829, simple_loss=0.2649, pruned_loss=0.05045, over 24308.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.04999, over 4099163.83 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:24:27,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=710946.6666666666, ans=0.2 2023-09-30 12:24:28,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:24:29,206 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=710946.6666666666, ans=0.0 2023-09-30 12:24:30,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:24:30,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 12:24:32,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:32,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:35,482 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.45 vs. limit=10.0 2023-09-30 12:24:36,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:24:36,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:39,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:40,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:42,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 12:24:42,788 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=711013.3333333334, ans=10.0 2023-09-30 12:24:43,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 12:24:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:44,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 12:24:44,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:49,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:24:49,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:49,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 12:24:49,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:24:49,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:24:50,712 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.787e+02 1.990e+02 2.388e+02 3.650e+02, threshold=3.979e+02, percent-clipped=0.0 2023-09-30 12:24:50,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:24:50,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:24:52,517 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 12:24:52,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 12:24:57,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:24:59,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:24:59,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=711080.0, ans=0.125 2023-09-30 12:25:00,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 12:25:01,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 12:25:01,730 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=711080.0, ans=0.125 2023-09-30 12:25:04,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:25:09,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:15,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 12:25:18,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:25:21,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 12:25:22,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:25:25,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:25:25,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 12:25:30,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:25:31,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:25:33,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:25:37,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:25:37,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 12:25:40,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:25:41,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 12:25:41,280 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=711213.3333333334, ans=0.0 2023-09-30 12:25:44,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:25:44,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:25:44,927 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=711213.3333333334, ans=0.125 2023-09-30 12:25:46,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 12:25:47,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:25:49,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:25:49,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:25:50,708 INFO [train.py:1039] (3/4) Epoch 21, batch 450, loss[loss=0.1786, simple_loss=0.266, pruned_loss=0.04558, over 24376.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2516, pruned_loss=0.05006, over 4241288.14 frames. ], batch size: 77, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:25:50,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 12:25:50,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:25:51,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:25:51,181 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=711280.0, ans=0.125 2023-09-30 12:25:52,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:25:52,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 12:25:54,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:25:56,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:25:57,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:26:02,234 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:26:08,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:10,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:12,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 12:26:12,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 12:26:15,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:26:17,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:19,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:22,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:24,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:26:28,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 12:26:28,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 12:26:29,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 12:26:29,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:26:31,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:26:32,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:26:34,446 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 12:26:34,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 12:26:34,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:26:36,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:26:37,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:26:40,478 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=7.82 vs. limit=15.0 2023-09-30 12:26:43,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:26:44,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:26:44,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:26:44,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 12:26:49,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:26:51,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:26:51,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:26:54,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 12:26:56,234 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=711546.6666666666, ans=0.125 2023-09-30 12:26:58,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:26:59,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 12:27:00,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 12:27:01,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:27:07,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:27:09,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:10,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:27:10,909 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 12:27:13,749 INFO [train.py:1039] (3/4) Epoch 21, batch 500, loss[loss=0.1837, simple_loss=0.2559, pruned_loss=0.05572, over 22755.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2532, pruned_loss=0.05114, over 4345283.73 frames. ], batch size: 322, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:27:15,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:17,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:27:17,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:17,340 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 12:27:19,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 12:27:19,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:27:21,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:27:25,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 12:27:27,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:27:30,959 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:27:30,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:27:31,378 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=711680.0, ans=0.0 2023-09-30 12:27:32,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:33,357 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=711680.0, ans=0.0 2023-09-30 12:27:37,394 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.852e+02 2.048e+02 2.259e+02 3.327e+02, threshold=4.095e+02, percent-clipped=0.0 2023-09-30 12:27:41,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=711680.0, ans=0.0 2023-09-30 12:27:42,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:42,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:27:42,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:27:42,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:44,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 12:27:44,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:27:46,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:27:46,279 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=711746.6666666666, ans=0.1 2023-09-30 12:27:47,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:27:48,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:27:48,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:27:49,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 12:27:52,617 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 12:27:56,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:27:56,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:57,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:27:59,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:28:02,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 12:28:05,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:28:06,305 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=711813.3333333334, ans=0.125 2023-09-30 12:28:07,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:10,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:14,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:28:16,622 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=12.43 vs. limit=15.0 2023-09-30 12:28:19,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:20,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 12:28:20,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:20,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:28:24,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 12:28:25,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:28:27,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:33,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 12:28:35,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 12:28:35,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:35,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 12:28:35,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:28:35,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:28:37,330 INFO [train.py:1039] (3/4) Epoch 21, batch 550, loss[loss=0.1728, simple_loss=0.2542, pruned_loss=0.04567, over 24505.00 frames. ], tot_loss[loss=0.1783, simple_loss=0.2542, pruned_loss=0.05122, over 4416698.91 frames. ], batch size: 63, lr: 4.95e-03, grad_scale: 16.0 2023-09-30 12:28:37,435 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:37,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:28:39,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:28:42,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:28:43,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 12:28:43,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:28:47,557 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=711946.6666666666, ans=0.0 2023-09-30 12:28:48,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:28:48,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:52,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:28:55,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:28:57,023 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=712013.3333333334, ans=0.125 2023-09-30 12:28:59,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 12:28:59,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 12:29:02,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:29:08,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:29:08,271 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:11,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:29:15,088 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:15,097 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 12:29:15,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:29:16,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:29:19,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:29:19,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:29:21,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:29:21,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:23,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 12:29:26,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 12:29:27,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:27,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:29:29,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:29:29,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:29:33,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:29:35,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:29:37,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:29:37,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:39,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 12:29:40,051 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=712146.6666666666, ans=0.125 2023-09-30 12:29:41,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:29:42,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:44,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:29:45,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:29:47,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:29:47,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 12:29:47,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=712213.3333333334, ans=0.125 2023-09-30 12:29:52,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 12:29:54,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 12:29:58,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:29:58,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:29:58,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:29:58,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=712280.0, ans=0.1 2023-09-30 12:29:59,465 INFO [train.py:1039] (3/4) Epoch 21, batch 600, loss[loss=0.1641, simple_loss=0.2503, pruned_loss=0.03892, over 24620.00 frames. ], tot_loss[loss=0.1776, simple_loss=0.2536, pruned_loss=0.05077, over 4486544.86 frames. ], batch size: 68, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:30:05,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:30:07,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:30:09,129 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 12:30:11,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:30:12,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:14,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:17,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 12:30:17,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:30:21,811 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.505e+02 1.803e+02 1.996e+02 2.235e+02 3.480e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 12:30:24,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 12:30:26,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer2.prob, batch_count=712346.6666666666, ans=0.125 2023-09-30 12:30:27,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:30:27,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:28,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:30:34,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:30:34,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:30:34,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:40,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:30:44,552 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:30:44,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:30:44,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:30:46,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=14.15 vs. limit=22.5 2023-09-30 12:30:53,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 12:31:00,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:31:00,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:04,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 12:31:05,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:31:08,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 12:31:08,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:31:08,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:31:15,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 12:31:16,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 12:31:17,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:31:19,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:31:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:22,643 INFO [train.py:1039] (3/4) Epoch 21, batch 650, loss[loss=0.1639, simple_loss=0.2231, pruned_loss=0.05229, over 23658.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.252, pruned_loss=0.05062, over 4538030.74 frames. ], batch size: 232, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:31:24,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 12:31:25,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:31:31,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:31:31,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:35,263 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=712613.3333333334, ans=0.1 2023-09-30 12:31:36,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:36,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=712613.3333333334, ans=0.0 2023-09-30 12:31:37,178 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=8.54 vs. limit=15.0 2023-09-30 12:31:40,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 12:31:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:31:43,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:31:45,662 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:31:48,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:31:50,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:31:53,165 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=712680.0, ans=0.125 2023-09-30 12:31:54,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:31:54,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:56,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:31:57,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:31:59,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:32:01,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:32:01,069 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 12:32:01,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:01,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:03,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.91 vs. limit=22.5 2023-09-30 12:32:04,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:05,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:07,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:07,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:32:08,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 12:32:08,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:32:08,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:32:11,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:32:11,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:32:14,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:32:14,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 12:32:15,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 12:32:15,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:15,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:32:15,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:32:15,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:32:18,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:32:20,678 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer2.prob, batch_count=712813.3333333334, ans=0.125 2023-09-30 12:32:27,377 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:27,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:32:28,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:32:31,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:32,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:32:33,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:32:41,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:32:41,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:42,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:32:42,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:32:45,070 INFO [train.py:1039] (3/4) Epoch 21, batch 700, loss[loss=0.1698, simple_loss=0.2475, pruned_loss=0.04611, over 23236.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.251, pruned_loss=0.05, over 4585832.94 frames. ], batch size: 105, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:32:46,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 12:32:48,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 12:32:50,221 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=712946.6666666666, ans=0.07 2023-09-30 12:32:51,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 12:32:51,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:32:52,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:32:54,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 12:32:58,032 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=1.402e-02 2023-09-30 12:32:59,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:02,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:33:04,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:06,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:33:07,181 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.567e+02 1.845e+02 2.008e+02 2.196e+02 3.321e+02, threshold=4.016e+02, percent-clipped=0.0 2023-09-30 12:33:07,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:33:10,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:33:12,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:33:12,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:33:13,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 12:33:17,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 12:33:21,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:33:21,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:33:23,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:33:26,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:33:26,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 12:33:27,141 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:33:31,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:32,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:33:34,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 12:33:35,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:33:37,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:33:38,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:33:43,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:33:43,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 12:33:49,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 12:33:49,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 12:33:52,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:54,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:33:55,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:33:58,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:33:58,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 12:34:03,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 12:34:03,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 12:34:03,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 12:34:05,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 12:34:06,588 INFO [train.py:1039] (3/4) Epoch 21, batch 750, loss[loss=0.1694, simple_loss=0.2395, pruned_loss=0.0496, over 23826.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2503, pruned_loss=0.04984, over 4616431.19 frames. ], batch size: 179, lr: 4.94e-03, grad_scale: 16.0 2023-09-30 12:34:06,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 12:34:07,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:34:08,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 12:34:10,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:34:11,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:12,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:14,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:16,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:34:16,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:19,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:34:19,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:34:21,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:34:26,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:26,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:34:27,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 12:34:28,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=713346.6666666666, ans=0.1 2023-09-30 12:34:29,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:34:29,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:30,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:34:33,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:34:35,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 12:34:35,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:34:39,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 12:34:39,215 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 12:34:40,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 12:34:40,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:34:40,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 12:34:42,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.34 vs. limit=6.0 2023-09-30 12:34:43,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:34:45,425 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.37 vs. limit=15.0 2023-09-30 12:34:51,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:34:51,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:34:51,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:34:53,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:34:53,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:34:53,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 12:34:55,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:34:56,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 12:34:57,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:34:57,296 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=713480.0, ans=0.2 2023-09-30 12:35:02,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:35:04,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 12:35:05,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:09,590 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.33 vs. limit=15.0 2023-09-30 12:35:10,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:12,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:35:12,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:14,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:35:18,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 12:35:18,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:18,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,839 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:35:23,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:25,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:25,699 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:35:30,166 INFO [train.py:1039] (3/4) Epoch 21, batch 800, loss[loss=0.1466, simple_loss=0.2183, pruned_loss=0.03742, over 19519.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2507, pruned_loss=0.04991, over 4632722.15 frames. ], batch size: 42, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:35:36,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:35:36,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:40,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:35:40,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:40,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:41,617 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:43,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:35:49,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:49,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:35:52,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 12:35:52,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:35:53,662 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.427e+02 1.844e+02 2.013e+02 2.212e+02 3.409e+02, threshold=4.025e+02, percent-clipped=0.0 2023-09-30 12:35:53,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:35:53,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:35:53,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:35:55,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 12:35:55,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:35:55,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 12:36:00,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:02,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:05,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:36:05,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:36:06,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:06,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:13,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:36:13,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:36:13,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 12:36:15,744 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 12:36:15,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 12:36:15,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:36:15,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:18,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:18,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:36:24,047 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 12:36:25,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 12:36:26,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:36:28,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:36:33,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:36:34,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=713813.3333333334, ans=0.1 2023-09-30 12:36:36,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:36:38,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 12:36:38,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:36:42,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 12:36:43,587 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.59 vs. limit=15.0 2023-09-30 12:36:51,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:36:53,265 INFO [train.py:1039] (3/4) Epoch 21, batch 850, loss[loss=0.1876, simple_loss=0.2722, pruned_loss=0.05147, over 24561.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2511, pruned_loss=0.04977, over 4661666.26 frames. ], batch size: 71, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:36:53,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:36:53,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 12:36:53,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.attention_skip_rate, batch_count=713946.6666666666, ans=0.0 2023-09-30 12:36:54,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:36:54,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:36:56,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 12:36:56,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:36:58,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:36:58,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:00,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:37:00,570 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=713946.6666666666, ans=0.1 2023-09-30 12:37:01,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:37:04,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 12:37:04,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 12:37:04,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 12:37:05,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.28 vs. limit=15.0 2023-09-30 12:37:06,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:37:06,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:08,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:10,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:37:10,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:37:15,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:15,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:15,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 12:37:16,926 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=714013.3333333334, ans=0.125 2023-09-30 12:37:19,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 12:37:23,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:37:25,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 12:37:28,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 12:37:29,696 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 12:37:33,312 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 12:37:33,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:33,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:37:33,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 12:37:36,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:37,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 12:37:41,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:37:42,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:42,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:37:44,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:37:44,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:37:45,369 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=714146.6666666666, ans=0.0 2023-09-30 12:37:46,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 12:37:46,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 12:37:51,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:37:51,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:37:52,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:37:52,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:37:52,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:37:54,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:37:56,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:37:58,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:37:59,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:01,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:38:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 12:38:11,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:38:11,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 12:38:12,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:12,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:38:12,963 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=714213.3333333334, ans=0.09899494936611666 2023-09-30 12:38:14,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 12:38:15,645 INFO [train.py:1039] (3/4) Epoch 21, batch 900, loss[loss=0.2029, simple_loss=0.2688, pruned_loss=0.06854, over 23763.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2517, pruned_loss=0.04991, over 4669337.61 frames. ], batch size: 212, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:38:19,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:38:20,055 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_abs, batch_count=714280.0, ans=0.5 2023-09-30 12:38:22,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:23,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=714280.0, ans=0.2 2023-09-30 12:38:24,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 12:38:25,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=714280.0, ans=0.125 2023-09-30 12:38:27,773 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.45 vs. limit=15.0 2023-09-30 12:38:29,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:38:29,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 12:38:31,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 12:38:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:38:32,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:32,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 12:38:32,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:38:39,496 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.822e+02 2.031e+02 2.211e+02 2.952e+02, threshold=4.063e+02, percent-clipped=0.0 2023-09-30 12:38:42,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:38:42,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:38:42,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:38:44,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:38:51,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 12:38:52,306 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=714413.3333333334, ans=0.1 2023-09-30 12:38:53,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:38:56,182 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=714413.3333333334, ans=0.1 2023-09-30 12:39:00,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:39:00,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:39:01,962 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 12:39:02,100 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 12:39:09,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:39:09,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:39:10,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:39:11,302 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.prob, batch_count=714480.0, ans=0.125 2023-09-30 12:39:16,269 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward3.hidden_balancer.prob, batch_count=714480.0, ans=0.125 2023-09-30 12:39:19,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:19,092 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:39:21,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 12:39:21,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:39:25,113 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 12:39:26,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:39:26,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:28,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:39:29,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:39:31,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=714546.6666666666, ans=0.125 2023-09-30 12:39:34,647 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.72 vs. limit=15.0 2023-09-30 12:39:35,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 12:39:35,211 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 12:39:36,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 12:39:36,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 12:39:38,194 INFO [train.py:1039] (3/4) Epoch 21, batch 950, loss[loss=0.1772, simple_loss=0.2649, pruned_loss=0.04474, over 24426.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2521, pruned_loss=0.04932, over 4699121.36 frames. ], batch size: 69, lr: 4.94e-03, grad_scale: 32.0 2023-09-30 12:39:39,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:39:45,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 12:39:48,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:39:52,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:52,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:39:54,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:39:57,221 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 12:39:57,689 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=714680.0, ans=0.0 2023-09-30 12:40:01,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:01,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:02,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:02,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:40:03,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 12:40:03,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:40:05,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:07,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 12:40:07,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:11,129 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-09-30 12:40:12,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:13,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:40:13,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:40:15,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 12:40:16,358 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=714746.6666666666, ans=0.125 2023-09-30 12:40:16,639 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.91 vs. limit=22.5 2023-09-30 12:40:17,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 12:40:19,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:40:21,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:40:26,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:40:26,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:40:29,694 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 12:40:31,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:40:31,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 12:40:34,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:34,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:34,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:40:37,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 12:40:38,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:40:40,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:40:42,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:40:42,720 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 12:40:42,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:42,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:40:42,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 12:40:47,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:40:48,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.49 vs. limit=22.5 2023-09-30 12:40:51,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:40:54,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:40:56,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 12:40:56,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 12:40:59,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:41:02,680 INFO [train.py:1039] (3/4) Epoch 21, batch 1000, loss[loss=0.1585, simple_loss=0.2054, pruned_loss=0.0558, over 19461.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2513, pruned_loss=0.04949, over 4692526.66 frames. ], batch size: 388, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:41:02,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 12:41:03,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:08,268 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.24 vs. limit=22.5 2023-09-30 12:41:08,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:41:10,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 12:41:10,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 12:41:11,642 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=13.29 vs. limit=15.0 2023-09-30 12:41:15,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:15,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:41:18,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:18,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=715013.3333333334, ans=0.0 2023-09-30 12:41:21,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 12:41:25,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 12:41:27,770 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.831e+02 2.095e+02 2.362e+02 3.753e+02, threshold=4.190e+02, percent-clipped=0.0 2023-09-30 12:41:27,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 12:41:29,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:30,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 12:41:32,802 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=715013.3333333334, ans=0.125 2023-09-30 12:41:33,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 12:41:33,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 12:41:34,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:35,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:37,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=715080.0, ans=0.0 2023-09-30 12:41:44,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:45,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:41:47,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:41:48,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:41:48,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 12:41:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:41:50,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:41:51,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:41:51,632 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 12:41:54,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 12:41:56,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 12:41:59,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 12:42:02,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:42:08,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:08,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:42:10,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:10,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:42:12,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 12:42:13,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:42:13,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 12:42:13,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 12:42:15,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:15,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:42:18,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:42:21,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:42:23,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:42:24,979 INFO [train.py:1039] (3/4) Epoch 21, batch 1050, loss[loss=0.1578, simple_loss=0.2324, pruned_loss=0.04155, over 24563.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.25, pruned_loss=0.04885, over 4701082.08 frames. ], batch size: 60, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:42:25,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:42:26,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:42:28,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:42:29,689 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:31,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:42:35,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 12:42:36,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 12:42:39,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:42:40,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:42:40,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:42:41,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:42:43,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 12:42:43,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:43,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 12:42:44,971 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:42:44,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 12:42:46,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:42:53,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:42:53,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:42:53,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:42:57,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 12:42:57,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 12:42:59,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:43:00,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 12:43:04,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 12:43:06,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:10,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 12:43:11,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:43:13,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:43:13,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:43:15,541 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.41 vs. limit=15.0 2023-09-30 12:43:16,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:43:19,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 12:43:21,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 12:43:21,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 12:43:22,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:22,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:43:24,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 12:43:29,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:43:32,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:43:32,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:43:32,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:32,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:43:37,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 12:43:40,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 12:43:40,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 12:43:41,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 12:43:42,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:43:45,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:43:47,961 INFO [train.py:1039] (3/4) Epoch 21, batch 1100, loss[loss=0.1926, simple_loss=0.2595, pruned_loss=0.06286, over 23668.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2499, pruned_loss=0.04893, over 4704050.75 frames. ], batch size: 256, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:43:49,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=715613.3333333334, ans=0.1 2023-09-30 12:43:51,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:43:57,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:43:57,796 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.35 vs. limit=12.0 2023-09-30 12:43:58,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 12:43:58,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:00,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 12:44:01,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:44:02,033 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=715680.0, ans=0.125 2023-09-30 12:44:05,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 12:44:06,485 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.73 vs. limit=15.0 2023-09-30 12:44:07,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:44:10,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:44:10,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 12:44:10,674 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=715680.0, ans=0.0 2023-09-30 12:44:11,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 12:44:13,695 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.860e+02 2.075e+02 2.430e+02 4.755e+02, threshold=4.150e+02, percent-clipped=2.0 2023-09-30 12:44:13,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:44:13,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:44:17,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:44:19,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 12:44:24,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:44:28,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 12:44:28,964 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 12:44:30,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:33,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:44:35,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:44:38,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 12:44:38,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:44:38,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:44:38,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:44:38,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:44:38,744 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=715813.3333333334, ans=0.0 2023-09-30 12:44:40,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 12:44:41,146 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=715813.3333333334, ans=0.1 2023-09-30 12:44:46,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:44:46,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 12:44:50,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:44:53,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:44:56,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 12:44:56,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 12:44:58,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:00,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:00,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:03,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 12:45:04,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:45:04,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:45:05,125 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=715880.0, ans=0.125 2023-09-30 12:45:06,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 12:45:06,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:45:06,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff3_skip_rate, batch_count=715880.0, ans=0.0 2023-09-30 12:45:07,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 12:45:08,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:45:08,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:45:09,411 INFO [train.py:1039] (3/4) Epoch 21, batch 1150, loss[loss=0.1621, simple_loss=0.2512, pruned_loss=0.03654, over 24688.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2507, pruned_loss=0.04916, over 4711339.32 frames. ], batch size: 68, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:45:09,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:45:14,289 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff3_skip_rate, batch_count=715946.6666666666, ans=0.0 2023-09-30 12:45:16,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:17,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.06 vs. limit=6.0 2023-09-30 12:45:19,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:45:20,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:20,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:45:21,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 12:45:22,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:23,137 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=715946.6666666666, ans=0.125 2023-09-30 12:45:25,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 12:45:26,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:26,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:45:31,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 12:45:33,838 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:34,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=716013.3333333334, ans=0.125 2023-09-30 12:45:34,148 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer1.prob, batch_count=716013.3333333334, ans=0.125 2023-09-30 12:45:38,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:45:38,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:38,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 12:45:38,559 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:45:38,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:45:44,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 12:45:46,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:45:48,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:45:56,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:45:57,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=10.02 vs. limit=15.0 2023-09-30 12:46:01,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:46:01,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 12:46:03,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:03,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:12,685 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 12:46:14,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:22,549 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 12:46:25,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:27,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:46:27,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:46:27,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:46:32,273 INFO [train.py:1039] (3/4) Epoch 21, batch 1200, loss[loss=0.1678, simple_loss=0.2425, pruned_loss=0.04655, over 23680.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04982, over 4713858.37 frames. ], batch size: 149, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:46:32,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:37,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 12:46:37,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:46:40,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:46:40,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:46:40,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:46:42,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:46:44,009 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:46:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:46:47,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:46:50,458 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 12:46:52,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 12:46:56,031 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=716346.6666666666, ans=0.0 2023-09-30 12:46:57,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:46:57,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=716346.6666666666, ans=0.2 2023-09-30 12:46:58,562 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.596e+02 1.836e+02 2.061e+02 2.415e+02 4.765e+02, threshold=4.121e+02, percent-clipped=1.0 2023-09-30 12:46:59,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=716346.6666666666, ans=0.05 2023-09-30 12:47:00,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:47:01,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:01,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=716346.6666666666, ans=0.125 2023-09-30 12:47:02,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=716346.6666666666, ans=0.125 2023-09-30 12:47:03,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:03,318 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 12:47:04,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:05,012 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=716413.3333333334, ans=0.1 2023-09-30 12:47:05,816 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.whiten, num_groups=1, num_channels=192, metric=4.47 vs. limit=12.0 2023-09-30 12:47:12,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 12:47:12,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:47:13,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 12:47:14,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:47:19,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 12:47:24,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 12:47:24,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:47:25,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:47:27,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:28,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 12:47:30,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:47:30,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:47:32,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:47:32,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 12:47:34,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:47:34,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:34,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 12:47:37,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:47:37,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:47:41,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:47:42,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:47:46,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 12:47:49,755 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 12:47:52,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:47:53,989 INFO [train.py:1039] (3/4) Epoch 21, batch 1250, loss[loss=0.1829, simple_loss=0.2497, pruned_loss=0.05803, over 23882.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2525, pruned_loss=0.0503, over 4724063.98 frames. ], batch size: 164, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:47:54,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:47:57,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:47:59,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:48:03,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 12:48:05,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:48:07,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:07,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 12:48:10,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:48:12,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:48:12,609 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=716680.0, ans=0.1 2023-09-30 12:48:15,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 12:48:17,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:17,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:48:17,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:17,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=716680.0, ans=0.125 2023-09-30 12:48:21,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 12:48:25,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 12:48:25,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:48:25,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:48:27,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:48:28,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:31,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:33,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:48:38,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 12:48:40,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 12:48:43,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:48:44,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 12:48:45,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:48:45,122 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 12:48:45,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:45,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:48:50,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:50,760 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:48:52,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:48:52,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:48:53,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 12:48:53,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 12:48:55,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 12:48:58,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:00,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 12:49:00,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:03,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:49:03,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:49:05,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 12:49:05,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 12:49:06,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:49:06,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 12:49:06,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:09,767 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=5.05 vs. limit=10.0 2023-09-30 12:49:10,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 12:49:12,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:14,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:49:14,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:49:16,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=716946.6666666666, ans=0.125 2023-09-30 12:49:17,286 INFO [train.py:1039] (3/4) Epoch 21, batch 1300, loss[loss=0.1655, simple_loss=0.2518, pruned_loss=0.03961, over 24637.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.253, pruned_loss=0.0506, over 4719856.87 frames. ], batch size: 65, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:49:17,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 12:49:19,670 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.73 vs. limit=10.0 2023-09-30 12:49:20,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:49:21,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 12:49:27,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:28,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 12:49:30,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:49:31,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:49:31,794 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 12:49:33,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 12:49:37,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:49:39,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:49:39,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 12:49:43,781 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.630e+02 1.834e+02 2.043e+02 2.344e+02 3.785e+02, threshold=4.086e+02, percent-clipped=0.0 2023-09-30 12:49:45,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:49:45,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=717013.3333333334, ans=0.125 2023-09-30 12:49:47,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.min_positive, batch_count=717013.3333333334, ans=0.05 2023-09-30 12:49:50,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:50,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:49:51,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:49:53,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:49:54,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:49:56,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 12:49:56,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 12:50:02,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:50:02,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 12:50:05,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 12:50:05,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 12:50:06,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:50:08,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:50:10,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 12:50:10,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:11,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 12:50:12,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:50:17,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:50:17,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:50:22,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 12:50:22,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 12:50:25,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 12:50:27,788 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=6.72 vs. limit=15.0 2023-09-30 12:50:28,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:50:31,463 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 12:50:33,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:39,745 INFO [train.py:1039] (3/4) Epoch 21, batch 1350, loss[loss=0.1674, simple_loss=0.2475, pruned_loss=0.04358, over 24307.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2528, pruned_loss=0.05088, over 4707663.66 frames. ], batch size: 61, lr: 4.93e-03, grad_scale: 16.0 2023-09-30 12:50:39,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 12:50:42,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:44,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:50:46,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:50:48,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:50:50,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:50:51,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:55,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 12:50:56,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 12:50:58,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:50:59,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:51:02,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 12:51:04,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:51:04,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:51:04,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 12:51:06,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 12:51:09,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 12:51:11,371 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=717413.3333333334, ans=0.1 2023-09-30 12:51:12,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:12,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 12:51:24,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:26,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=717413.3333333334, ans=0.0 2023-09-30 12:51:33,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:51:35,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:35,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 12:51:35,407 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.min_positive, batch_count=717480.0, ans=0.05 2023-09-30 12:51:38,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:51:41,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 12:51:41,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 12:51:41,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:51:42,152 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=717480.0, ans=0.125 2023-09-30 12:51:45,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:51:47,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 12:51:48,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:51:54,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 12:51:55,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 12:52:01,871 INFO [train.py:1039] (3/4) Epoch 21, batch 1400, loss[loss=0.1796, simple_loss=0.2494, pruned_loss=0.05489, over 23510.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2505, pruned_loss=0.05048, over 4670189.65 frames. ], batch size: 120, lr: 4.93e-03, grad_scale: 8.0 2023-09-30 12:52:02,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 12:52:04,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:52:07,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:52:09,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:52:12,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 12:52:13,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 12:52:24,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:52:25,191 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module2.balancer1.prob, batch_count=717680.0, ans=0.125 2023-09-30 12:52:28,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:30,113 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.890e+02 2.143e+02 2.435e+02 3.256e+02, threshold=4.286e+02, percent-clipped=0.0 2023-09-30 12:52:30,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:52:30,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 12:52:32,362 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=717680.0, ans=0.125 2023-09-30 12:52:35,110 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:52:36,585 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 12:52:48,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:48,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:52:54,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=717813.3333333334, ans=0.125 2023-09-30 12:52:55,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 12:52:55,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:52:55,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:52:57,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:52:57,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:52:58,053 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.73 vs. limit=15.0 2023-09-30 12:52:58,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 12:52:58,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:52:58,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:52:58,923 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.hidden_balancer.prob, batch_count=717813.3333333334, ans=0.125 2023-09-30 12:53:01,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 12:53:01,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:53:06,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:09,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:53:17,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 12:53:18,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 12:53:20,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:53:21,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 12:53:23,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:25,607 INFO [train.py:1039] (3/4) Epoch 21, batch 1450, loss[loss=0.1747, simple_loss=0.2578, pruned_loss=0.04581, over 24682.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2504, pruned_loss=0.04973, over 4684537.91 frames. ], batch size: 65, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:53:25,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:53:28,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 12:53:31,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:53:31,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:31,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 12:53:38,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:53:39,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 12:53:41,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:53:41,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 12:53:42,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 12:53:44,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 12:53:46,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:46,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:46,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 12:53:46,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=718013.3333333334, ans=0.125 2023-09-30 12:53:48,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:53:49,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 12:53:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 12:53:49,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:53:51,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:53:53,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:53:56,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:00,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:54:00,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:54:03,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:54:03,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:04,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:54:04,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 12:54:04,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:54:06,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:11,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 12:54:12,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:54:17,831 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 12:54:19,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:20,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 12:54:21,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=718146.6666666666, ans=0.0 2023-09-30 12:54:22,416 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:22,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 12:54:27,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:27,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 12:54:31,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 12:54:31,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:35,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:54:35,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:54:38,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 12:54:41,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 12:54:41,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 12:54:41,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=718213.3333333334, ans=0.125 2023-09-30 12:54:42,758 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:54:44,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 12:54:47,161 INFO [train.py:1039] (3/4) Epoch 21, batch 1500, loss[loss=0.1746, simple_loss=0.2466, pruned_loss=0.05125, over 23386.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2504, pruned_loss=0.04981, over 4687614.36 frames. ], batch size: 285, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:54:47,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=718280.0, ans=0.1 2023-09-30 12:54:55,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 12:54:55,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 12:54:55,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:54:57,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:54:57,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:54:59,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 12:54:59,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 12:55:01,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 12:55:01,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 12:55:01,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:55:03,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:55:05,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:06,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:06,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.attention_skip_rate, batch_count=718346.6666666666, ans=0.0 2023-09-30 12:55:13,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:55:13,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 12:55:14,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:55:16,111 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.411e+02 1.806e+02 1.957e+02 2.271e+02 3.905e+02, threshold=3.913e+02, percent-clipped=0.0 2023-09-30 12:55:16,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:55:17,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:20,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 12:55:24,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 12:55:26,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:55:26,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 12:55:29,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 12:55:30,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:55:32,306 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:55:32,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:55:32,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 12:55:33,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:55:33,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:36,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 12:55:36,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:55:41,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 12:55:41,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 12:55:41,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=718480.0, ans=0.125 2023-09-30 12:55:43,494 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=718480.0, ans=0.125 2023-09-30 12:55:48,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 12:55:49,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 12:55:52,960 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 12:55:53,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:55:53,036 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 12:55:54,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:55:56,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:55:58,055 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 12:55:59,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 12:56:01,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 12:56:01,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=718546.6666666666, ans=0.2 2023-09-30 12:56:02,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:06,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:07,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:08,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:56:09,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:56:09,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 12:56:09,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 12:56:11,638 INFO [train.py:1039] (3/4) Epoch 21, batch 1550, loss[loss=0.1858, simple_loss=0.2678, pruned_loss=0.05188, over 24042.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2515, pruned_loss=0.04978, over 4707392.62 frames. ], batch size: 80, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:56:11,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 12:56:11,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:56:13,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 12:56:14,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 12:56:16,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:18,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:19,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:19,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 12:56:21,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:21,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=718613.3333333334, ans=0.125 2023-09-30 12:56:23,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:56:25,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 12:56:25,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:26,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 12:56:28,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 12:56:29,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 12:56:29,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 12:56:32,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:56:33,474 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 12:56:33,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 12:56:35,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 12:56:35,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:56:37,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=10.15 vs. limit=15.0 2023-09-30 12:56:38,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:41,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:56:44,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 12:56:44,919 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 12:56:46,705 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=718746.6666666666, ans=0.125 2023-09-30 12:56:54,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:56:57,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:56:57,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 12:56:57,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:56:59,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 12:57:04,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 12:57:06,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:11,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:57:14,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:57:14,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:57:15,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 12:57:15,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:16,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 12:57:16,272 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=718880.0, ans=0.125 2023-09-30 12:57:16,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=718880.0, ans=0.0 2023-09-30 12:57:17,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:17,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 12:57:17,633 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 12:57:21,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=718880.0, ans=0.125 2023-09-30 12:57:22,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:24,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer2.prob, batch_count=718880.0, ans=0.125 2023-09-30 12:57:26,738 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.min_positive, batch_count=718880.0, ans=0.025 2023-09-30 12:57:27,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 12:57:32,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:34,733 INFO [train.py:1039] (3/4) Epoch 21, batch 1600, loss[loss=0.1506, simple_loss=0.2332, pruned_loss=0.03396, over 24502.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2526, pruned_loss=0.05013, over 4694480.25 frames. ], batch size: 63, lr: 4.92e-03, grad_scale: 16.0 2023-09-30 12:57:34,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:57:34,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 12:57:36,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:57:37,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:57:37,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:57:39,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 12:57:39,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 12:57:41,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:57:43,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 12:57:44,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 12:57:47,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 12:57:50,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:57:52,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 12:57:52,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:57:56,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:57:59,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 12:58:01,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 12:58:04,558 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.831e+02 2.011e+02 2.218e+02 3.597e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 12:58:04,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:58:06,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 12:58:06,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:07,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 12:58:11,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 12:58:20,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:21,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 12:58:21,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:58:23,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:58:23,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 12:58:24,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 12:58:28,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 12:58:30,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:58:30,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:31,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:33,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 12:58:34,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 12:58:36,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 12:58:37,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 12:58:43,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:58:43,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:58:47,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 12:58:47,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 12:58:48,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 12:58:53,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:58:56,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:58:56,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 12:58:56,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 12:58:56,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 12:58:56,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 12:58:56,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 12:58:57,949 INFO [train.py:1039] (3/4) Epoch 21, batch 1650, loss[loss=0.1795, simple_loss=0.2532, pruned_loss=0.05295, over 23452.00 frames. ], tot_loss[loss=0.1777, simple_loss=0.2537, pruned_loss=0.05083, over 4689909.79 frames. ], batch size: 134, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 12:59:01,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 12:59:01,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:01,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:03,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 12:59:06,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:09,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 12:59:13,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 12:59:13,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 12:59:13,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 12:59:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 12:59:15,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 12:59:15,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 12:59:21,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 12:59:24,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 12:59:32,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 12:59:33,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.25 vs. limit=15.0 2023-09-30 12:59:34,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:35,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 12:59:40,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 12:59:41,168 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 12:59:42,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 12:59:44,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 12:59:46,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 12:59:47,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 12:59:47,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:48,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 12:59:49,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 12:59:49,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:51,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 12:59:52,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 12:59:52,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 12:59:56,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 12:59:57,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 13:00:00,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:00:00,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 13:00:02,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 13:00:02,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 13:00:02,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:03,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:00:03,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:05,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:00:05,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 13:00:08,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:00:12,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:00:12,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:15,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 13:00:19,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=719613.3333333334, ans=0.1 2023-09-30 13:00:20,683 INFO [train.py:1039] (3/4) Epoch 21, batch 1700, loss[loss=0.1475, simple_loss=0.2275, pruned_loss=0.0337, over 24439.00 frames. ], tot_loss[loss=0.1766, simple_loss=0.2522, pruned_loss=0.05054, over 4688645.32 frames. ], batch size: 58, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:00:20,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:00:20,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:00:20,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 13:00:20,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:20,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:00:20,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:24,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:00:24,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:00:25,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 13:00:26,389 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.min_abs, batch_count=719613.3333333334, ans=0.5 2023-09-30 13:00:27,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:00:37,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:00:40,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:00:45,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:00:45,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:00:45,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:00:47,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:00:49,903 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.571e+02 1.892e+02 2.034e+02 2.348e+02 3.587e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 13:00:50,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 13:00:51,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:00:51,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:00:55,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:00:55,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:00:57,736 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=719746.6666666666, ans=0.0 2023-09-30 13:00:58,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 13:00:59,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 13:00:59,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=719746.6666666666, ans=0.0 2023-09-30 13:01:00,600 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:02,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 13:01:04,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:01:14,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:16,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:16,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:01:17,037 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=3.01 vs. limit=15.0 2023-09-30 13:01:18,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:01:18,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 13:01:18,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:01:18,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=719813.3333333334, ans=0.2 2023-09-30 13:01:21,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:21,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 13:01:21,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:01:21,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:23,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:23,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:24,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:01:24,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:01:26,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:26,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:01:26,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:33,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:33,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=719880.0, ans=0.07 2023-09-30 13:01:34,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 13:01:36,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:01:38,423 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:01:41,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 13:01:42,795 INFO [train.py:1039] (3/4) Epoch 21, batch 1750, loss[loss=0.1626, simple_loss=0.2358, pruned_loss=0.04467, over 23445.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2505, pruned_loss=0.05009, over 4688484.17 frames. ], batch size: 134, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:01:46,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:01:47,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:01:49,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:01:49,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 13:01:50,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:01:53,151 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=10.49 vs. limit=15.0 2023-09-30 13:01:54,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:01:54,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:02,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 13:02:04,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:06,242 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 13:02:06,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:08,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:02:10,391 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=720013.3333333334, ans=0.125 2023-09-30 13:02:11,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:02:13,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 13:02:15,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:02:16,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 13:02:24,527 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:02:27,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:02:27,594 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:31,169 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:31,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:02:32,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:02:33,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=720080.0, ans=0.125 2023-09-30 13:02:34,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:35,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:36,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:02:37,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 13:02:39,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:02:41,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 13:02:41,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:42,211 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:02:43,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:45,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:02:47,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=720146.6666666666, ans=0.125 2023-09-30 13:02:48,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:02:48,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 13:02:48,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:02:51,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:02:56,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:02:59,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:01,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:03:01,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 13:03:01,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:03,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:03:03,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:03,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:03:03,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:03:04,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:03:09,630 INFO [train.py:1039] (3/4) Epoch 21, batch 1800, loss[loss=0.1633, simple_loss=0.2389, pruned_loss=0.04391, over 23674.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2502, pruned_loss=0.04985, over 4691233.70 frames. ], batch size: 256, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:03:09,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:03:09,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:03:11,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:03:13,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:03:13,849 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=8.68 vs. limit=12.0 2023-09-30 13:03:18,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:03:18,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:03:21,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:24,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:24,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:26,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:03:27,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:03:27,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 13:03:29,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:31,781 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=720346.6666666666, ans=0.125 2023-09-30 13:03:32,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:37,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 13:03:39,071 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.557e+02 1.961e+02 2.256e+02 2.662e+02 3.514e+02, threshold=4.513e+02, percent-clipped=0.0 2023-09-30 13:03:39,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 13:03:40,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 13:03:40,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:03:41,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:03:41,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:03:43,192 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:03:43,511 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=720413.3333333334, ans=0.125 2023-09-30 13:03:51,586 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 13:03:54,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:03:56,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:03:58,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 13:03:58,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 13:03:58,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=720480.0, ans=0.2 2023-09-30 13:03:59,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:04:01,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:04:02,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:04:07,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 13:04:14,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:04:14,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 13:04:16,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:04:16,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:17,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:04:17,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 13:04:20,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:04:20,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:23,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 13:04:23,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:04:26,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:26,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:04:26,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:27,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:04:29,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:04:30,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:04:30,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:04:31,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.prob, batch_count=720613.3333333334, ans=0.125 2023-09-30 13:04:31,134 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:04:32,731 INFO [train.py:1039] (3/4) Epoch 21, batch 1850, loss[loss=0.1747, simple_loss=0.2481, pruned_loss=0.05065, over 17247.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2509, pruned_loss=0.05004, over 4684474.09 frames. ], batch size: 37, lr: 4.92e-03, grad_scale: 8.0 2023-09-30 13:04:34,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:04:36,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:04:45,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:04:45,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 13:04:47,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=720680.0, ans=0.04949747468305833 2023-09-30 13:04:50,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 13:04:55,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 13:04:58,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:04:58,958 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=720680.0, ans=0.125 2023-09-30 13:05:00,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 13:05:00,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:05:00,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=720680.0, ans=0.0 2023-09-30 13:05:00,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=720680.0, ans=0.125 2023-09-30 13:05:07,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:05:08,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 13:05:12,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:12,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:12,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.min_positive, batch_count=720746.6666666666, ans=0.05 2023-09-30 13:05:17,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 13:05:17,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:17,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:05:19,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:05:20,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:05:24,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:05:27,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:05:27,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:27,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:05:27,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:05:30,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:32,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:05:36,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 13:05:36,396 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:05:41,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:05:41,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:05:41,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 13:05:41,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 13:05:44,645 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 13:05:44,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 13:05:47,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:05:47,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:05:47,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:05:47,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:49,246 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 13:05:49,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:05:50,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:50,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:05:53,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:05:54,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:05:54,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 13:05:56,032 INFO [train.py:1039] (3/4) Epoch 21, batch 1900, loss[loss=0.1786, simple_loss=0.2611, pruned_loss=0.04803, over 24373.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2517, pruned_loss=0.05039, over 4698832.43 frames. ], batch size: 77, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:05:57,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:05:57,804 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 13:05:57,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:05:59,390 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:04,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:06:05,823 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=720946.6666666666, ans=0.125 2023-09-30 13:06:05,953 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=720946.6666666666, ans=0.125 2023-09-30 13:06:07,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:06:07,368 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 13:06:09,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 13:06:11,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:06:11,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:06:12,846 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 13:06:12,902 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 13:06:14,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=721013.3333333334, ans=0.125 2023-09-30 13:06:16,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 13:06:18,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:06:22,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 13:06:24,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 13:06:26,414 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.558e+02 1.817e+02 1.990e+02 2.367e+02 3.522e+02, threshold=3.980e+02, percent-clipped=0.0 2023-09-30 13:06:35,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 13:06:40,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 13:06:40,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:06:40,551 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=721080.0, ans=0.0 2023-09-30 13:06:41,613 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 13:06:41,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 13:06:41,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 13:06:41,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 13:06:41,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:06:47,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 13:06:50,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:06:55,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:06:55,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 13:06:55,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:07:00,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 13:07:00,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:06,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=721213.3333333334, ans=0.125 2023-09-30 13:07:08,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:07:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:07:08,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:07:10,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:07:10,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:07:10,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:07:11,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:07:14,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:14,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:16,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:07:16,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:07:16,837 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:07:18,223 INFO [train.py:1039] (3/4) Epoch 21, batch 1950, loss[loss=0.1567, simple_loss=0.233, pruned_loss=0.04025, over 20360.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2528, pruned_loss=0.0508, over 4700211.06 frames. ], batch size: 44, lr: 4.91e-03, grad_scale: 8.0 2023-09-30 13:07:18,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:07:18,640 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=721280.0, ans=0.0 2023-09-30 13:07:22,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:24,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:07:25,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:25,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:07:28,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 13:07:28,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:07:28,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:29,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:33,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:07:33,376 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:34,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:35,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=721346.6666666666, ans=0.125 2023-09-30 13:07:36,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:07:41,178 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:07:41,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:07:41,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:07:41,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:44,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:47,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:07:47,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:07:47,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:07:47,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 13:07:49,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:07:50,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:07:50,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:07:53,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:07:56,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:08:00,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:08:03,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:08:03,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:03,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 13:08:03,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:08,286 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=10.01 vs. limit=15.0 2023-09-30 13:08:10,958 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:08:12,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:08:12,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:08:13,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:21,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:23,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:29,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:33,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:08:33,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:08:35,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 13:08:35,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:08:36,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:08:38,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 13:08:39,451 INFO [train.py:1039] (3/4) Epoch 21, batch 2000, loss[loss=0.1753, simple_loss=0.2428, pruned_loss=0.05392, over 23605.00 frames. ], tot_loss[loss=0.1775, simple_loss=0.253, pruned_loss=0.05097, over 4711295.46 frames. ], batch size: 149, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:08:39,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:08:44,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:08:44,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:08:44,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:08:46,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:08:49,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:08:52,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 13:08:52,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:08:55,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:08:56,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=721680.0, ans=0.125 2023-09-30 13:08:57,576 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 13:08:59,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:08:59,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:09:00,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:09:02,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 13:09:02,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:04,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:04,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:06,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 13:09:07,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:09:08,950 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.918e+02 2.130e+02 2.425e+02 4.087e+02, threshold=4.260e+02, percent-clipped=1.0 2023-09-30 13:09:10,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 13:09:10,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:14,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:14,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:09:14,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:16,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:18,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:18,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 13:09:21,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 13:09:21,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:09:21,323 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:25,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:27,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:09:28,936 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:29,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:09:31,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:09:31,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:33,392 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:09:33,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:09:34,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:38,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:09:38,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 13:09:45,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:09:45,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:09:51,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:09:54,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:56,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:09:56,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:09:57,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:09:57,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:09:59,373 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=721880.0, ans=0.125 2023-09-30 13:10:00,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:02,017 INFO [train.py:1039] (3/4) Epoch 21, batch 2050, loss[loss=0.1834, simple_loss=0.2637, pruned_loss=0.05154, over 24353.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2522, pruned_loss=0.0502, over 4715139.41 frames. ], batch size: 77, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:10:02,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:05,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:10:06,196 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn1.whiten, num_groups=1, num_channels=192, metric=12.38 vs. limit=22.5 2023-09-30 13:10:06,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:10,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:10:11,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:10:13,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:10:15,238 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:10:17,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 13:10:17,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:10:20,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:10:20,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:10:30,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:30,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:33,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 13:10:35,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:10:35,931 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=722080.0, ans=0.2 2023-09-30 13:10:37,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 13:10:38,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:10:41,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:43,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:43,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:10:44,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:10:46,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:10:48,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:10:49,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:10:51,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:10:53,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:10:53,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=722146.6666666666, ans=0.125 2023-09-30 13:10:55,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:10:57,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:02,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:08,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:11:09,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 13:11:16,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:16,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:11:18,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:11:20,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 13:11:24,546 INFO [train.py:1039] (3/4) Epoch 21, batch 2100, loss[loss=0.1535, simple_loss=0.2006, pruned_loss=0.05317, over 18876.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2509, pruned_loss=0.05009, over 4707128.85 frames. ], batch size: 388, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:11:24,690 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 13:11:24,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:24,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:26,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:27,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:11:27,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 13:11:28,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 13:11:28,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:11:29,357 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.83 vs. limit=10.0 2023-09-30 13:11:32,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:11:33,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:11:33,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:11:34,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=722280.0, ans=0.1 2023-09-30 13:11:35,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:11:35,439 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 13:11:37,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:11:38,465 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 13:11:38,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 13:11:41,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:11:41,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:11:41,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 13:11:42,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:11:46,273 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 13:11:46,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:11:49,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:11:50,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:11:53,682 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.414e+02 1.855e+02 2.014e+02 2.188e+02 4.712e+02, threshold=4.028e+02, percent-clipped=1.0 2023-09-30 13:11:53,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:11:56,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 13:11:58,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:11:58,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 13:11:59,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 13:12:01,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:01,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 13:12:01,732 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 13:12:03,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 13:12:03,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=722413.3333333334, ans=0.1 2023-09-30 13:12:05,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:12:06,817 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:12:09,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:10,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:12:11,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:13,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:13,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 13:12:13,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:13,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:14,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:14,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 13:12:16,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 13:12:16,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 13:12:19,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=722480.0, ans=0.0 2023-09-30 13:12:19,970 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.14 vs. limit=15.0 2023-09-30 13:12:22,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:12:25,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:12:26,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 13:12:32,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:36,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:12:36,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:12:36,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:12:36,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 13:12:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:12:38,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:12:38,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:12:40,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:12:40,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:41,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 13:12:43,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 13:12:43,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:46,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:12:46,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:12:46,562 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=722613.3333333334, ans=0.04949747468305833 2023-09-30 13:12:47,697 INFO [train.py:1039] (3/4) Epoch 21, batch 2150, loss[loss=0.1518, simple_loss=0.2253, pruned_loss=0.03918, over 21466.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.25, pruned_loss=0.04962, over 4702202.37 frames. ], batch size: 47, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:12:47,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:12:47,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:12:52,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 13:12:54,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:12:54,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.80 vs. limit=15.0 2023-09-30 13:12:55,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:12:57,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:12:57,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:12:57,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:13:02,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:04,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:13:04,105 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:13:07,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:07,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 13:13:13,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:13,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:13:14,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:14,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:16,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:16,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:13:16,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:16,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=722680.0, ans=0.125 2023-09-30 13:13:17,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:13:17,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:13:19,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 13:13:20,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:13:20,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:21,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:24,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:13:24,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:13:26,091 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:13:27,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:13:27,751 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=722746.6666666666, ans=0.0 2023-09-30 13:13:28,324 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.59 vs. limit=22.5 2023-09-30 13:13:29,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:13:29,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 13:13:29,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:13:32,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:34,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:13:37,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:13:39,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:41,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 13:13:43,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 13:13:43,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:13:43,464 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 13:13:43,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:45,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:13:45,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 13:13:45,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:13:45,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 13:13:45,227 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 13:13:45,227 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 13:13:46,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 13:13:49,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:50,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:13:51,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:13:51,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:13:52,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:13:54,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:13:54,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:03,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:14:03,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 13:14:08,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:14:08,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=722946.6666666666, ans=0.125 2023-09-30 13:14:08,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=722946.6666666666, ans=0.125 2023-09-30 13:14:09,612 INFO [train.py:1039] (3/4) Epoch 21, batch 2200, loss[loss=0.1798, simple_loss=0.2514, pruned_loss=0.05415, over 23568.00 frames. ], tot_loss[loss=0.174, simple_loss=0.2496, pruned_loss=0.04922, over 4711796.40 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:14:13,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:15,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:14:15,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:16,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:14:20,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:14:20,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:14:20,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 13:14:25,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 13:14:26,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:14:31,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 13:14:34,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:36,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:14:36,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:14:37,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:14:39,229 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.525e+02 1.806e+02 1.948e+02 2.214e+02 3.228e+02, threshold=3.896e+02, percent-clipped=0.0 2023-09-30 13:14:39,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 13:14:42,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:14:44,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:14:46,698 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:14:50,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:14:52,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:14:53,936 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=723080.0, ans=0.07 2023-09-30 13:14:55,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:14:56,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:14:58,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 13:14:59,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:01,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 13:15:03,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:03,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:15:04,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:06,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:15:06,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:06,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:07,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:15:09,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:15:09,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:15:10,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:15:11,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=723146.6666666666, ans=0.0 2023-09-30 13:15:15,269 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 13:15:15,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:15:17,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:15:17,221 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 13:15:18,695 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.67 vs. limit=12.0 2023-09-30 13:15:21,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:15:21,672 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 13:15:25,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:15:25,759 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 13:15:27,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:27,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:15:28,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:15:31,744 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 13:15:33,227 INFO [train.py:1039] (3/4) Epoch 21, batch 2250, loss[loss=0.1915, simple_loss=0.2593, pruned_loss=0.06179, over 23600.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2506, pruned_loss=0.04943, over 4720146.80 frames. ], batch size: 256, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:15:33,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:15:34,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:41,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:15:41,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:15:45,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:46,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:15:46,901 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=12.0 2023-09-30 13:15:47,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:15:50,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 13:15:50,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:15:50,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:15:50,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=723346.6666666666, ans=0.2 2023-09-30 13:15:52,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 13:15:54,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:15:54,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:15:55,267 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.conv_module2.whiten, num_groups=1, num_channels=512, metric=5.19 vs. limit=15.0 2023-09-30 13:15:57,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:16:00,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=723346.6666666666, ans=0.125 2023-09-30 13:16:03,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:04,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:16:05,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:16:06,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 13:16:07,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:16:09,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:16:15,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:17,648 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=9.07 vs. limit=10.0 2023-09-30 13:16:18,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:16:19,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:16:20,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:16:22,210 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=18.55 vs. limit=22.5 2023-09-30 13:16:22,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:16:24,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:16:28,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:16:31,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:16:31,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=723480.0, ans=0.0 2023-09-30 13:16:38,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:16:38,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:16:38,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:16:43,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:16:46,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:16:46,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 13:16:47,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:47,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:16:50,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 13:16:53,656 INFO [train.py:1039] (3/4) Epoch 21, batch 2300, loss[loss=0.2022, simple_loss=0.2653, pruned_loss=0.06958, over 23561.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2514, pruned_loss=0.05015, over 4720595.94 frames. ], batch size: 120, lr: 4.91e-03, grad_scale: 16.0 2023-09-30 13:16:53,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:16:53,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:16:59,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:17:02,365 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=723613.3333333334, ans=0.125 2023-09-30 13:17:03,566 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 13:17:05,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:13,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:17:13,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:17:14,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn1.whiten, num_groups=1, num_channels=192, metric=10.13 vs. limit=22.5 2023-09-30 13:17:15,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:15,549 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=723680.0, ans=0.2 2023-09-30 13:17:16,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:16,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 13:17:16,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:17:18,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:19,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:17:22,891 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.584e+02 1.842e+02 2.058e+02 2.392e+02 4.261e+02, threshold=4.115e+02, percent-clipped=2.0 2023-09-30 13:17:24,549 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:17:27,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:17:31,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:37,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:17:38,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:17:41,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:17:44,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:17:45,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=723813.3333333334, ans=0.1 2023-09-30 13:17:50,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:17:50,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:17:52,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:17:52,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 13:17:56,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:17:56,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:17:57,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:17:57,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:17:58,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:17:58,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:17:58,593 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:17:58,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=723880.0, ans=0.2 2023-09-30 13:18:00,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 13:18:00,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:18:00,065 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:00,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 13:18:06,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:18:09,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:18:11,361 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff2_skip_rate, batch_count=723880.0, ans=0.0 2023-09-30 13:18:12,782 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=723946.6666666666, ans=0.2 2023-09-30 13:18:12,800 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.scale_min, batch_count=723946.6666666666, ans=0.2 2023-09-30 13:18:13,922 INFO [train.py:1039] (3/4) Epoch 21, batch 2350, loss[loss=0.2435, simple_loss=0.3061, pruned_loss=0.09049, over 19811.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2513, pruned_loss=0.04951, over 4726006.20 frames. ], batch size: 389, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:18:16,182 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:18:16,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:18:16,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:18:17,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:18:17,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:20,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:18:20,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 13:18:27,492 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=4.31 vs. limit=15.0 2023-09-30 13:18:28,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:18:28,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 13:18:33,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 13:18:37,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:18:39,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:18:39,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:39,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:18:40,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 13:18:42,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:18:49,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 13:18:50,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:18:53,199 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=724080.0, ans=0.125 2023-09-30 13:18:54,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:18:54,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:18:54,819 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=724080.0, ans=10.0 2023-09-30 13:18:56,494 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:18:58,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 13:18:59,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:19:01,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:19:01,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:01,774 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=724080.0, ans=0.125 2023-09-30 13:19:03,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:19:04,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:19:07,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 13:19:07,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:19:12,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:19:12,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:19:12,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=724146.6666666666, ans=0.125 2023-09-30 13:19:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 13:19:14,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:19:15,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=724146.6666666666, ans=0.1 2023-09-30 13:19:17,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 13:19:17,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:19:20,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 13:19:21,184 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=724213.3333333334, ans=0.1 2023-09-30 13:19:26,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 13:19:26,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:19:26,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:19:26,166 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 13:19:27,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 13:19:29,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 13:19:34,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:19:37,899 INFO [train.py:1039] (3/4) Epoch 21, batch 2400, loss[loss=0.1756, simple_loss=0.2631, pruned_loss=0.04399, over 24431.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2517, pruned_loss=0.04984, over 4719564.66 frames. ], batch size: 69, lr: 4.90e-03, grad_scale: 32.0 2023-09-30 13:19:38,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:19:41,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:19:44,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:19:45,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 13:19:45,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 13:19:53,554 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=14.88 vs. limit=22.5 2023-09-30 13:19:54,176 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:19:54,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:19:55,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 13:19:55,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:19:55,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:19:57,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 13:20:02,580 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:05,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=7.21 vs. limit=10.0 2023-09-30 13:20:06,035 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 13:20:06,955 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module1.whiten, num_groups=1, num_channels=192, metric=7.71 vs. limit=15.0 2023-09-30 13:20:07,654 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.507e+02 1.817e+02 2.030e+02 2.238e+02 3.635e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 13:20:12,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:20:14,366 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=724413.3333333334, ans=0.5 2023-09-30 13:20:17,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 13:20:20,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:20,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:20,743 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=724413.3333333334, ans=0.0 2023-09-30 13:20:25,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:25,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 13:20:26,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:20:34,078 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:34,578 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=724480.0, ans=0.125 2023-09-30 13:20:35,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:20:38,064 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=724480.0, ans=0.2 2023-09-30 13:20:39,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:20:40,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:20:40,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:20:40,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:20:40,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:40,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:20:40,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:20:47,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:20:47,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:20:47,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 13:20:49,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 13:20:52,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:20:52,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:20:52,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 13:20:52,630 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=724546.6666666666, ans=0.0 2023-09-30 13:20:53,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 13:20:53,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 13:20:53,835 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 13:20:55,371 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 13:20:55,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:20:58,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:20:58,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:20:59,986 INFO [train.py:1039] (3/4) Epoch 21, batch 2450, loss[loss=0.1694, simple_loss=0.2493, pruned_loss=0.0448, over 24478.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2503, pruned_loss=0.04953, over 4710981.07 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:21:00,119 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 13:21:00,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:02,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:21:04,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:21:06,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:21:09,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:09,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:11,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 13:21:18,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:21:18,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:21,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:21:21,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:21:21,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:21:21,848 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=724680.0, ans=0.1 2023-09-30 13:21:22,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 13:21:27,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:29,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:21:30,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:21:30,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=724680.0, ans=0.125 2023-09-30 13:21:33,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:21:33,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:33,941 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.attention_skip_rate, batch_count=724746.6666666666, ans=0.0 2023-09-30 13:21:36,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:36,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:21:38,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 13:21:40,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:21:43,797 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:45,469 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer1.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:47,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=724746.6666666666, ans=0.125 2023-09-30 13:21:48,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:50,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:21:50,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:21:52,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:21:52,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:21:53,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:21:55,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 13:21:57,170 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff2_skip_rate, batch_count=724813.3333333334, ans=0.0 2023-09-30 13:21:58,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:21:58,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:22:02,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:02,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:05,238 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=11.62 vs. limit=22.5 2023-09-30 13:22:09,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:22:09,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 13:22:10,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:22:10,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:10,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 13:22:10,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:22:13,074 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer2.prob, batch_count=724880.0, ans=0.125 2023-09-30 13:22:14,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:22:16,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:22:19,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:22:21,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:22:22,987 INFO [train.py:1039] (3/4) Epoch 21, batch 2500, loss[loss=0.1581, simple_loss=0.236, pruned_loss=0.04009, over 24457.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2496, pruned_loss=0.049, over 4714708.84 frames. ], batch size: 58, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:22:24,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 13:22:24,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:22:31,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:34,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=724946.6666666666, ans=0.1 2023-09-30 13:22:39,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:22:40,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:22:42,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:22:42,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 13:22:49,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:22:49,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:22:49,967 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.skip_rate, batch_count=725013.3333333334, ans=0.035 2023-09-30 13:22:51,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:22:51,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:22:52,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 13:22:54,206 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.323e+02 2.019e+02 2.359e+02 2.829e+02 4.327e+02, threshold=4.718e+02, percent-clipped=1.0 2023-09-30 13:22:54,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:56,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:22:56,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 13:22:56,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:22:57,765 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 13:22:57,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:03,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:23:04,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:23:07,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:23:09,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 13:23:09,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:09,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:13,784 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:13,937 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=725146.6666666666, ans=0.125 2023-09-30 13:23:18,824 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:23:21,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:27,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=725213.3333333334, ans=0.0 2023-09-30 13:23:28,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:23:30,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 13:23:30,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:23:32,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:23:33,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:23:33,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:23:34,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=725213.3333333334, ans=0.0 2023-09-30 13:23:35,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 13:23:35,291 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 13:23:35,300 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 13:23:38,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:23:41,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 13:23:41,827 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 13:23:41,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:23:44,784 INFO [train.py:1039] (3/4) Epoch 21, batch 2550, loss[loss=0.1789, simple_loss=0.2502, pruned_loss=0.05382, over 23715.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.25, pruned_loss=0.04908, over 4720434.22 frames. ], batch size: 179, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:23:44,880 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 13:23:46,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 13:23:49,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:51,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:23:51,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:23:54,985 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:23:55,546 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.59 vs. limit=15.0 2023-09-30 13:23:56,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 13:23:56,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:23:58,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer1.prob, batch_count=725280.0, ans=0.125 2023-09-30 13:24:00,095 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 13:24:03,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:24:05,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:05,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:24:06,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 13:24:06,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:08,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:08,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:08,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=725346.6666666666, ans=0.0 2023-09-30 13:24:11,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:24:11,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 13:24:11,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:24:11,845 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:11,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 13:24:24,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:24:32,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:32,755 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:32,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:24:32,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:24:33,052 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=725480.0, ans=0.125 2023-09-30 13:24:39,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:24:41,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:24:41,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:24:43,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:24:43,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:24:43,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:24:46,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:24:48,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:51,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:24:51,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 13:24:51,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:24:51,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:24:52,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:24:54,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:24:55,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:04,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:25:07,608 INFO [train.py:1039] (3/4) Epoch 21, batch 2600, loss[loss=0.187, simple_loss=0.2665, pruned_loss=0.05373, over 24061.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2511, pruned_loss=0.04956, over 4714564.77 frames. ], batch size: 86, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:25:07,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:08,058 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=725613.3333333334, ans=0.1 2023-09-30 13:25:08,549 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=15.28 vs. limit=22.5 2023-09-30 13:25:09,341 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 13:25:12,777 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 13:25:12,806 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:25:12,868 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 13:25:13,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 13:25:14,450 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 13:25:16,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:25:16,254 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 13:25:17,858 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 13:25:19,307 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 13:25:20,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:25:21,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=725613.3333333334, ans=0.125 2023-09-30 13:25:24,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 13:25:25,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 13:25:25,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:25:27,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 13:25:29,544 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 13:25:29,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 13:25:30,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=725680.0, ans=0.125 2023-09-30 13:25:37,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:37,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:37,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:37,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 13:25:39,020 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.842e+02 2.050e+02 2.225e+02 3.222e+02, threshold=4.100e+02, percent-clipped=0.0 2023-09-30 13:25:39,496 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=725746.6666666666, ans=0.0 2023-09-30 13:25:42,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:25:49,000 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 13:25:53,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:25:55,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:25:55,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 13:25:55,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:25:55,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:25:57,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 13:25:58,747 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.80 vs. limit=15.0 2023-09-30 13:26:00,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:26:00,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:26:02,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:04,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=725813.3333333334, ans=0.0 2023-09-30 13:26:07,523 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 13:26:07,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:07,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:26:15,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:26:15,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:26:15,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 13:26:17,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:26:19,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:21,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:27,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 13:26:27,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:28,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:26:30,207 INFO [train.py:1039] (3/4) Epoch 21, batch 2650, loss[loss=0.1844, simple_loss=0.2662, pruned_loss=0.05132, over 23998.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2523, pruned_loss=0.0497, over 4725272.69 frames. ], batch size: 86, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:26:33,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 13:26:33,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:34,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:26:35,517 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 13:26:36,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:26:40,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:26:42,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:26:45,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:26:46,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:26:48,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 13:26:48,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:26:48,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:26:51,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 13:26:54,508 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 13:26:56,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:26:59,217 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 13:26:59,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:00,775 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 13:27:05,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:05,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:27:06,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:06,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:13,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 13:27:13,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 13:27:16,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:27:19,938 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 13:27:19,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:27:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:21,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:23,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:23,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:27:26,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:28,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:27:29,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:27:29,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:27:32,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:32,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=726146.6666666666, ans=0.0 2023-09-30 13:27:33,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:27:34,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:35,196 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=726213.3333333334, ans=0.2 2023-09-30 13:27:35,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=726213.3333333334, ans=0.0 2023-09-30 13:27:37,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:27:37,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:27:39,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=726213.3333333334, ans=0.0 2023-09-30 13:27:40,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:41,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:27:41,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:27:41,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 13:27:46,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:27:46,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:47,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:27:49,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:51,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:27:51,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:27:52,796 INFO [train.py:1039] (3/4) Epoch 21, batch 2700, loss[loss=0.1694, simple_loss=0.2502, pruned_loss=0.04426, over 24340.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2535, pruned_loss=0.05045, over 4720668.51 frames. ], batch size: 61, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:27:55,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:27:55,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 13:27:58,417 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=726280.0, ans=0.125 2023-09-30 13:27:59,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:01,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:28:04,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:28:04,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:04,120 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:05,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:28:05,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:05,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:28:05,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 13:28:05,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 13:28:07,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:28:10,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:28:10,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:28:10,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:28:15,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:28:16,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 13:28:16,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:28:22,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:28:22,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:28:23,286 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.895e+02 2.064e+02 2.321e+02 3.195e+02, threshold=4.129e+02, percent-clipped=0.0 2023-09-30 13:28:27,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:28:27,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:28:27,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:28:27,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:28:31,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:34,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:34,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:28:34,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:28:39,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:39,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:28:44,325 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=726480.0, ans=0.1 2023-09-30 13:28:48,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:28:48,738 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:28:51,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:28:51,821 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:28:52,090 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff3_skip_rate, batch_count=726480.0, ans=0.0 2023-09-30 13:28:53,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=726480.0, ans=0.125 2023-09-30 13:28:57,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:28:57,489 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=726546.6666666666, ans=0.125 2023-09-30 13:28:58,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:28:58,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:28:58,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:01,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:29:01,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:04,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:29:05,229 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=7.90 vs. limit=15.0 2023-09-30 13:29:05,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:05,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:29:10,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 13:29:12,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:15,390 INFO [train.py:1039] (3/4) Epoch 21, batch 2750, loss[loss=0.1736, simple_loss=0.2349, pruned_loss=0.05613, over 23459.00 frames. ], tot_loss[loss=0.1773, simple_loss=0.2531, pruned_loss=0.05077, over 4713626.54 frames. ], batch size: 285, lr: 4.90e-03, grad_scale: 16.0 2023-09-30 13:29:15,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:29:15,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 13:29:15,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 13:29:15,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:20,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:21,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:23,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:23,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:29:23,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:28,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:29:28,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:29:28,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:29:30,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:30,410 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 13:29:30,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:29:30,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:29:36,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 13:29:37,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:29:38,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:39,512 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:29:39,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:29:41,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:29:42,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:29:42,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:42,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:29:47,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:29:47,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:29:49,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:29:51,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:29:52,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:30:00,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:03,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:30:03,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:08,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:30:08,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:30:08,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:30:09,846 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=726813.3333333334, ans=0.125 2023-09-30 13:30:15,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:30:15,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:30:15,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 13:30:20,777 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=726880.0, ans=0.125 2023-09-30 13:30:21,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:23,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 13:30:26,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:30:29,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:30:29,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 13:30:31,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:30:32,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:30:34,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 13:30:35,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:30:35,501 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=726880.0, ans=0.125 2023-09-30 13:30:37,940 INFO [train.py:1039] (3/4) Epoch 21, batch 2800, loss[loss=0.1553, simple_loss=0.2302, pruned_loss=0.04017, over 24446.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.252, pruned_loss=0.05023, over 4711745.14 frames. ], batch size: 58, lr: 4.89e-03, grad_scale: 32.0 2023-09-30 13:30:38,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:30:39,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:30:39,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:30:41,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 13:30:41,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:42,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:44,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:30:44,984 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 13:30:44,985 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 13:30:47,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:30:49,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:30:49,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:30:49,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=726946.6666666666, ans=0.1 2023-09-30 13:30:53,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:30:56,081 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 13:30:57,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:30:58,715 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.60 vs. limit=12.0 2023-09-30 13:30:59,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 13:31:00,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:00,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:31:02,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:05,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:05,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:05,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:31:07,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:10,732 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.859e+02 2.240e+02 2.757e+02 3.972e+02, threshold=4.479e+02, percent-clipped=0.0 2023-09-30 13:31:15,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:31:17,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:31:21,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:22,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:31:24,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:28,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 13:31:28,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:29,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:29,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:31:34,149 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:31:35,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:37,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:31:40,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:31:40,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:31:40,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:31:42,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:31:42,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:31:44,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:31:44,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 13:31:44,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:46,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:31:46,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:31:47,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 13:31:47,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:31:47,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:31:48,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:31:48,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 13:31:57,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:31:57,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:31:58,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:32:01,339 INFO [train.py:1039] (3/4) Epoch 21, batch 2850, loss[loss=0.1694, simple_loss=0.2394, pruned_loss=0.04969, over 23584.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2515, pruned_loss=0.04984, over 4711545.69 frames. ], batch size: 135, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:32:01,456 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:03,732 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=15.98 vs. limit=22.5 2023-09-30 13:32:04,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=727280.0, ans=0.125 2023-09-30 13:32:06,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:06,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:06,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:32:09,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:09,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:32:10,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:32:12,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 13:32:20,264 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 13:32:20,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:21,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 13:32:23,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:25,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 13:32:27,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 13:32:28,043 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff3_skip_rate, batch_count=727346.6666666666, ans=0.0 2023-09-30 13:32:29,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:31,553 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=727346.6666666666, ans=0.0 2023-09-30 13:32:38,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=727413.3333333334, ans=0.0 2023-09-30 13:32:40,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:42,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:42,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:32:42,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=727413.3333333334, ans=0.0 2023-09-30 13:32:43,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 13:32:43,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:32:43,970 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:32:45,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:32:45,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 13:32:47,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:32:47,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:32:48,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:32:48,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:32:49,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=727480.0, ans=0.0 2023-09-30 13:32:52,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:52,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:32:53,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:32:55,259 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:32:57,432 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:32:57,662 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=727480.0, ans=0.125 2023-09-30 13:32:58,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:00,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:03,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:33:03,742 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=727480.0, ans=0.125 2023-09-30 13:33:04,322 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.96 vs. limit=10.0 2023-09-30 13:33:08,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:33:10,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 13:33:10,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 13:33:11,773 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:33:13,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:13,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 13:33:13,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:33:14,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:14,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:14,960 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:33:14,961 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 13:33:16,421 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 13:33:16,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:16,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:21,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:21,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:33:22,472 INFO [train.py:1039] (3/4) Epoch 21, batch 2900, loss[loss=0.137, simple_loss=0.2161, pruned_loss=0.02891, over 24281.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2516, pruned_loss=0.04998, over 4700872.95 frames. ], batch size: 56, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:33:22,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:33:24,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 13:33:29,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:29,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 13:33:30,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 13:33:33,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:33:33,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:33:34,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:36,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:33:40,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:33:40,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:33:43,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:33:43,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 13:33:44,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:33:46,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:33:48,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 13:33:48,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 13:33:50,124 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.64 vs. limit=6.0 2023-09-30 13:33:51,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:33:51,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 13:33:51,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:33:54,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:33:54,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 13:33:55,589 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.550e+02 1.882e+02 2.103e+02 2.412e+02 3.503e+02, threshold=4.205e+02, percent-clipped=0.0 2023-09-30 13:33:57,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:33:58,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:04,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:34:05,495 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.77 vs. limit=10.0 2023-09-30 13:34:07,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:09,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 13:34:09,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 13:34:09,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:34:13,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:34:15,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 13:34:16,704 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:34:21,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:34:32,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:34:32,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:34:32,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 13:34:33,964 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.scale_min, batch_count=727880.0, ans=0.2 2023-09-30 13:34:37,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:39,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 13:34:39,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:40,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:34:45,994 INFO [train.py:1039] (3/4) Epoch 21, batch 2950, loss[loss=0.2653, simple_loss=0.3122, pruned_loss=0.1092, over 19620.00 frames. ], tot_loss[loss=0.1763, simple_loss=0.2523, pruned_loss=0.05015, over 4702753.34 frames. ], batch size: 388, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:34:46,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:34:47,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 13:34:47,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:34:47,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:34:49,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:34:52,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:34:53,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 13:34:54,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 13:34:55,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:34:55,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:35:02,400 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:02,817 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=728013.3333333334, ans=0.1 2023-09-30 13:35:04,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:05,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:07,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:12,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:12,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:35:14,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:35:15,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:35:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 13:35:23,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 13:35:23,916 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 13:35:25,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:35:27,459 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 13:35:29,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 13:35:29,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:35:29,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:35:29,134 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 13:35:29,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:35:32,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 13:35:33,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:35:33,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:35:35,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:38,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:35:38,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:38,500 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 13:35:38,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:35:40,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 13:35:43,625 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=728146.6666666666, ans=0.0 2023-09-30 13:35:45,333 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:46,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:35:46,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 13:35:46,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:35:49,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 13:35:51,548 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=728213.3333333334, ans=0.125 2023-09-30 13:35:52,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:35:52,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:35:54,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:35:55,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:35:55,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:35:56,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=728213.3333333334, ans=0.125 2023-09-30 13:35:58,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:35:59,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:35:59,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:35:59,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:36:01,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:36:01,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:36:02,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:02,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 13:36:04,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:36:06,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:36:06,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:36:09,252 INFO [train.py:1039] (3/4) Epoch 21, batch 3000, loss[loss=0.1861, simple_loss=0.258, pruned_loss=0.05713, over 22806.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2533, pruned_loss=0.05046, over 4714738.21 frames. ], batch size: 322, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:36:09,253 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 13:36:24,064 INFO [train.py:1071] (3/4) Epoch 21, validation: loss=0.3084, simple_loss=0.2796, pruned_loss=0.1686, over 1125622.00 frames. 2023-09-30 13:36:24,065 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 13:36:25,700 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 13:36:25,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 13:36:28,736 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=13.34 vs. limit=15.0 2023-09-30 13:36:30,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:36:30,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:36:30,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 13:36:32,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:36:32,813 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=728280.0, ans=0.2 2023-09-30 13:36:38,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:36:47,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:36:49,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.attention_skip_rate, batch_count=728346.6666666666, ans=0.0 2023-09-30 13:36:54,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 13:36:54,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:36:58,429 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.540e+02 1.897e+02 2.038e+02 2.302e+02 2.961e+02, threshold=4.076e+02, percent-clipped=0.0 2023-09-30 13:37:00,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:37:00,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:37:00,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:04,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:04,205 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 13:37:05,914 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 13:37:07,930 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1.whitening_limit, batch_count=728413.3333333334, ans=10.0 2023-09-30 13:37:08,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:37:08,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:37:11,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:37:11,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:11,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:11,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:37:17,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:37:18,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:37:18,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:37:20,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:37:22,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 13:37:24,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:37:24,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:24,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:37:26,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:28,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:29,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 13:37:29,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 13:37:31,343 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.56 vs. limit=22.5 2023-09-30 13:37:31,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:37:31,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 13:37:31,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:37:35,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 13:37:39,056 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:37:40,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:37:40,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 13:37:40,812 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 13:37:40,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:37:40,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:37:42,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:37:42,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:37:42,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:43,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:37:45,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 13:37:46,983 INFO [train.py:1039] (3/4) Epoch 21, batch 3050, loss[loss=0.1798, simple_loss=0.25, pruned_loss=0.05486, over 23748.00 frames. ], tot_loss[loss=0.1781, simple_loss=0.2543, pruned_loss=0.05088, over 4710885.67 frames. ], batch size: 150, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:37:47,262 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:37:49,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=728613.3333333334, ans=0.2 2023-09-30 13:37:50,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:37:51,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:37:56,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:37:56,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=728613.3333333334, ans=0.125 2023-09-30 13:37:59,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 13:38:04,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 13:38:06,921 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 13:38:06,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:11,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:38:16,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:16,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:16,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:17,374 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=11.59 vs. limit=15.0 2023-09-30 13:38:19,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:20,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:38:20,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:22,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:38:22,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:23,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:25,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:27,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:27,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 13:38:28,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:38:28,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:38:33,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:38:33,703 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.99 vs. limit=22.5 2023-09-30 13:38:34,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 13:38:34,793 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:38:34,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:40,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:38:41,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:38:45,904 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.50 vs. limit=22.5 2023-09-30 13:38:48,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:38:49,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:38:51,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:52,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:38:52,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:38:54,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 13:38:55,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:38:55,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:38:57,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 13:38:58,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:06,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:39:07,096 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.05 vs. limit=15.0 2023-09-30 13:39:07,748 INFO [train.py:1039] (3/4) Epoch 21, batch 3100, loss[loss=0.1737, simple_loss=0.2468, pruned_loss=0.05031, over 23445.00 frames. ], tot_loss[loss=0.1771, simple_loss=0.2534, pruned_loss=0.0504, over 4718887.53 frames. ], batch size: 119, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:39:09,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:39:10,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 13:39:13,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 13:39:14,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 13:39:14,979 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=728946.6666666666, ans=0.125 2023-09-30 13:39:17,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 13:39:17,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:39:21,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:39:21,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:25,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:39:29,204 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.91 vs. limit=15.0 2023-09-30 13:39:30,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:34,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 13:39:39,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:39:39,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:39,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:39:39,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:39:40,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:39:42,282 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.591e+02 1.906e+02 2.121e+02 2.536e+02 4.254e+02, threshold=4.242e+02, percent-clipped=1.0 2023-09-30 13:39:43,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:39:43,864 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 13:39:43,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:39:44,653 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=8.18 vs. limit=15.0 2023-09-30 13:39:45,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:39:47,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 13:39:48,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:39:52,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:39:54,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 13:39:56,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 13:39:56,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:39:57,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:01,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:02,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:02,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:40:04,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:40:04,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:40:05,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:40:05,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:05,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:05,823 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 13:40:06,408 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=4.29 vs. limit=10.0 2023-09-30 13:40:12,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:40:12,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 13:40:15,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:40:15,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 13:40:15,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:16,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:16,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 13:40:28,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 13:40:30,077 INFO [train.py:1039] (3/4) Epoch 21, batch 3150, loss[loss=0.1814, simple_loss=0.2662, pruned_loss=0.04825, over 23947.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2526, pruned_loss=0.04975, over 4728728.02 frames. ], batch size: 86, lr: 4.89e-03, grad_scale: 8.0 2023-09-30 13:40:30,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:32,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:40:34,739 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:40:35,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:40:36,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 13:40:37,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:37,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 13:40:39,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 13:40:40,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:42,360 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 13:40:44,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 13:40:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:40:45,845 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 13:40:45,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 13:40:47,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 13:40:47,807 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:40:48,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 13:40:48,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 13:40:48,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:48,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:40:50,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:40:50,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 13:40:52,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:40:54,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:40:57,934 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 13:41:02,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 13:41:02,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:41:07,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:41:07,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:41:07,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 13:41:10,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 13:41:11,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:41:12,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:41:12,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:41:13,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:13,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:41:15,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:41:15,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 13:41:16,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 13:41:18,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:41:18,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:18,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=729413.3333333334, ans=10.0 2023-09-30 13:41:19,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:41:19,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:41:21,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 13:41:22,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:24,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 13:41:24,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:26,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 13:41:26,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 13:41:28,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:41:29,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:41:31,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 13:41:31,172 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 13:41:32,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:41:34,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=729480.0, ans=0.1 2023-09-30 13:41:36,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:41:37,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:37,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:41:42,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:41:43,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:45,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 13:41:49,212 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:41:51,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:41:51,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 13:41:52,199 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:41:53,354 INFO [train.py:1039] (3/4) Epoch 21, batch 3200, loss[loss=0.1789, simple_loss=0.2614, pruned_loss=0.04822, over 23761.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2515, pruned_loss=0.0494, over 4726629.26 frames. ], batch size: 85, lr: 4.89e-03, grad_scale: 16.0 2023-09-30 13:41:56,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:41:56,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=729613.3333333334, ans=0.125 2023-09-30 13:41:58,033 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:41:58,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 13:42:00,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:42:06,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:42:10,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:42:12,739 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass_mid.scale_min, batch_count=729680.0, ans=0.2 2023-09-30 13:42:18,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:42:28,690 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.936e+02 2.183e+02 2.546e+02 4.680e+02, threshold=4.365e+02, percent-clipped=1.0 2023-09-30 13:42:28,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 13:42:30,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:42:34,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 13:42:35,121 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=192, metric=6.56 vs. limit=15.0 2023-09-30 13:42:35,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:42:39,350 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=9.23 vs. limit=15.0 2023-09-30 13:42:40,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:42:40,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:42:41,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:42:43,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=729813.3333333334, ans=0.1 2023-09-30 13:42:47,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 13:42:48,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 13:42:50,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 13:42:54,053 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 13:42:55,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:42:59,210 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=729880.0, ans=0.1 2023-09-30 13:43:01,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 13:43:03,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:03,468 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 13:43:03,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:43:06,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:08,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 13:43:10,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 13:43:10,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 13:43:13,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 13:43:14,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:43:16,205 INFO [train.py:1039] (3/4) Epoch 21, batch 3250, loss[loss=0.178, simple_loss=0.253, pruned_loss=0.05148, over 23609.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2508, pruned_loss=0.04928, over 4722608.62 frames. ], batch size: 149, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:43:17,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:43:17,798 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 13:43:17,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:19,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:19,424 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 13:43:23,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:43:26,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:43:27,345 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:43:34,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:43:34,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 13:43:36,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:43:36,298 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:43:36,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:39,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:39,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:43:42,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:42,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:43:42,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:44,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:44,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:43:47,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:43:48,063 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.min_positive, batch_count=730080.0, ans=0.025 2023-09-30 13:43:49,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:43:50,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:50,803 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:43:52,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:43:54,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:43:54,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:43:59,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 13:44:00,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:44:00,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:44:01,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=730080.0, ans=0.0 2023-09-30 13:44:02,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:02,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:44:08,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:44:09,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730146.6666666666, ans=0.1 2023-09-30 13:44:09,381 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=6.38 vs. limit=15.0 2023-09-30 13:44:14,645 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:16,028 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:16,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 13:44:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:44:16,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:44:16,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:19,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 13:44:19,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 13:44:21,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:44:21,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:21,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=730213.3333333334, ans=0.125 2023-09-30 13:44:22,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:22,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 13:44:24,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:44:24,851 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=730213.3333333334, ans=0.0 2023-09-30 13:44:28,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:28,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:28,355 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=730213.3333333334, ans=0.0 2023-09-30 13:44:29,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 13:44:29,618 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:33,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:44:33,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 13:44:37,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:44:37,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 13:44:38,543 INFO [train.py:1039] (3/4) Epoch 21, batch 3300, loss[loss=0.1734, simple_loss=0.2634, pruned_loss=0.0417, over 24478.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2521, pruned_loss=0.04953, over 4731006.79 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:44:38,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 13:44:40,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 13:44:41,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:44:46,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:44:47,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:44:47,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:44:48,164 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=730280.0, ans=0.125 2023-09-30 13:44:50,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 13:44:50,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:44:54,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:44:56,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:44:56,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.prob, batch_count=730346.6666666666, ans=0.125 2023-09-30 13:45:01,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 13:45:01,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:01,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:01,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=730346.6666666666, ans=0.04949747468305833 2023-09-30 13:45:03,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:03,354 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 13:45:04,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:06,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:45:07,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 13:45:07,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:09,808 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 13:45:13,418 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.541e+02 1.837e+02 2.019e+02 2.374e+02 3.048e+02, threshold=4.038e+02, percent-clipped=0.0 2023-09-30 13:45:13,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:13,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 13:45:16,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:16,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 13:45:18,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 13:45:18,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:19,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:45:22,563 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 13:45:24,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 13:45:25,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:45:27,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 13:45:28,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:45:32,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 13:45:34,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:45:35,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:45:37,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:37,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:45:37,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:45:40,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:45:40,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:42,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:45:43,853 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 13:45:45,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 13:45:47,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:45:47,831 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:45:47,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:48,139 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=730546.6666666666, ans=0.125 2023-09-30 13:45:49,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:45:49,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:45:50,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:45:51,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:52,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:45:53,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:45:54,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:45:56,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 13:45:57,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:45:58,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:45:59,871 INFO [train.py:1039] (3/4) Epoch 21, batch 3350, loss[loss=0.1845, simple_loss=0.2569, pruned_loss=0.05603, over 23853.00 frames. ], tot_loss[loss=0.1761, simple_loss=0.2529, pruned_loss=0.04962, over 4734810.59 frames. ], batch size: 195, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:46:01,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 13:46:01,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:46:04,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:04,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:46:04,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:07,815 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=9.24 vs. limit=15.0 2023-09-30 13:46:09,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:46:10,450 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=730613.3333333334, ans=0.125 2023-09-30 13:46:11,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:11,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:46:13,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:16,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:46:18,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:20,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:46:21,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 13:46:23,887 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 13:46:23,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:46:27,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 13:46:27,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 13:46:27,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=730680.0, ans=0.125 2023-09-30 13:46:28,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:46:28,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:46:31,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:31,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 13:46:31,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:33,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:46:34,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:36,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:36,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.prob, batch_count=730746.6666666666, ans=0.125 2023-09-30 13:46:37,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:37,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:46:39,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=730746.6666666666, ans=0.2 2023-09-30 13:46:40,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:43,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:43,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:46:46,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:46:48,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:46:48,615 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=730813.3333333334, ans=0.1 2023-09-30 13:46:51,318 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:46:51,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:52,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:46:53,711 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.27 vs. limit=22.5 2023-09-30 13:46:56,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 13:46:56,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:46:56,659 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 13:46:56,723 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:46:58,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 13:46:58,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:00,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:47:06,636 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=730880.0, ans=0.0 2023-09-30 13:47:07,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:09,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 13:47:10,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:12,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:47:12,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:47:17,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:20,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 13:47:20,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:47:20,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:47:22,753 INFO [train.py:1039] (3/4) Epoch 21, batch 3400, loss[loss=0.1757, simple_loss=0.2378, pruned_loss=0.05677, over 22604.00 frames. ], tot_loss[loss=0.1774, simple_loss=0.2541, pruned_loss=0.05032, over 4736122.06 frames. ], batch size: 322, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:47:24,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:47:25,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 13:47:27,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:47:27,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 13:47:29,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:47:29,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 13:47:31,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:47:31,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 13:47:36,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 13:47:36,454 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 13:47:36,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:47:40,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:47:40,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 13:47:42,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:47:43,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:47:49,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:47:50,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 13:47:58,450 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.479e+02 1.830e+02 2.005e+02 2.277e+02 4.714e+02, threshold=4.010e+02, percent-clipped=1.0 2023-09-30 13:47:58,558 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:47:58,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:00,186 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:00,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 13:48:07,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:48:07,552 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=731080.0, ans=0.125 2023-09-30 13:48:12,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 13:48:16,933 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:48:17,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 13:48:17,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:18,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:48:19,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:48:20,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:48:23,056 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=8.55 vs. limit=10.0 2023-09-30 13:48:23,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:48:26,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:48:26,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:48:30,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:34,218 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 13:48:40,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:48:45,281 INFO [train.py:1039] (3/4) Epoch 21, batch 3450, loss[loss=0.1925, simple_loss=0.2761, pruned_loss=0.05443, over 24020.00 frames. ], tot_loss[loss=0.177, simple_loss=0.2529, pruned_loss=0.05055, over 4712699.67 frames. ], batch size: 80, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:48:45,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 13:48:47,285 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff3_skip_rate, batch_count=731280.0, ans=0.0 2023-09-30 13:48:50,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 13:48:50,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:48:51,685 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:48:51,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 13:48:53,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:48:55,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:49:01,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:49:01,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:04,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:49:04,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:05,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:11,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 13:49:15,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=731346.6666666666, ans=0.125 2023-09-30 13:49:18,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 13:49:20,039 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 13:49:20,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:49:21,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:26,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 13:49:26,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:49:26,694 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=731413.3333333334, ans=0.125 2023-09-30 13:49:32,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:49:32,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:49:33,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 13:49:33,808 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten1.whitening_limit, batch_count=731480.0, ans=10.0 2023-09-30 13:49:36,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:49:39,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 13:49:39,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:49:41,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:49:44,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:49:44,776 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=731480.0, ans=0.2 2023-09-30 13:49:45,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 13:49:49,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:49:49,405 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=731546.6666666666, ans=0.2 2023-09-30 13:49:55,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:49:57,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:49:58,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:02,216 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=731546.6666666666, ans=0.125 2023-09-30 13:50:03,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:03,525 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:50:05,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:50:05,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:50:07,250 INFO [train.py:1039] (3/4) Epoch 21, batch 3500, loss[loss=0.1557, simple_loss=0.2442, pruned_loss=0.03358, over 24440.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2511, pruned_loss=0.05042, over 4700813.40 frames. ], batch size: 69, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:50:07,671 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 13:50:10,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:13,909 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:50:14,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 13:50:15,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 13:50:18,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 13:50:21,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:50:21,607 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 13:50:27,452 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.53 vs. limit=12.0 2023-09-30 13:50:28,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:50:29,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:50:29,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:50:29,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:29,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 13:50:31,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:31,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:31,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 13:50:34,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:35,926 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:50:37,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:41,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:43,333 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.924e+02 2.163e+02 2.586e+02 4.135e+02, threshold=4.325e+02, percent-clipped=1.0 2023-09-30 13:50:43,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 13:50:43,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:50:46,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:50:48,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 13:50:48,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:49,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:50:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:51,301 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 13:50:52,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 13:50:54,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 13:50:54,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:50:55,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:50:57,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:50:57,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 13:50:59,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:51:00,246 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.96 vs. limit=10.0 2023-09-30 13:51:01,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:51:07,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:07,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 13:51:07,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 13:51:07,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:12,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:12,522 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:14,073 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:17,628 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 13:51:19,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:51:20,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:51:20,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 13:51:23,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 13:51:25,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:51:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:51:26,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:26,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:28,333 INFO [train.py:1039] (3/4) Epoch 21, batch 3550, loss[loss=0.1612, simple_loss=0.2356, pruned_loss=0.04344, over 24426.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2492, pruned_loss=0.04999, over 4696759.96 frames. ], batch size: 58, lr: 4.88e-03, grad_scale: 8.0 2023-09-30 13:51:30,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 13:51:34,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=731946.6666666666, ans=0.125 2023-09-30 13:51:41,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:43,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 13:51:45,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:51:46,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=732013.3333333334, ans=0.0 2023-09-30 13:51:48,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 13:51:49,447 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn1.whiten, num_groups=1, num_channels=512, metric=11.83 vs. limit=22.5 2023-09-30 13:51:50,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:51:50,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:51:50,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:51:54,219 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:51:54,505 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.attention_skip_rate, batch_count=732013.3333333334, ans=0.0 2023-09-30 13:51:55,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:51:57,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:51:57,350 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:51:58,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 13:52:00,691 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=732080.0, ans=0.125 2023-09-30 13:52:03,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 13:52:03,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 13:52:06,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:06,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:52:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:52:08,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 13:52:08,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:10,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:12,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 13:52:16,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:18,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:52:18,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:52:19,259 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.38 vs. limit=15.0 2023-09-30 13:52:21,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 13:52:22,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:52:23,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 13:52:25,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 13:52:27,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 13:52:27,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:52:30,232 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-09-30 13:52:30,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 13:52:32,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:38,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:52:39,925 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 13:52:40,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:44,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:52:46,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 13:52:51,310 INFO [train.py:1039] (3/4) Epoch 21, batch 3600, loss[loss=0.1651, simple_loss=0.2405, pruned_loss=0.04484, over 24631.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2498, pruned_loss=0.04972, over 4711454.89 frames. ], batch size: 60, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:52:51,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 13:52:52,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:52:54,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 13:52:56,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:52:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:53:02,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:03,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:04,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 13:53:05,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 13:53:06,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:07,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 13:53:10,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:53:11,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:11,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=732346.6666666666, ans=0.1 2023-09-30 13:53:14,897 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=732346.6666666666, ans=0.125 2023-09-30 13:53:16,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:19,966 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:21,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 13:53:21,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:53:21,612 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 13:53:23,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 13:53:24,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:53:26,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 13:53:27,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.568e+02 1.948e+02 2.290e+02 2.674e+02 4.312e+02, threshold=4.579e+02, percent-clipped=0.0 2023-09-30 13:53:27,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:31,310 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:53:32,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:53:33,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 13:53:40,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:53:41,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:53:43,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 13:53:48,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:53:49,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=732480.0, ans=0.1 2023-09-30 13:53:51,441 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=732480.0, ans=0.2 2023-09-30 13:53:54,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:53:59,135 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:00,894 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=732546.6666666666, ans=0.0 2023-09-30 13:54:05,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 13:54:06,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:54:06,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 13:54:06,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 13:54:08,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 13:54:11,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:54:11,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:54:12,709 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 13:54:14,040 INFO [train.py:1039] (3/4) Epoch 21, batch 3650, loss[loss=0.1843, simple_loss=0.2522, pruned_loss=0.05824, over 22879.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2501, pruned_loss=0.04945, over 4714076.88 frames. ], batch size: 322, lr: 4.88e-03, grad_scale: 16.0 2023-09-30 13:54:14,148 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:14,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:54:14,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:14,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 13:54:15,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 13:54:18,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:54:20,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 13:54:25,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 13:54:26,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:54:26,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=732613.3333333334, ans=0.2 2023-09-30 13:54:29,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 13:54:31,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 13:54:36,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:54:36,483 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 13:54:36,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 13:54:38,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 13:54:38,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:54:41,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 13:54:42,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 13:54:42,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:54:43,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 13:54:45,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:54:45,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:54:45,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:54:48,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:54:49,315 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=732746.6666666666, ans=0.0 2023-09-30 13:54:50,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 13:54:52,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 13:54:52,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:54:53,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 13:54:55,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:54:55,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:55:01,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 13:55:03,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:03,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 13:55:06,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 13:55:06,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:55:09,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:55:12,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:12,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:12,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:55:17,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 13:55:17,224 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:55:17,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:23,803 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 13:55:26,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:26,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:55:27,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 13:55:28,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:29,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 13:55:31,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:34,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 13:55:34,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:35,779 INFO [train.py:1039] (3/4) Epoch 21, batch 3700, loss[loss=0.1843, simple_loss=0.2506, pruned_loss=0.05901, over 23622.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2514, pruned_loss=0.0497, over 4705931.56 frames. ], batch size: 256, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:55:37,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 13:55:37,865 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_skip_rate, batch_count=732946.6666666666, ans=0.0 2023-09-30 13:55:41,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:55:41,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:55:44,313 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:44,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 13:55:44,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:55:45,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 13:55:45,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 13:55:49,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 13:55:52,523 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.77 vs. limit=6.0 2023-09-30 13:55:54,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:55:55,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:55:56,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 13:55:56,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:55:57,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:56:00,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:00,899 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 13:56:08,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 13:56:08,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 13:56:10,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 13:56:10,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 13:56:10,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:11,745 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.508e+02 1.839e+02 1.995e+02 2.258e+02 3.801e+02, threshold=3.991e+02, percent-clipped=0.0 2023-09-30 13:56:15,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:16,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 13:56:18,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:18,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:56:22,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:22,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 13:56:26,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 13:56:26,682 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=733146.6666666666, ans=0.0 2023-09-30 13:56:29,786 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:56:29,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 13:56:29,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:56:29,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 13:56:36,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:56:37,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:56:39,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:39,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 13:56:42,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:56:42,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 13:56:42,683 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=733213.3333333334, ans=0.2 2023-09-30 13:56:43,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:43,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:56:47,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:56:47,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 13:56:48,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 13:56:50,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 13:56:50,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:56:51,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 13:56:51,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 13:56:52,532 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.43 vs. limit=22.5 2023-09-30 13:56:54,241 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.max_positive, batch_count=733213.3333333334, ans=0.95 2023-09-30 13:56:57,573 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:56:59,058 INFO [train.py:1039] (3/4) Epoch 21, batch 3750, loss[loss=0.1644, simple_loss=0.2476, pruned_loss=0.04056, over 24653.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2521, pruned_loss=0.05035, over 4702381.21 frames. ], batch size: 65, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 13:56:59,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:56:59,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:02,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 13:57:04,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 13:57:04,734 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=733280.0, ans=0.125 2023-09-30 13:57:06,525 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.37 vs. limit=15.0 2023-09-30 13:57:07,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 13:57:07,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 13:57:08,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:57:09,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:10,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:57:12,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:12,467 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.balancer1.prob, batch_count=733280.0, ans=0.125 2023-09-30 13:57:15,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:18,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 13:57:18,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:57:21,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:57:24,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:24,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 13:57:24,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:27,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:27,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:57:32,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 13:57:35,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 13:57:37,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 13:57:38,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 13:57:40,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:57:44,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:46,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 13:57:52,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 13:57:55,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:57:59,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 13:57:59,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 13:58:04,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 13:58:06,862 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=733546.6666666666, ans=0.2 2023-09-30 13:58:08,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 13:58:09,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 13:58:11,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 13:58:13,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 13:58:17,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 13:58:20,406 INFO [train.py:1039] (3/4) Epoch 21, batch 3800, loss[loss=0.1436, simple_loss=0.2212, pruned_loss=0.03298, over 24363.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2515, pruned_loss=0.04979, over 4704725.71 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:58:25,156 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 13:58:25,580 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=733613.3333333334, ans=0.125 2023-09-30 13:58:28,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:29,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 13:58:30,032 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=733613.3333333334, ans=0.04949747468305833 2023-09-30 13:58:31,322 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 13:58:32,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:36,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:36,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 13:58:40,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 13:58:40,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:58:40,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 13:58:42,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 13:58:43,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 13:58:43,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:43,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 13:58:47,631 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 13:58:48,195 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.36 vs. limit=6.0 2023-09-30 13:58:48,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 13:58:50,812 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=733680.0, ans=0.125 2023-09-30 13:58:53,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:58:55,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 13:58:56,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 13:58:58,106 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.518e+02 1.864e+02 2.186e+02 2.612e+02 3.955e+02, threshold=4.372e+02, percent-clipped=0.0 2023-09-30 13:58:58,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 13:58:58,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:58:59,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:01,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 13:59:01,965 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.attention_skip_rate, batch_count=733746.6666666666, ans=0.0 2023-09-30 13:59:06,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 13:59:06,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 13:59:07,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:12,446 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=733813.3333333334, ans=0.2 2023-09-30 13:59:15,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:16,277 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=733813.3333333334, ans=0.125 2023-09-30 13:59:21,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 13:59:22,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 13:59:24,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 13:59:25,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 13:59:28,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 13:59:30,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:31,633 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 13:59:34,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 13:59:34,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 13:59:34,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:36,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 13:59:37,007 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=3.91 vs. limit=10.0 2023-09-30 13:59:41,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 13:59:41,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 13:59:43,081 INFO [train.py:1039] (3/4) Epoch 21, batch 3850, loss[loss=0.1484, simple_loss=0.231, pruned_loss=0.03297, over 24322.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2506, pruned_loss=0.04968, over 4706117.22 frames. ], batch size: 56, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 13:59:46,187 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.71 vs. limit=15.0 2023-09-30 13:59:48,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 13:59:50,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 13:59:50,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 13:59:50,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=733946.6666666666, ans=0.125 2023-09-30 13:59:52,275 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 13:59:55,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 13:59:59,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:01,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:00:01,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 14:00:09,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:10,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:00:15,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:15,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:00:16,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.skip_rate, batch_count=734080.0, ans=0.07 2023-09-30 14:00:18,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:19,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:00:21,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:21,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:00:23,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:24,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:00:26,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:26,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:00:26,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 14:00:26,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 14:00:27,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:00:27,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:31,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:31,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 14:00:32,102 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_skip_rate, batch_count=734146.6666666666, ans=0.0 2023-09-30 14:00:34,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 14:00:36,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:38,507 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.43 vs. limit=22.5 2023-09-30 14:00:39,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 14:00:40,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 14:00:46,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:47,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:00:50,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:00:52,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 14:00:54,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 14:00:59,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:00:59,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:03,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:01:03,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:01:03,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,730 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:04,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:01:04,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 14:01:04,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:01:06,279 INFO [train.py:1039] (3/4) Epoch 21, batch 3900, loss[loss=0.1821, simple_loss=0.2585, pruned_loss=0.05286, over 23446.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2492, pruned_loss=0.04908, over 4709864.52 frames. ], batch size: 105, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:01:07,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 14:01:07,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:07,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:09,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:01:09,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:11,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:01:11,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:01:11,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:01:12,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:12,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 14:01:12,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:14,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:15,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:16,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:01:17,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:01:22,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:01:22,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:24,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:01:26,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 14:01:27,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:28,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 14:01:28,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:01:31,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 14:01:31,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 14:01:38,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:39,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:01:39,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:01:41,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:01:44,190 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.861e+02 2.057e+02 2.419e+02 3.679e+02, threshold=4.115e+02, percent-clipped=0.0 2023-09-30 14:01:44,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:01:47,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:01:49,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:01:49,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:01:50,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:01:57,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:01:57,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:02:02,173 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=3.66 vs. limit=6.0 2023-09-30 14:02:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:02:06,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:02:17,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:19,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:21,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 14:02:21,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 14:02:22,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:02:22,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 14:02:24,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:02:25,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 14:02:26,155 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=734546.6666666666, ans=0.0 2023-09-30 14:02:29,003 INFO [train.py:1039] (3/4) Epoch 21, batch 3950, loss[loss=0.1591, simple_loss=0.2334, pruned_loss=0.04236, over 23270.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2484, pruned_loss=0.04893, over 4706964.63 frames. ], batch size: 105, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:02:32,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:02:34,278 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 14:02:34,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:02:38,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:02:38,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=734613.3333333334, ans=0.125 2023-09-30 14:02:41,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:02:45,359 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 14:02:46,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:46,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 14:02:48,849 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 14:02:48,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:02:51,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:51,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:02:51,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:02:55,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 14:02:56,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:02:58,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:02:58,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:02:58,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:02:59,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:03:00,140 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=734680.0, ans=0.125 2023-09-30 14:03:11,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:03:11,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:03:11,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=734746.6666666666, ans=0.2 2023-09-30 14:03:14,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=734746.6666666666, ans=0.0 2023-09-30 14:03:16,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 14:03:24,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 14:03:24,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 14:03:24,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:03:27,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:03:28,041 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=734813.3333333334, ans=0.1 2023-09-30 14:03:31,175 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward3.hidden_balancer.prob, batch_count=734813.3333333334, ans=0.125 2023-09-30 14:03:35,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:03:35,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:03:37,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:03:37,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:03:37,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 14:03:40,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:03:42,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:03:47,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 14:03:52,514 INFO [train.py:1039] (3/4) Epoch 21, batch 4000, loss[loss=0.177, simple_loss=0.2497, pruned_loss=0.05214, over 23788.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2493, pruned_loss=0.04888, over 4718812.28 frames. ], batch size: 212, lr: 4.87e-03, grad_scale: 16.0 2023-09-30 14:03:59,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:02,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=734946.6666666666, ans=0.125 2023-09-30 14:04:05,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:12,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:12,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:04:12,499 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=735013.3333333334, ans=0.0 2023-09-30 14:04:13,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:04:13,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 14:04:15,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:04:15,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 14:04:15,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:04:15,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 14:04:18,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:04:21,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:04:21,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:04:21,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:04:21,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:21,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:04:21,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=735013.3333333334, ans=10.0 2023-09-30 14:04:24,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:04:25,176 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff2_skip_rate, batch_count=735080.0, ans=0.0 2023-09-30 14:04:26,315 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 14:04:28,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:04:28,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:29,850 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.592e+02 1.867e+02 2.046e+02 2.259e+02 3.289e+02, threshold=4.093e+02, percent-clipped=0.0 2023-09-30 14:04:31,593 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 14:04:33,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:04:33,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:38,568 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 14:04:40,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:04:41,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:04:43,153 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 14:04:44,605 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:04:44,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 14:04:44,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:04:46,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:48,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:04:49,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:04:49,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:04:49,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:04:52,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 14:04:53,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:04:55,987 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 14:04:58,020 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:04:59,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:05:03,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 14:05:06,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:05:07,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:08,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:05:09,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:14,491 INFO [train.py:1039] (3/4) Epoch 21, batch 4050, loss[loss=0.1975, simple_loss=0.2654, pruned_loss=0.0648, over 22745.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.25, pruned_loss=0.04917, over 4722519.51 frames. ], batch size: 322, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:05:14,984 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=735280.0, ans=0.125 2023-09-30 14:05:16,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:05:17,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:05:19,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 14:05:20,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:05:22,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:05:22,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:05:24,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:26,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:29,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:05:32,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:05:33,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 14:05:35,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:05:35,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:05:38,475 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=735346.6666666666, ans=10.0 2023-09-30 14:05:40,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=735346.6666666666, ans=0.125 2023-09-30 14:05:40,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:05:43,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:05:46,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 14:05:47,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 14:05:47,994 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 14:05:49,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:05:57,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 14:05:57,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:00,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:03,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:06:05,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:06:05,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:06:09,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:06:14,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 14:06:14,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:06:16,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:17,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 14:06:21,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:06:26,840 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=5.05 vs. limit=15.0 2023-09-30 14:06:28,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 14:06:30,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:06:30,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:06:31,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 14:06:31,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 14:06:31,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:35,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:06:36,992 INFO [train.py:1039] (3/4) Epoch 21, batch 4100, loss[loss=0.2443, simple_loss=0.2995, pruned_loss=0.09452, over 19747.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.04979, over 4723378.59 frames. ], batch size: 389, lr: 4.87e-03, grad_scale: 8.0 2023-09-30 14:06:37,090 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:37,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:06:43,694 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=9.35 vs. limit=22.5 2023-09-30 14:06:44,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 14:06:47,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 14:06:48,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 14:06:49,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 14:06:49,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:51,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:51,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:06:51,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=735613.3333333334, ans=0.125 2023-09-30 14:06:52,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:06:53,110 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 14:06:57,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:06:57,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:06:57,795 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:06:57,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:06:58,050 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:07:01,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:07:02,886 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:07:04,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:07:04,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 14:07:04,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:04,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:07:06,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:06,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:07:07,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 14:07:09,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:11,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 14:07:12,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:07:16,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:07:16,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 14:07:18,229 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.569e+02 1.830e+02 1.986e+02 2.444e+02 3.912e+02, threshold=3.973e+02, percent-clipped=0.0 2023-09-30 14:07:19,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:07:19,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:07:21,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:07:22,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 14:07:24,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:07:26,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:07:29,584 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 14:07:29,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:07:29,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:07:34,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:40,311 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:07:43,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:45,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:07:52,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:07:52,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:07:56,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:07:57,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:07:59,566 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=735946.6666666666, ans=0.125 2023-09-30 14:08:00,532 INFO [train.py:1039] (3/4) Epoch 21, batch 4150, loss[loss=0.1508, simple_loss=0.2347, pruned_loss=0.03341, over 24373.00 frames. ], tot_loss[loss=0.1762, simple_loss=0.2524, pruned_loss=0.04998, over 4716244.65 frames. ], batch size: 61, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:08:02,866 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:08:04,338 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:08:05,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:08:05,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:08,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 14:08:08,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:10,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 14:08:10,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 14:08:11,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 14:08:12,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:08:17,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:08:17,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:21,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:22,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:08:23,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:08:23,659 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=736013.3333333334, ans=0.125 2023-09-30 14:08:25,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:08:25,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:08:27,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:08:31,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:08:35,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:38,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 14:08:39,427 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.83 vs. limit=15.0 2023-09-30 14:08:40,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 14:08:40,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:08:42,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 14:08:42,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:08:42,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:08:43,497 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=736080.0, ans=0.04949747468305833 2023-09-30 14:08:44,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:08:46,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:08:46,975 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=26.56 vs. limit=22.5 2023-09-30 14:08:51,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 14:08:54,469 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:08:55,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:08:57,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 14:08:57,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:08:59,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 14:09:02,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:09:02,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:09:04,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:05,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 14:09:05,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:05,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:09:06,542 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.52 vs. limit=12.0 2023-09-30 14:09:08,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:09:10,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 14:09:10,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:10,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:09:10,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:09:12,492 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 14:09:12,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:09:12,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 14:09:14,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:09:14,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:09:14,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 14:09:15,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:09:20,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:09:22,188 INFO [train.py:1039] (3/4) Epoch 21, batch 4200, loss[loss=0.1714, simple_loss=0.2578, pruned_loss=0.04254, over 24671.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2514, pruned_loss=0.0498, over 4719867.32 frames. ], batch size: 73, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:09:22,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 14:09:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:09:25,754 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=736280.0, ans=0.125 2023-09-30 14:09:27,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:28,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:09:28,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:28,647 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:09:32,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 14:09:35,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 14:09:37,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:38,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:41,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:09:46,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:09:46,951 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:09:47,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:48,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 14:09:48,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:09:48,757 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=736346.6666666666, ans=0.125 2023-09-30 14:09:50,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:51,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:09:51,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:09:53,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:09:54,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 14:09:54,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:09:57,339 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=736413.3333333334, ans=0.125 2023-09-30 14:10:00,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:10:00,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:10:01,434 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.517e+02 1.766e+02 1.995e+02 2.283e+02 3.415e+02, threshold=3.990e+02, percent-clipped=0.0 2023-09-30 14:10:01,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:10:04,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:10:06,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:10:06,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 14:10:07,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:09,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:10:12,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:10:14,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:22,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:10:23,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 14:10:26,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:10:32,059 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=736546.6666666666, ans=0.125 2023-09-30 14:10:33,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:10:33,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:35,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 14:10:41,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:10:45,470 INFO [train.py:1039] (3/4) Epoch 21, batch 4250, loss[loss=0.1796, simple_loss=0.2669, pruned_loss=0.04608, over 24560.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2501, pruned_loss=0.04924, over 4722615.29 frames. ], batch size: 71, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:10:47,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:10:47,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:10:50,753 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=736613.3333333334, ans=0.1 2023-09-30 14:10:51,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:10:56,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:10:56,686 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 14:10:56,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:11:01,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:06,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:06,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=736680.0, ans=0.1 2023-09-30 14:11:09,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:09,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:12,316 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:11:12,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:13,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:16,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:17,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:19,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:11:20,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:21,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 14:11:25,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 14:11:25,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:25,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:11:26,747 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:11:26,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:11:26,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:26,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:11:31,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:11:31,563 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=736746.6666666666, ans=0.125 2023-09-30 14:11:32,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:11:35,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=736813.3333333334, ans=0.0 2023-09-30 14:11:37,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:11:39,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:11:41,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 14:11:41,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:11:41,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 14:11:42,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:11:44,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:11:45,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:11:45,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:11:46,001 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=736813.3333333334, ans=0.0 2023-09-30 14:11:46,987 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=5.70 vs. limit=6.0 2023-09-30 14:11:48,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 14:11:51,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:11:52,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:11:57,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:12:00,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:02,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:12:02,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:03,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:05,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:12:06,786 INFO [train.py:1039] (3/4) Epoch 21, batch 4300, loss[loss=0.1802, simple_loss=0.2502, pruned_loss=0.0551, over 23420.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2503, pruned_loss=0.04925, over 4716737.03 frames. ], batch size: 285, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:12:06,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:06,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 14:12:09,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:13,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:12:14,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=736946.6666666666, ans=0.2 2023-09-30 14:12:15,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:19,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:12:29,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:12:29,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 14:12:29,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:12:33,430 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:12:33,459 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:12:33,506 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 14:12:36,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:12:38,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:12:40,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.min_positive, batch_count=737080.0, ans=0.05 2023-09-30 14:12:41,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 14:12:41,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:12:42,575 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 14:12:44,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:12:45,603 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.504e+02 1.887e+02 2.170e+02 2.542e+02 3.657e+02, threshold=4.340e+02, percent-clipped=0.0 2023-09-30 14:12:45,853 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:12:49,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:12:49,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:12:51,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:12:52,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:12:52,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:12:54,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 14:12:54,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 14:12:54,590 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=737146.6666666666, ans=0.07 2023-09-30 14:12:54,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=737146.6666666666, ans=0.125 2023-09-30 14:12:57,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:12:57,748 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.attention_skip_rate, batch_count=737146.6666666666, ans=0.0 2023-09-30 14:13:01,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:13:01,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:01,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:13:01,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 14:13:02,504 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 14:13:02,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 14:13:04,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:04,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 14:13:06,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 14:13:09,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:11,260 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 14:13:11,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:13:13,771 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.95 vs. limit=15.0 2023-09-30 14:13:14,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:14,407 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:13:15,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 14:13:17,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:13:17,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:19,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:13:19,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:21,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:13:21,396 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=737213.3333333334, ans=0.0 2023-09-30 14:13:22,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:13:25,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:25,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:13:25,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:13:28,549 INFO [train.py:1039] (3/4) Epoch 21, batch 4350, loss[loss=0.1593, simple_loss=0.2369, pruned_loss=0.04084, over 24486.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2501, pruned_loss=0.0491, over 4724179.18 frames. ], batch size: 66, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:13:31,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 14:13:31,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:13:37,227 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:13:41,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:44,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:13:44,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:13:46,675 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer2.prob, batch_count=737346.6666666666, ans=0.125 2023-09-30 14:13:49,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:13:54,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:13:56,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:13:56,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten1.whitening_limit, batch_count=737346.6666666666, ans=10.0 2023-09-30 14:13:57,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:00,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:14:02,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:14:02,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:14:07,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 14:14:08,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:08,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:15,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:16,078 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=737413.3333333334, ans=0.125 2023-09-30 14:14:18,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 14:14:22,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:22,381 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.bypass.skip_rate, batch_count=737480.0, ans=0.09899494936611666 2023-09-30 14:14:23,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:14:25,805 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.29 vs. limit=22.5 2023-09-30 14:14:28,162 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 14:14:31,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:31,798 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:14:33,241 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 14:14:33,343 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 14:14:33,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:34,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:14:35,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:14:36,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:14:38,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:14:38,130 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:14:41,229 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 14:14:41,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:41,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:14:41,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:42,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 14:14:44,312 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 14:14:44,319 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 14:14:44,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 14:14:46,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:14:48,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:14:48,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:14:48,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:14:51,174 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=737613.3333333334, ans=0.0 2023-09-30 14:14:52,196 INFO [train.py:1039] (3/4) Epoch 21, batch 4400, loss[loss=0.1638, simple_loss=0.2407, pruned_loss=0.04346, over 23661.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2516, pruned_loss=0.04984, over 4727994.32 frames. ], batch size: 135, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:14:52,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 14:14:53,745 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 14:14:53,759 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:14:54,738 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=6.12 vs. limit=10.0 2023-09-30 14:14:58,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:14:58,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:00,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:15:01,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 14:15:01,861 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 14:15:03,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 14:15:03,286 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 14:15:03,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:15:05,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:15:07,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 14:15:08,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:10,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:10,246 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 14:15:14,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:14,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 14:15:14,817 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 14:15:18,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 14:15:19,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 14:15:19,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 14:15:20,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:20,171 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:15:21,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:23,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 14:15:23,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 14:15:25,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:26,442 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=5.68 vs. limit=12.0 2023-09-30 14:15:27,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:15:27,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:15:30,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:30,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:15:30,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 14:15:31,465 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.552e+02 1.966e+02 2.237e+02 2.534e+02 3.532e+02, threshold=4.474e+02, percent-clipped=0.0 2023-09-30 14:15:31,665 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 14:15:34,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:15:36,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=737746.6666666666, ans=0.0 2023-09-30 14:15:43,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:15:44,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 14:15:49,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:15:50,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:15:56,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:15:57,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 14:15:58,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:15:58,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:15:58,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:15:58,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:16:00,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=737880.0, ans=0.1 2023-09-30 14:16:02,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 14:16:05,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 14:16:07,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=737880.0, ans=0.0 2023-09-30 14:16:08,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 14:16:08,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:08,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 14:16:08,546 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:16:11,756 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:16:13,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 14:16:15,281 INFO [train.py:1039] (3/4) Epoch 21, batch 4450, loss[loss=0.1823, simple_loss=0.2621, pruned_loss=0.05122, over 24046.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2519, pruned_loss=0.04995, over 4710010.90 frames. ], batch size: 80, lr: 4.86e-03, grad_scale: 16.0 2023-09-30 14:16:17,172 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:16:19,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=737946.6666666666, ans=0.125 2023-09-30 14:16:20,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:20,187 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:16:20,538 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=737946.6666666666, ans=0.125 2023-09-30 14:16:25,216 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:16:25,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:16:30,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:33,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:16:34,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:16:34,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:38,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 14:16:38,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:39,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:16:40,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:16:40,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:16:41,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:16:48,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:48,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:16:49,907 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:16:51,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:16:52,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:16:57,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:16:58,901 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 14:17:00,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 14:17:00,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:17:00,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass.skip_rate, batch_count=738080.0, ans=0.07 2023-09-30 14:17:02,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:03,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 14:17:07,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:17:10,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:12,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 14:17:12,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=738146.6666666666, ans=0.1 2023-09-30 14:17:14,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:14,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:14,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:17:14,235 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:17:15,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:17:20,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:17:21,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 14:17:22,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:17:24,337 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:17:25,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:17:27,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:27,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:17:30,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:17:32,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 14:17:33,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:17:33,979 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:17:37,157 INFO [train.py:1039] (3/4) Epoch 21, batch 4500, loss[loss=0.1644, simple_loss=0.2424, pruned_loss=0.04315, over 24502.00 frames. ], tot_loss[loss=0.176, simple_loss=0.2519, pruned_loss=0.05004, over 4715214.08 frames. ], batch size: 63, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:17:39,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:39,834 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738280.0, ans=0.1 2023-09-30 14:17:41,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 14:17:41,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 14:17:43,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:17:48,947 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:17:50,454 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:17:52,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:17:52,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:17:53,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:17:54,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:05,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:18:06,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:18:09,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:10,967 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:18:11,116 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:18:13,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer1.max_abs, batch_count=738413.3333333334, ans=10.0 2023-09-30 14:18:16,924 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:18:17,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=738413.3333333334, ans=0.07 2023-09-30 14:18:18,455 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.886e+02 2.149e+02 2.495e+02 4.486e+02, threshold=4.299e+02, percent-clipped=1.0 2023-09-30 14:18:18,957 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=738413.3333333334, ans=0.1 2023-09-30 14:18:20,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:18:26,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:18:29,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:18:30,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 14:18:31,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:32,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:18:34,908 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:18:36,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:18:36,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 14:18:36,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:18:36,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:41,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:18:41,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:18:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:18:47,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:18:47,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:18:48,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 14:18:49,220 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.scale_min, batch_count=738546.6666666666, ans=0.2 2023-09-30 14:18:49,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer1.prob, batch_count=738546.6666666666, ans=0.125 2023-09-30 14:18:51,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 14:18:52,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 14:18:55,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 14:18:59,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 14:19:00,964 INFO [train.py:1039] (3/4) Epoch 21, batch 4550, loss[loss=0.1969, simple_loss=0.2551, pruned_loss=0.06931, over 23818.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2499, pruned_loss=0.04972, over 4705627.89 frames. ], batch size: 179, lr: 4.86e-03, grad_scale: 8.0 2023-09-30 14:19:01,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:05,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:05,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:19:08,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:10,669 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.min_positive, batch_count=738613.3333333334, ans=0.05 2023-09-30 14:19:12,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=738613.3333333334, ans=0.0 2023-09-30 14:19:13,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:19:15,231 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:19:16,953 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:16,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:19:16,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:21,931 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:19:21,992 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:19:25,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:19:28,185 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 14:19:28,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 14:19:29,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:19:31,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 14:19:36,253 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=5.65 vs. limit=10.0 2023-09-30 14:19:37,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 14:19:37,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:38,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 14:19:40,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:19:40,811 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=738746.6666666666, ans=0.125 2023-09-30 14:19:42,725 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=14.47 vs. limit=15.0 2023-09-30 14:19:43,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:43,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:19:46,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 14:19:48,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:19:49,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:19:51,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:19:53,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:19:56,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 14:19:56,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 14:19:57,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:19:57,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.53 vs. limit=15.0 2023-09-30 14:19:58,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 14:20:00,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 14:20:01,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:20:01,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:01,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:03,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:03,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:20:05,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:20:05,818 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=738880.0, ans=0.0 2023-09-30 14:20:06,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 14:20:08,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:20:08,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:20:08,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 14:20:08,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:20:09,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 14:20:13,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:20:13,087 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:20:16,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:20:16,211 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=738880.0, ans=0.0 2023-09-30 14:20:17,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:20:17,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:20:17,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=738880.0, ans=0.0 2023-09-30 14:20:19,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:20:20,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:20:22,175 INFO [train.py:1039] (3/4) Epoch 21, batch 4600, loss[loss=0.1698, simple_loss=0.2509, pruned_loss=0.04436, over 24441.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2488, pruned_loss=0.04933, over 4711165.17 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:20:23,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:24,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:20:27,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:20:27,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:20:29,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:31,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 14:20:32,816 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=738946.6666666666, ans=0.125 2023-09-30 14:20:34,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:20:37,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:20:37,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:20:37,701 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=739013.3333333334, ans=0.0 2023-09-30 14:20:41,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:47,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 14:20:48,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:50,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:20:55,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:20:55,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:00,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 14:21:00,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:21:02,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:03,435 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.612e+02 1.897e+02 2.077e+02 2.452e+02 3.334e+02, threshold=4.153e+02, percent-clipped=0.0 2023-09-30 14:21:08,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:08,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:21:10,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:21:17,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 14:21:18,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:21:21,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:23,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:21:26,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:26,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 14:21:26,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:28,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 14:21:28,070 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:28,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:29,711 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:21:31,067 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:21:31,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:32,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 14:21:32,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 14:21:34,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 14:21:34,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:35,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:35,172 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=739213.3333333334, ans=0.125 2023-09-30 14:21:36,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:37,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:21:44,498 INFO [train.py:1039] (3/4) Epoch 21, batch 4650, loss[loss=0.1876, simple_loss=0.2724, pruned_loss=0.05143, over 24468.00 frames. ], tot_loss[loss=0.1732, simple_loss=0.2487, pruned_loss=0.0489, over 4715675.79 frames. ], batch size: 69, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:21:48,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:21:51,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:21:51,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:52,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:21:52,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:21:52,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:21:53,010 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:21:56,177 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=739280.0, ans=0.0 2023-09-30 14:21:57,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 14:22:00,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:22:02,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 14:22:02,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:22:03,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 14:22:03,672 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:22:05,133 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 14:22:05,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 14:22:05,181 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:05,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:22:10,409 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:22:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:12,013 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 14:22:14,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:15,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 14:22:15,666 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=3.69 vs. limit=15.0 2023-09-30 14:22:18,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:18,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:22:18,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 14:22:21,298 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:22:22,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:22:23,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.attention_skip_rate, batch_count=739413.3333333334, ans=0.0 2023-09-30 14:22:25,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:22:28,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:35,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:38,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:22:39,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:22:39,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=739480.0, ans=0.125 2023-09-30 14:22:41,092 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:22:44,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 14:22:44,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 14:22:46,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 14:22:46,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 14:22:47,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:22:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:22:54,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:22:56,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 14:22:56,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:22:58,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:22:58,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:22:59,918 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:23:03,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:23:03,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:23:05,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:23:06,498 INFO [train.py:1039] (3/4) Epoch 21, batch 4700, loss[loss=0.1731, simple_loss=0.2542, pruned_loss=0.046, over 24033.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2487, pruned_loss=0.04853, over 4716339.21 frames. ], batch size: 86, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:23:08,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:09,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:23:09,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:23:09,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 14:23:11,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:23:12,865 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 14:23:14,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=739613.3333333334, ans=0.0 2023-09-30 14:23:19,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:21,115 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:23:21,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:23:22,684 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:24,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:23:30,682 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 14:23:30,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 14:23:35,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:36,498 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:23:36,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:23:39,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:23:46,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:23:46,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 14:23:47,469 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.530e+02 1.882e+02 2.021e+02 2.237e+02 3.621e+02, threshold=4.043e+02, percent-clipped=0.0 2023-09-30 14:23:49,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:23:49,533 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer2.prob, batch_count=739746.6666666666, ans=0.125 2023-09-30 14:23:54,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=739813.3333333334, ans=0.125 2023-09-30 14:23:55,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 14:23:56,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:23:59,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:04,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 14:24:05,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:24:11,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:24:11,451 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=739880.0, ans=0.2 2023-09-30 14:24:12,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 14:24:14,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:14,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:15,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:24:17,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:24:17,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 14:24:17,468 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.5.prob, batch_count=739880.0, ans=0.125 2023-09-30 14:24:18,877 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 14:24:20,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:21,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:21,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 14:24:23,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:24:28,517 INFO [train.py:1039] (3/4) Epoch 21, batch 4750, loss[loss=0.1509, simple_loss=0.2287, pruned_loss=0.03656, over 24335.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2495, pruned_loss=0.04866, over 4726639.56 frames. ], batch size: 61, lr: 4.85e-03, grad_scale: 8.0 2023-09-30 14:24:28,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 14:24:30,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:24:31,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:24:35,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:24:35,822 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=9.49 vs. limit=15.0 2023-09-30 14:24:37,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 14:24:37,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:24:42,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 14:24:45,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:24:45,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:24:47,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:24:52,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 14:24:57,481 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.20 vs. limit=6.0 2023-09-30 14:24:58,136 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:25:00,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 14:25:00,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:02,224 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=740080.0, ans=0.1 2023-09-30 14:25:04,105 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.whiten, num_groups=1, num_channels=384, metric=4.43 vs. limit=12.0 2023-09-30 14:25:05,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,004 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:25:05,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:06,520 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 14:25:06,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 14:25:11,554 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.max_positive, batch_count=740080.0, ans=0.95 2023-09-30 14:25:12,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 14:25:14,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:18,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:20,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:25:20,652 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 14:25:20,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:23,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:25:23,919 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=740146.6666666666, ans=0.2 2023-09-30 14:25:25,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:25:26,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 14:25:26,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 14:25:28,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:25:28,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:25:29,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:25:29,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:25:29,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 14:25:33,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 14:25:36,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:25:38,627 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:25:39,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 14:25:40,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:25:40,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:25:41,746 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:25:43,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:25:43,685 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=740213.3333333334, ans=0.125 2023-09-30 14:25:44,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:25:46,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:46,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 14:25:48,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 14:25:49,275 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=9.45 vs. limit=15.0 2023-09-30 14:25:50,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 14:25:52,014 INFO [train.py:1039] (3/4) Epoch 21, batch 4800, loss[loss=0.1982, simple_loss=0.2623, pruned_loss=0.06705, over 23417.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.251, pruned_loss=0.0493, over 4734780.49 frames. ], batch size: 285, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:25:53,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:25:53,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:25:55,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 14:26:00,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:01,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:06,415 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:26:08,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:08,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:08,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 14:26:11,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:26:11,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:26:11,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:26:15,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:15,383 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=740346.6666666666, ans=0.125 2023-09-30 14:26:16,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:16,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:26:18,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:18,930 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 14:26:18,961 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:20,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:20,870 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=740346.6666666666, ans=0.125 2023-09-30 14:26:24,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:26:24,281 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass.scale_min, batch_count=740413.3333333334, ans=0.2 2023-09-30 14:26:27,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:26:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:26:30,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:26:31,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:31,837 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=740413.3333333334, ans=0.2 2023-09-30 14:26:32,879 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.509e+02 1.899e+02 2.131e+02 2.497e+02 3.417e+02, threshold=4.262e+02, percent-clipped=0.0 2023-09-30 14:26:34,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 14:26:34,533 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 14:26:35,956 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:26:35,990 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:26:36,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:26:36,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:36,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:26:37,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:26:39,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:26:42,080 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=2.99 vs. limit=15.0 2023-09-30 14:26:43,044 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:26:47,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:48,937 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:26:55,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0 from training. Duration: 0.94 2023-09-30 14:26:55,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_92-76040-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:26:55,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_227-102052-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:26:55,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:26:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_352-143891-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:00,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_65-106602-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:27:02,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:27:02,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_465-268827-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:02,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:27:03,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:27:05,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:27:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_276-112684-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:08,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_358-411-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:08,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_106-21199-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:27:10,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0 from training. Duration: 0.94 2023-09-30 14:27:12,857 INFO [train.py:1039] (3/4) Epoch 21, batch 4850, loss[loss=0.1795, simple_loss=0.2656, pruned_loss=0.0467, over 24431.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2512, pruned_loss=0.04891, over 4738832.62 frames. ], batch size: 77, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:27:12,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0 from training. Duration: 0.69 2023-09-30 14:27:13,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_363-194161-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_586-321321-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:27:13,126 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68900_210_2087_1_1535713157_3708529_164-292661-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:13,128 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_114-144878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:16,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_96-341874-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:27:26,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0 from training. Duration: 0.51 2023-09-30 14:27:26,392 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer1.prob, batch_count=740613.3333333334, ans=0.125 2023-09-30 14:27:27,651 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_121-320092-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:28,215 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=10.84 vs. limit=15.0 2023-09-30 14:27:32,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:33,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:27:34,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_389-200461-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:27:38,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_28-156553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:27:39,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:27:41,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:27:41,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0 from training. Duration: 0.98 2023-09-30 14:27:46,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_34-39905-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:27:47,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:27:47,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:27:49,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:27:49,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0 from training. Duration: 0.76 2023-09-30 14:27:52,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:27:52,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_10-297919-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_418-3055-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:27:56,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0 from training. Duration: 0.8 2023-09-30 14:27:56,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0 from training. Duration: 0.67 2023-09-30 14:27:57,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:27:59,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=740746.6666666666, ans=0.1 2023-09-30 14:28:06,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_188-55162-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:28:07,534 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0 from training. Duration: 0.89 2023-09-30 14:28:07,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_465-81427-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:28:07,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:28:09,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:28:13,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0 from training. Duration: 0.99 2023-09-30 14:28:13,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_330-240060-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:14,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0 from training. Duration: 0.82 2023-09-30 14:28:14,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_150-248939-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:16,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_556-206615-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:16,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0 from training. Duration: 0.73 2023-09-30 14:28:17,340 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=144, metric=9.11 vs. limit=10.0 2023-09-30 14:28:20,353 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.18 vs. limit=22.5 2023-09-30 14:28:27,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_66-236571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:28:32,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:28:32,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_231-78630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:32,546 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=740880.0, ans=0.125 2023-09-30 14:28:35,201 INFO [train.py:1039] (3/4) Epoch 21, batch 4900, loss[loss=0.164, simple_loss=0.2465, pruned_loss=0.04071, over 24481.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2509, pruned_loss=0.04889, over 4736828.33 frames. ], batch size: 63, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:28:37,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0 from training. Duration: 0.62 2023-09-30 14:28:37,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:28:42,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_255-216698-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:28:44,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_137-334143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:44,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:28:47,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0 from training. Duration: 0.99 2023-09-30 14:28:52,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_58-29064-0 from training. Duration: 0.99 2023-09-30 14:28:56,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0 from training. Duration: 0.93 2023-09-30 14:28:56,599 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=741013.3333333334, ans=0.0 2023-09-30 14:28:57,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0 from training. Duration: 0.91 2023-09-30 14:28:57,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:28:57,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_464-183853-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:28:57,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_385-210308-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:28:57,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_354-261865-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:28:57,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:28:59,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0 from training. Duration: 0.93 2023-09-30 14:29:03,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0 from training. Duration: 0.73 2023-09-30 14:29:05,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:29:06,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:29:08,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:29:10,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:29:12,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_290-117517-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:12,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_35-337140-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:12,330 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0 from training. Duration: 0.689 2023-09-30 14:29:15,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:29:16,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_185-14438-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:29:16,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0 from training. Duration: 0.81 2023-09-30 14:29:16,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0 from training. Duration: 0.8 2023-09-30 14:29:17,285 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.562e+02 1.865e+02 2.105e+02 2.513e+02 4.105e+02, threshold=4.210e+02, percent-clipped=0.0 2023-09-30 14:29:17,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=741080.0, ans=0.0 2023-09-30 14:29:19,827 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=8.42 vs. limit=15.0 2023-09-30 14:29:20,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0 from training. Duration: 0.93 2023-09-30 14:29:22,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:29:25,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:29:25,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:29:27,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_370-230614-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:29:27,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:29:27,814 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:29:27,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0 from training. Duration: 0.96 2023-09-30 14:29:30,710 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4399_210_1874_1_1526705485_2400757_220-47974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:32,317 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:29:33,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_102-57645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:29:35,904 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=741146.6666666666, ans=0.125 2023-09-30 14:29:37,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0 from training. Duration: 0.74 2023-09-30 14:29:38,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:29:38,646 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp0.9 from training. Duration: 0.488875 2023-09-30 14:29:40,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0 from training. Duration: 0.79 2023-09-30 14:29:40,395 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:29:48,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_438-340023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:29:50,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:29:51,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0 from training. Duration: 0.57 2023-09-30 14:29:52,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:29:52,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:29:52,759 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.61 vs. limit=12.0 2023-09-30 14:29:53,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_538-218448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:29:56,876 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=741280.0, ans=0.1 2023-09-30 14:29:57,029 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer1.prob, batch_count=741280.0, ans=0.125 2023-09-30 14:29:58,070 INFO [train.py:1039] (3/4) Epoch 21, batch 4950, loss[loss=0.1625, simple_loss=0.2404, pruned_loss=0.04233, over 19061.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2494, pruned_loss=0.04895, over 4717430.48 frames. ], batch size: 42, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:29:58,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_60-192970-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:29:58,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:30:00,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_643-37412-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:30:00,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp1.1 from training. Duration: 0.42725 2023-09-30 14:30:01,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:30:05,408 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_358-154143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:05,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 14:30:08,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0 from training. Duration: 0.78 2023-09-30 14:30:08,553 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0 from training. Duration: 0.8 2023-09-30 14:30:08,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:30:10,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0 from training. Duration: 0.78 2023-09-30 14:30:10,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_172-239045-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:10,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:30:12,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:30:12,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_492-95008-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:13,630 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_119-90326-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:15,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:30:16,641 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:30:18,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_50-242750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:30:20,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_358-98435-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:20,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_399-229778-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:30:22,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=741346.6666666666, ans=0.125 2023-09-30 14:30:23,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:30:28,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_196-221014-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:30,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:30:31,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_213-127634-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:33,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_921-231678-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:33,672 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=741413.3333333334, ans=0.1 2023-09-30 14:30:35,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:30:35,477 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0 from training. Duration: 0.88 2023-09-30 14:30:35,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0 from training. Duration: 0.85 2023-09-30 14:30:39,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_314-83429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:42,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:30:42,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:30:43,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:30:43,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_818-11907-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:30:45,270 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:30:48,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_6-279195-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:30:49,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:30:51,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:30:53,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_342-3879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:30:55,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_584-65630-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:30:55,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0 from training. Duration: 0.98 2023-09-30 14:30:55,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:30:55,628 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass_mid.scale_min, batch_count=741480.0, ans=0.2 2023-09-30 14:30:57,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:31:00,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_200-333643-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:31:03,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:31:03,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:31:03,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_139-346291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:03,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:31:05,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:31:06,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_216-48707-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:31:08,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:31:08,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_583-92842-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:31:10,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0 from training. Duration: 0.97 2023-09-30 14:31:15,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=28.80 vs. limit=22.5 2023-09-30 14:31:16,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_349-116173-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:19,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0 from training. Duration: 0.78 2023-09-30 14:31:19,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:31:22,332 INFO [train.py:1039] (3/4) Epoch 21, batch 5000, loss[loss=0.1707, simple_loss=0.2224, pruned_loss=0.05952, over 19129.00 frames. ], tot_loss[loss=0.1728, simple_loss=0.2487, pruned_loss=0.04845, over 4707285.33 frames. ], batch size: 388, lr: 4.85e-03, grad_scale: 16.0 2023-09-30 14:31:25,814 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=741613.3333333334, ans=0.2 2023-09-30 14:31:27,604 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_191-159384-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:31:27,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:29,843 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0 from training. Duration: 0.61 2023-09-30 14:31:31,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0 from training. Duration: 0.95 2023-09-30 14:31:32,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_925-191342-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:31:36,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0 from training. Duration: 0.77 2023-09-30 14:31:36,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:31:37,422 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:31:37,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0 from training. Duration: 0.94 2023-09-30 14:31:39,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_431-227273-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:39,132 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:31:40,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0 from training. Duration: 0.61 2023-09-30 14:31:40,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_746-164782-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:40,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_596-21066-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:31:43,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0 from training. Duration: 0.95 2023-09-30 14:31:45,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0 from training. Duration: 0.87 2023-09-30 14:31:45,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:31:46,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0 from training. Duration: 0.97 2023-09-30 14:31:46,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:31:48,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_586-297059-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:49,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:31:49,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0 from training. Duration: 0.99 2023-09-30 14:31:49,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0 from training. Duration: 0.6 2023-09-30 14:31:52,050 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=18.90 vs. limit=22.5 2023-09-30 14:31:52,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0 from training. Duration: 0.54 2023-09-30 14:31:52,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_57-309660-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:31:54,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_73-155814-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:54,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0 from training. Duration: 0.86 2023-09-30 14:31:54,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:31:55,862 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_315-212794-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:31:57,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_314-122795-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:31:57,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=741746.6666666666, ans=0.0 2023-09-30 14:31:58,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:32:00,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0 from training. Duration: 0.79 2023-09-30 14:32:01,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_255-228344-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:32:02,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:32:03,260 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.520e+02 1.857e+02 2.049e+02 2.517e+02 4.196e+02, threshold=4.099e+02, percent-clipped=0.0 2023-09-30 14:32:06,992 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0056-334889 from training. Duration: 0.768 2023-09-30 14:32:10,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:32:11,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_355-236205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:32:11,620 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_503-246893-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:14,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0 from training. Duration: 0.98 2023-09-30 14:32:16,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_205-311930-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:32:16,101 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_185-281218-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:16,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_191-332415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:32:19,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp1.1 from training. Duration: 0.336375 2023-09-30 14:32:19,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:21,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:32:23,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_786-164007-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:32:29,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0 from training. Duration: 0.89 2023-09-30 14:32:32,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_383-339075-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:35,900 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=741880.0, ans=0.5 2023-09-30 14:32:43,731 INFO [train.py:1039] (3/4) Epoch 21, batch 5050, loss[loss=0.1753, simple_loss=0.2473, pruned_loss=0.05163, over 23745.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2492, pruned_loss=0.0484, over 4715591.84 frames. ], batch size: 212, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:32:43,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_834-322225-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:32:44,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer2.prob, batch_count=741946.6666666666, ans=0.125 2023-09-30 14:32:45,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_56-290559-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:45,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:32:45,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_56-13561-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:45,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:32:47,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:32:47,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_127-207553-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_494-203521-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:32:51,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0 from training. Duration: 0.99 2023-09-30 14:32:52,193 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=741946.6666666666, ans=0.125 2023-09-30 14:32:53,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:32:57,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_742-154017-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:32:58,003 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=741946.6666666666, ans=0.125 2023-09-30 14:32:59,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:32:59,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0 from training. Duration: 0.94 2023-09-30 14:33:00,624 WARNING [train.py:1197] (3/4) Exclude cut with ID _8393_210_6702_1_1528505860249_3457328_453-176500-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:02,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_102-179528-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:33:03,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:33:03,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:33:05,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:33:13,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.scale_min, batch_count=742013.3333333334, ans=0.2 2023-09-30 14:33:15,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0 from training. Duration: 0.91 2023-09-30 14:33:16,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:33:17,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0 from training. Duration: 0.68 2023-09-30 14:33:20,026 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:20,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_47-4814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:20,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _43541_210_10626_1_1533796233418_3635440_40-247993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:33:20,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:33:20,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0 from training. Duration: 0.94 2023-09-30 14:33:21,833 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0 from training. Duration: 0.86 2023-09-30 14:33:23,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_683-284578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:23,711 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.prob, batch_count=742080.0, ans=0.125 2023-09-30 14:33:26,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:29,302 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_556-212082-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:33:29,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0 from training. Duration: 0.87 2023-09-30 14:33:33,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_333-159440-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:36,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0 from training. Duration: 0.99 2023-09-30 14:33:37,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:33:37,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:33:37,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_743-302085-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:33:37,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:33:39,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:33:42,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_421-19547-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:33:42,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_714-8668-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:42,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_49-101072-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:33:42,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:33:43,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0 from training. Duration: 0.87 2023-09-30 14:33:46,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_111-112362-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:33:47,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:33:51,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=742213.3333333334, ans=0.125 2023-09-30 14:33:52,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_98-255710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:33:52,630 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0003-515609 from training. Duration: 0.896 2023-09-30 14:33:52,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:33:54,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _26750_210_5665_1_1532226678442_4063230_7-346885-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:33:54,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_357-227078-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:33:54,271 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0119-520904 from training. Duration: 0.896 2023-09-30 14:33:56,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:33:56,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0 from training. Duration: 0.94 2023-09-30 14:33:56,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_158-21166-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:00,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_207-332194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:01,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_324-211031-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:02,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0 from training. Duration: 0.84 2023-09-30 14:34:04,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0 from training. Duration: 0.54 2023-09-30 14:34:05,805 INFO [train.py:1039] (3/4) Epoch 21, batch 5100, loss[loss=0.1848, simple_loss=0.2559, pruned_loss=0.05686, over 23664.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2507, pruned_loss=0.04918, over 4719735.90 frames. ], batch size: 232, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:34:07,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_649-79538-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:07,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_343-115667-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:07,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:34:10,603 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0003-549749 from training. Duration: 0.768 2023-09-30 14:34:12,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742280.0, ans=0.1 2023-09-30 14:34:13,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:34:14,136 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:34:15,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0 from training. Duration: 0.85 2023-09-30 14:34:15,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0 from training. Duration: 0.73 2023-09-30 14:34:15,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_46-57119-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:19,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_546-119312-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:34:22,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_468-306939-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:34:22,175 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_22-138274-0 from training. Duration: 0.96 2023-09-30 14:34:23,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0 from training. Duration: 0.76 2023-09-30 14:34:23,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=742346.6666666666, ans=0.1 2023-09-30 14:34:27,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_64-133721-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:34:28,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:34:31,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_704-101963-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:34:35,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0 from training. Duration: 0.94 2023-09-30 14:34:36,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_679-190385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:39,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_298-81459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:34:39,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp0.9 from training. Duration: 0.588875 2023-09-30 14:34:40,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_196-283734-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:42,308 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=742413.3333333334, ans=0.125 2023-09-30 14:34:43,597 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_294-198554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:43,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0 from training. Duration: 0.81 2023-09-30 14:34:45,218 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0113-610139 from training. Duration: 0.896 2023-09-30 14:34:45,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_638-171906-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:34:46,538 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.489e+02 2.007e+02 2.200e+02 2.519e+02 3.504e+02, threshold=4.400e+02, percent-clipped=0.0 2023-09-30 14:34:46,661 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0 from training. Duration: 0.67 2023-09-30 14:34:46,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0 from training. Duration: 0.97 2023-09-30 14:34:47,030 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:34:49,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_109-282568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:34:54,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.69 vs. limit=22.5 2023-09-30 14:35:01,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_121-45934-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:03,260 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.layerdrop_rate, batch_count=742480.0, ans=0.015 2023-09-30 14:35:04,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0 from training. Duration: 0.5 2023-09-30 14:35:04,757 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0192-640319 from training. Duration: 0.512 2023-09-30 14:35:04,770 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9310W0003-640649 from training. Duration: 0.896 2023-09-30 14:35:06,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0 from training. Duration: 0.81 2023-09-30 14:35:06,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _40380_210_9059_1_1533380594759_4187380_6-33508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:35:07,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0 from training. Duration: 0.89 2023-09-30 14:35:08,712 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.41 vs. limit=22.5 2023-09-30 14:35:13,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0 from training. Duration: 0.71 2023-09-30 14:35:15,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 14:35:16,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:35:19,945 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0 from training. Duration: 0.87 2023-09-30 14:35:21,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:35:22,811 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0 from training. Duration: 0.86 2023-09-30 14:35:28,129 INFO [train.py:1039] (3/4) Epoch 21, batch 5150, loss[loss=0.1644, simple_loss=0.2476, pruned_loss=0.04062, over 24405.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2515, pruned_loss=0.04973, over 4725885.38 frames. ], batch size: 63, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:35:28,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:35:28,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_142-216831-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:35:28,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:35:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:35:31,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:35:32,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_98-311335-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:35:33,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0 from training. Duration: 0.89 2023-09-30 14:35:33,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0 from training. Duration: 0.73 2023-09-30 14:35:34,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0 from training. Duration: 0.74 2023-09-30 14:35:34,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:35:35,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0 from training. Duration: 0.97 2023-09-30 14:35:35,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19949_210_2161_1_1531706337_928079_8-160956-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:35,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:35:38,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_583-124650-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:39,606 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_164-182755-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:35:44,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:35:44,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0 from training. Duration: 0.8 2023-09-30 14:35:48,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_487-182647-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:35:48,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:35:51,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp0.9 from training. Duration: 0.711125 2023-09-30 14:35:51,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_504-72513-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:35:51,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_324-265233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:35:51,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:35:51,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:35:51,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0 from training. Duration: 0.84 2023-09-30 14:35:54,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:35:54,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:35:56,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=742680.0, ans=0.0 2023-09-30 14:35:57,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:35:59,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0 from training. Duration: 0.79 2023-09-30 14:35:59,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:35:59,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.out_combiner.scale_min, batch_count=742746.6666666666, ans=0.2 2023-09-30 14:36:06,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:36:06,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0 from training. Duration: 0.91 2023-09-30 14:36:11,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_125-158157-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:36:20,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_420-113180-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:20,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_236-146773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:24,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_175-35012-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:24,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_99-275843-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:26,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0 from training. Duration: 0.53 2023-09-30 14:36:32,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_234-65934-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:36:33,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:36:34,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:36:35,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_420-236465-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:36:37,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _46681_210_9642_1_1533893005571_7603359_333-4042-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:36:39,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0 from training. Duration: 0.55 2023-09-30 14:36:45,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_426-92599-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:36:45,655 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:36:47,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_140-345530-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:36:47,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:36:49,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:36:49,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:36:49,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_404-249994-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:36:50,692 INFO [train.py:1039] (3/4) Epoch 21, batch 5200, loss[loss=0.1506, simple_loss=0.2318, pruned_loss=0.03467, over 24628.00 frames. ], tot_loss[loss=0.1758, simple_loss=0.2518, pruned_loss=0.04985, over 4728374.11 frames. ], batch size: 60, lr: 4.84e-03, grad_scale: 32.0 2023-09-30 14:36:50,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_222-108810-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:36:54,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:36:56,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:36:57,298 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.whiten, num_groups=1, num_channels=384, metric=5.65 vs. limit=12.0 2023-09-30 14:36:59,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_43-192945-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:02,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0 from training. Duration: 0.71 2023-09-30 14:37:03,144 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=742946.6666666666, ans=0.1 2023-09-30 14:37:04,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:37:05,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_190-280227-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:07,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_398-56897-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:08,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:37:08,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_582-18084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:10,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0 from training. Duration: 0.49 2023-09-30 14:37:15,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:37:15,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_55-250624-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:18,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0 from training. Duration: 0.55 2023-09-30 14:37:20,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.prob, batch_count=743013.3333333334, ans=0.125 2023-09-30 14:37:21,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:37:21,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:37:23,719 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0 from training. Duration: 0.55 2023-09-30 14:37:23,801 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0 from training. Duration: 0.91 2023-09-30 14:37:25,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0 from training. Duration: 0.98 2023-09-30 14:37:27,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_201-160779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:37:27,514 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0406W0226-885434 from training. Duration: 0.997 2023-09-30 14:37:27,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_353-328201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:37:29,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_572-96029-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:29,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:37:30,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0 from training. Duration: 0.93 2023-09-30 14:37:31,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_685-90183-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:37:31,318 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module1.balancer1.prob, batch_count=743080.0, ans=0.125 2023-09-30 14:37:32,278 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.588e+02 1.872e+02 2.039e+02 2.379e+02 3.474e+02, threshold=4.079e+02, percent-clipped=0.0 2023-09-30 14:37:33,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_41-22245-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:35,660 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0 from training. Duration: 0.8 2023-09-30 14:37:35,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0 from training. Duration: 0.7 2023-09-30 14:37:37,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0 from training. Duration: 0.82 2023-09-30 14:37:39,470 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward1.out_whiten.whitening_limit, batch_count=743146.6666666666, ans=15.0 2023-09-30 14:37:43,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0 from training. Duration: 0.37 2023-09-30 14:37:43,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:37:49,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:37:49,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_354-296765-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:37:50,116 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:37:51,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0 from training. Duration: 0.57 2023-09-30 14:37:51,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1466-248056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:37:52,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp0.9 from training. Duration: 0.52225 2023-09-30 14:37:52,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_747-129941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:37:52,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:37:56,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:37:57,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:38:03,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _39204_210_4220_1_1533880547368_3191120_346-180306-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:38:04,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_641-158017-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:04,764 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_274-42137-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:08,264 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=743213.3333333334, ans=0.1 2023-09-30 14:38:09,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_87-327819-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:09,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0 from training. Duration: 0.47 2023-09-30 14:38:11,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:38:11,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_177-227063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:38:12,504 INFO [train.py:1039] (3/4) Epoch 21, batch 5250, loss[loss=0.1596, simple_loss=0.2313, pruned_loss=0.04393, over 23523.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2512, pruned_loss=0.04991, over 4721767.28 frames. ], batch size: 134, lr: 4.84e-03, grad_scale: 16.0 2023-09-30 14:38:12,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_510-45444-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:14,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:38:14,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:38:17,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_367-148411-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:38:19,455 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.bypass_mid.scale_min, batch_count=743280.0, ans=0.2 2023-09-30 14:38:20,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_389-144525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:20,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_158-238427-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:38:22,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:38:26,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_188-71831-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:38:28,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:38:28,888 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.skip_rate, batch_count=743346.6666666666, ans=0.09899494936611666 2023-09-30 14:38:30,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_469-49081-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:38:32,381 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:38:33,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:38:35,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0 from training. Duration: 0.95 2023-09-30 14:38:35,180 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_236-223652-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:38:37,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_220-163998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:38:51,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=743413.3333333334, ans=0.1 2023-09-30 14:38:51,764 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.ff2_skip_rate, batch_count=743413.3333333334, ans=0.0 2023-09-30 14:38:52,293 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=17.91 vs. limit=22.5 2023-09-30 14:38:57,557 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=5.33 vs. limit=6.0 2023-09-30 14:39:19,801 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.89 vs. limit=15.0 2023-09-30 14:39:23,183 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=743546.6666666666, ans=0.125 2023-09-30 14:39:23,615 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.00 vs. limit=22.5 2023-09-30 14:39:25,644 INFO [train.py:1039] (3/4) Epoch 21, batch 5300, loss[loss=0.1846, simple_loss=0.2566, pruned_loss=0.05633, over 23302.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2495, pruned_loss=0.04951, over 4696461.72 frames. ], batch size: 119, lr: 4.84e-03, grad_scale: 8.0 2023-09-30 14:39:30,803 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.71 vs. limit=15.0 2023-09-30 14:39:40,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:39:40,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0 from training. Duration: 0.9 2023-09-30 14:39:41,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0 from training. Duration: 0.8 2023-09-30 14:39:41,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_515-139111-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:41,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:41,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _70447_210_12392_1_1535972499300_3160420_123-312838-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:41,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_70-63596-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:41,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_225-188509-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:39:41,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_584-180532-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:41,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:39:42,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:39:42,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0 from training. Duration: 0.9 2023-09-30 14:39:42,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0 from training. Duration: 0.74 2023-09-30 14:39:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0 from training. Duration: 0.99 2023-09-30 14:39:42,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp0.9 from training. Duration: 0.82225 2023-09-30 14:39:42,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0 from training. Duration: 0.83 2023-09-30 14:39:43,057 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0 from training. Duration: 0.51 2023-09-30 14:39:43,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_899-80223-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:44,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_210-234949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:44,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_768-278176-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:44,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_315-128283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:44,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_330-323941-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:39:45,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:45,050 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_433-10563-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:39:45,114 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:45,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _14623_210_1925_1_1530773354962_3632609_157-218771-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:39:45,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1368-78165-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:39:45,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:39:45,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_291-138812-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:45,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:39:46,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0 from training. Duration: 0.97 2023-09-30 14:39:46,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_286-222073-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:39:46,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:39:46,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0 from training. Duration: 0.86 2023-09-30 14:39:46,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0 from training. Duration: 0.63 2023-09-30 14:39:47,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:39:47,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_154-74182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:39:47,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0 from training. Duration: 0.43 2023-09-30 14:39:47,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0 from training. Duration: 0.71 2023-09-30 14:39:47,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:48,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:39:48,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:39:48,976 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0287W0023-142350 from training. Duration: 0.956 2023-09-30 14:39:49,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0 from training. Duration: 0.96 2023-09-30 14:39:49,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:39:49,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_185-17516-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:39:49,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0 from training. Duration: 0.83 2023-09-30 14:39:49,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0 from training. Duration: 0.96 2023-09-30 14:39:49,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_116-21034-0 from training. Duration: 0.96 2023-09-30 14:39:49,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:39:58,331 INFO [train.py:1039] (3/4) Epoch 22, batch 0, loss[loss=0.1757, simple_loss=0.2562, pruned_loss=0.04758, over 23233.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2562, pruned_loss=0.04758, over 23233.00 frames. ], batch size: 93, lr: 4.73e-03, grad_scale: 16.0 2023-09-30 14:39:58,331 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 14:40:11,504 INFO [train.py:1071] (3/4) Epoch 22, validation: loss=0.3042, simple_loss=0.2741, pruned_loss=0.1671, over 1125622.00 frames. 2023-09-30 14:40:11,505 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 14:40:13,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0 from training. Duration: 0.54 2023-09-30 14:40:15,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_158-289345-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:40:16,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:40:21,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_788-278281-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:21,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:40:21,665 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer_ff3.min_abs, batch_count=743693.3333333334, ans=0.2 2023-09-30 14:40:22,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_267-139775-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:23,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0 from training. Duration: 0.68 2023-09-30 14:40:25,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0 from training. Duration: 0.94 2023-09-30 14:40:27,048 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=743760.0, ans=0.125 2023-09-30 14:40:28,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _16747_210_5986_1_1531811544733_2749109_87-349587-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:28,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_505-346452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_13-192107-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:40:31,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_430-307666-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:32,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:40:32,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:35,704 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.460e+02 1.911e+02 2.208e+02 2.678e+02 6.793e+02, threshold=4.416e+02, percent-clipped=10.0 2023-09-30 14:40:35,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0 from training. Duration: 0.79 2023-09-30 14:40:37,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:40:47,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:40:47,047 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_326-48066-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:40:48,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0 from training. Duration: 0.98 2023-09-30 14:40:48,698 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:40:52,076 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=256, metric=11.03 vs. limit=22.5 2023-09-30 14:40:54,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:40:54,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:40:57,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_88-225232-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:01,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_205-265518-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:41:06,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_271-253734-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:06,370 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:41:07,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=743893.3333333334, ans=0.0 2023-09-30 14:41:10,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0 from training. Duration: 0.86 2023-09-30 14:41:13,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0 from training. Duration: 0.86 2023-09-30 14:41:15,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_38-348116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:15,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _66763_210_17301_1_1535788667617_241750_21-328480-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:15,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:41:16,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_592-101075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:41:19,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0 from training. Duration: 0.85 2023-09-30 14:41:21,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_166-319773-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:23,931 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=5.86 vs. limit=15.0 2023-09-30 14:41:24,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_606-224984-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:41:29,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:41:31,614 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0620W0113-306765 from training. Duration: 0.768 2023-09-30 14:41:33,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_410-287857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:41:35,322 INFO [train.py:1039] (3/4) Epoch 22, batch 50, loss[loss=0.1797, simple_loss=0.2531, pruned_loss=0.05314, over 23466.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2501, pruned_loss=0.0483, over 1063141.13 frames. ], batch size: 134, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:41:38,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_78-95260-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:40,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _58059_210_5854_1_1534901471149_3164808_6-73080-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:41:40,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0 from training. Duration: 0.62 2023-09-30 14:41:40,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 14:41:40,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_620-144779-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:41:43,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_561-288575-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_22657_210_6890_1_1532086002_1637216_122-188128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:41:45,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward3.hidden_balancer.prob, batch_count=744026.6666666666, ans=0.125 2023-09-30 14:41:46,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_604-86607-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:41:51,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0 from training. Duration: 0.68 2023-09-30 14:41:51,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_10987_210_4163_1_1529760555_4123902_302-87926-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:41:59,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:42:00,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0 from training. Duration: 0.83 2023-09-30 14:42:02,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0 from training. Duration: 0.93 2023-09-30 14:42:03,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:42:06,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:06,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_623-346820-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:06,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_454-266903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:08,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:42:09,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 14:42:09,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_957-138882-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:42:16,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_219-38004-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:18,567 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=744160.0, ans=0.0 2023-09-30 14:42:19,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:19,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:42:19,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0 from training. Duration: 0.96 2023-09-30 14:42:22,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:42:24,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:42:24,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0 from training. Duration: 0.98 2023-09-30 14:42:24,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_219-106541-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:25,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0 from training. Duration: 0.97 2023-09-30 14:42:35,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1051-182606-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:42:35,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_603-4396-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:42:37,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_159-144367-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:39,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:42:39,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:41,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0 from training. Duration: 0.71 2023-09-30 14:42:41,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0 from training. Duration: 0.56 2023-09-30 14:42:41,975 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:42:43,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_80-107757-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:42:43,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:42:44,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1033-168593-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:42:46,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_475-30711-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:42:47,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0 from training. Duration: 0.49 2023-09-30 14:42:47,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0 from training. Duration: 0.91 2023-09-30 14:42:49,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp0.9 from training. Duration: 0.47775 2023-09-30 14:42:52,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_550-170116-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:52,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:42:52,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0 from training. Duration: 0.59 2023-09-30 14:42:53,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0 from training. Duration: 0.74 2023-09-30 14:42:54,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_681-195827-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:42:54,636 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:42:56,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:42:57,524 INFO [train.py:1039] (3/4) Epoch 22, batch 100, loss[loss=0.1705, simple_loss=0.2474, pruned_loss=0.04681, over 23231.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2523, pruned_loss=0.04939, over 1877898.13 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:42:57,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_287-292174-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:43:00,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_200-292228-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:43:00,922 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744360.0, ans=0.125 2023-09-30 14:43:03,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_291-194999-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:43:07,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:07,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0 from training. Duration: 0.83 2023-09-30 14:43:07,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_322-115153-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:43:12,737 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:43:12,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:12,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:43:12,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:43:12,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:43:14,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0 from training. Duration: 0.73 2023-09-30 14:43:16,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:43:17,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_204-137133-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:17,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_5-46496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:17,598 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:43:21,021 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=744426.6666666666, ans=0.125 2023-09-30 14:43:22,472 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.832e+02 2.020e+02 2.279e+02 4.259e+02, threshold=4.040e+02, percent-clipped=0.0 2023-09-30 14:43:22,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0 from training. Duration: 0.88 2023-09-30 14:43:22,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_110-79134-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:24,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_421-37641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:43:25,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:43:27,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:43:30,934 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=5.29 vs. limit=15.0 2023-09-30 14:43:31,637 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0120-509520 from training. Duration: 0.768 2023-09-30 14:43:31,674 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0433-509820 from training. Duration: 0.896 2023-09-30 14:43:32,512 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.whiten, num_groups=1, num_channels=384, metric=3.64 vs. limit=12.0 2023-09-30 14:43:33,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_163-334586-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:43:33,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:43:36,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:43:38,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_576-51198-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:43:38,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_306-216871-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:40,586 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=744493.3333333334, ans=0.125 2023-09-30 14:43:45,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _20684_210_6917_1_1531567751646_4203930_463-119982-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:43:47,250 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0012-536850 from training. Duration: 0.64 2023-09-30 14:43:48,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:43:50,486 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=744560.0, ans=0.0 2023-09-30 14:43:53,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:43:53,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:43:56,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_333-155030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:00,016 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_588-257177-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:00,373 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.5.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:44:00,596 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=4.10 vs. limit=10.0 2023-09-30 14:44:03,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:04,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:44:04,976 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=744626.6666666666, ans=0.1 2023-09-30 14:44:06,426 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=744626.6666666666, ans=0.125 2023-09-30 14:44:06,957 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=6.16 vs. limit=15.0 2023-09-30 14:44:07,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_19-143553-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:07,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_582-130735-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:10,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_943-201356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:10,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_422-229276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:44:10,713 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_73-198717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:12,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0 from training. Duration: 0.91 2023-09-30 14:44:12,183 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0114-579000 from training. Duration: 0.896 2023-09-30 14:44:12,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_210-132675-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:14,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:44:14,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_615-322035-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:14,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_119-315767-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:14,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 14:44:14,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:44:14,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:44:16,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50428_210_5724_1_1534247532_4126418_464-18322-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:16,871 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=744626.6666666666, ans=0.0 2023-09-30 14:44:18,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_466-131402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:18,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1259-132768-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:20,006 INFO [train.py:1039] (3/4) Epoch 22, batch 150, loss[loss=0.177, simple_loss=0.2433, pruned_loss=0.05532, over 23696.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2519, pruned_loss=0.04923, over 2510814.48 frames. ], batch size: 232, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:44:20,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:44:21,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_59-133290-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:44:24,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_223-49525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:44:27,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:44:27,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_52-221627-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:44:27,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_296-115766-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:29,427 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=744693.3333333334, ans=0.125 2023-09-30 14:44:30,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _14624_210_1925_1_1530858690657_3642219_100-19871-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:44:30,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_50-293675-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:34,494 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:44:35,987 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_388-211626-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:40,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0 from training. Duration: 0.65 2023-09-30 14:44:40,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0 from training. Duration: 0.52 2023-09-30 14:44:40,603 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0 from training. Duration: 0.63 2023-09-30 14:44:43,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467391_239068_13-274633-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:44:43,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:44:45,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:44:47,265 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_285-219531-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:44:47,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_22-37858-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:47,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_200-88444-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:47,442 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_77-279042-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:44:48,931 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0193-640320 from training. Duration: 0.896 2023-09-30 14:44:51,202 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=512, metric=16.93 vs. limit=22.5 2023-09-30 14:44:51,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_516-324055-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:44:57,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_10-261240-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:00,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 14:45:01,736 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0 from training. Duration: 0.99 2023-09-30 14:45:06,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:45:06,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_934-325498-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:45:06,856 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module1.whiten, num_groups=1, num_channels=512, metric=5.03 vs. limit=15.0 2023-09-30 14:45:08,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:09,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:45:11,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_345-49721-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:45:12,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_440-141811-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:45:12,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_394-212120-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:12,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0 from training. Duration: 0.74 2023-09-30 14:45:17,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_191-346343-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:18,448 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.43 vs. limit=10.0 2023-09-30 14:45:19,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_351-347974-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:19,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_48-269402-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:45:19,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:45:22,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_158-49004-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:23,377 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass_mid.scale_min, batch_count=744960.0, ans=0.2 2023-09-30 14:45:24,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 14:45:26,121 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0_sp1.1 from training. Duration: 0.836375 2023-09-30 14:45:28,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:45:29,796 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_158-114960-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:31,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:45:32,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0 from training. Duration: 0.69 2023-09-30 14:45:32,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:45:32,881 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0079W0443-717795 from training. Duration: 0.541 2023-09-30 14:45:34,698 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:45:37,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _34734_210_4525_1_1533365977542_4448720_766-297928-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:40,457 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=5.03 vs. limit=15.0 2023-09-30 14:45:40,981 INFO [train.py:1039] (3/4) Epoch 22, batch 200, loss[loss=0.1808, simple_loss=0.264, pruned_loss=0.04882, over 24356.00 frames. ], tot_loss[loss=0.1772, simple_loss=0.2535, pruned_loss=0.05048, over 3001150.70 frames. ], batch size: 77, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:45:41,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _51471_210_1889_1_1534399391390_3094250_310-337793-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:45:41,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:45:45,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0 from training. Duration: 0.92 2023-09-30 14:45:45,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_67-103444-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:45:47,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_311-247578-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:48,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0 from training. Duration: 0.95 2023-09-30 14:45:50,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp0.9 from training. Duration: 0.688875 2023-09-30 14:45:51,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_299-247059-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:45:53,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _64002_210_9570_1_1535198605157_38979_2-240845-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:45:58,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:45:58,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_256-156611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:45:58,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_624-203729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:05,544 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.496e+02 1.853e+02 2.014e+02 2.371e+02 3.492e+02, threshold=4.028e+02, percent-clipped=0.0 2023-09-30 14:46:18,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:46:19,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_380-167520-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:46:20,123 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 14:46:21,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_519-174300-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:46:23,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 14:46:23,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:46:23,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_348-257723-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:24,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:46:26,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_17-86060-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:26,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_228-184657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:46:28,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_198-334870-0 from training. Duration: 0.92 2023-09-30 14:46:28,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 14:46:28,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_14-139186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:46:31,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=745226.6666666666, ans=0.125 2023-09-30 14:46:31,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=745226.6666666666, ans=0.1 2023-09-30 14:46:35,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:46:41,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_84-10310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:46:47,870 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_79-273076-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:47,960 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:46:57,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_170-92788-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:46:58,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.50 vs. limit=6.0 2023-09-30 14:47:00,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0 from training. Duration: 0.59 2023-09-30 14:47:00,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_297-12384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:00,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:47:02,314 INFO [train.py:1039] (3/4) Epoch 22, batch 250, loss[loss=0.1774, simple_loss=0.2533, pruned_loss=0.05077, over 23423.00 frames. ], tot_loss[loss=0.1765, simple_loss=0.2529, pruned_loss=0.05003, over 3385420.68 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:47:02,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_117-8859-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:02,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 14:47:04,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0 from training. Duration: 0.99 2023-09-30 14:47:04,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_118-311357-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:47:04,170 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0377W0003-870225 from training. Duration: 0.852 2023-09-30 14:47:05,331 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=745360.0, ans=0.04949747468305833 2023-09-30 14:47:06,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_56-5838-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:09,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:47:09,543 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_58-86657-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:11,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_344-113875-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:47:13,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:47:15,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_764-2387-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:47:16,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_157-14963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:47:19,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:47:25,893 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=745426.6666666666, ans=0.0 2023-09-30 14:47:32,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_190-138640-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:34,054 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=745493.3333333334, ans=0.1 2023-09-30 14:47:35,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_63-270922-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:47:35,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:47:43,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:47:43,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp0.9 from training. Duration: 0.6 2023-09-30 14:47:45,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:47:45,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_118-18960-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:47,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:47:47,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:47:47,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_307-145221-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:47:50,647 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff3_skip_rate, batch_count=745560.0, ans=0.0 2023-09-30 14:47:51,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:47:55,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0 from training. Duration: 0.59 2023-09-30 14:47:55,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_503-333510-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:47:56,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:47:56,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:47:56,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:47:57,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:47:57,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:47:58,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:48:00,112 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_577-220899-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:01,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:48:03,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_272-313512-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:05,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:48:09,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_74-341428-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:12,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:48:18,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_243-126020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:19,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_259-6561-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:48:25,003 INFO [train.py:1039] (3/4) Epoch 22, batch 300, loss[loss=0.1715, simple_loss=0.2635, pruned_loss=0.03976, over 24314.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2512, pruned_loss=0.0498, over 3682145.53 frames. ], batch size: 74, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:48:25,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0 from training. Duration: 0.91 2023-09-30 14:48:25,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_759-225084-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:48:25,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 14:48:28,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0 from training. Duration: 0.99 2023-09-30 14:48:28,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp0.9 from training. Duration: 0.62225 2023-09-30 14:48:29,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:48:29,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0 from training. Duration: 0.9 2023-09-30 14:48:34,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_137-64179-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:48:36,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48049_210_5632_1_1533864553_3519937_306-283794-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:48:39,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_25-120161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:48:41,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0 from training. Duration: 0.99 2023-09-30 14:48:41,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=745760.0, ans=0.035 2023-09-30 14:48:42,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_296-332325-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:48:44,184 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 14:48:44,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0 from training. Duration: 0.85 2023-09-30 14:48:44,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_443-195183-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:48:49,078 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.579e+02 1.843e+02 2.061e+02 2.404e+02 3.309e+02, threshold=4.123e+02, percent-clipped=0.0 2023-09-30 14:48:49,539 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_skip_rate, batch_count=745760.0, ans=0.0 2023-09-30 14:48:50,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_308-231735-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:48:52,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=745760.0, ans=0.125 2023-09-30 14:48:52,529 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=745760.0, ans=0.0 2023-09-30 14:48:53,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:48:53,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0 from training. Duration: 0.88 2023-09-30 14:48:56,920 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0 from training. Duration: 0.81 2023-09-30 14:48:57,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_217-239796-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:00,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_530-329400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_150-62920-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:02,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0 from training. Duration: 0.66 2023-09-30 14:49:02,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 14:49:05,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:49:05,506 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module1.balancer1.prob, batch_count=745826.6666666666, ans=0.125 2023-09-30 14:49:06,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_299-187801-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:49:06,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_92-345288-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:10,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp1.1 from training. Duration: 0.52725 2023-09-30 14:49:10,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0 from training. Duration: 0.76 2023-09-30 14:49:12,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:49:15,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_346-168820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:16,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0 from training. Duration: 0.55 2023-09-30 14:49:16,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_180-24517-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:22,280 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:49:25,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_293-212045-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:49:25,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0 from training. Duration: 0.57 2023-09-30 14:49:29,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_104-289874-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 14:49:31,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_256-150908-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:34,108 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:49:35,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0 from training. Duration: 0.64 2023-09-30 14:49:35,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:49:35,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_61-162325-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:49:37,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0 from training. Duration: 0.7 2023-09-30 14:49:38,139 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.36 vs. limit=22.5 2023-09-30 14:49:38,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_188-11581-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:49:38,840 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_136-8381-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:41,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_270-229555-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:49:42,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_372-266951-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:49:42,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_14-242149-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:43,018 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=745960.0, ans=0.0 2023-09-30 14:49:44,431 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=745960.0, ans=0.0 2023-09-30 14:49:46,227 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=11.13 vs. limit=15.0 2023-09-30 14:49:46,999 INFO [train.py:1039] (3/4) Epoch 22, batch 350, loss[loss=0.1683, simple_loss=0.2351, pruned_loss=0.05078, over 23540.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2487, pruned_loss=0.04961, over 3895520.58 frames. ], batch size: 120, lr: 4.72e-03, grad_scale: 16.0 2023-09-30 14:49:47,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=746026.6666666666, ans=0.125 2023-09-30 14:49:48,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_159-76512-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:49:48,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 14:49:53,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_694-238413-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:49:57,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_421-172710-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:50:00,510 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=746026.6666666666, ans=0.2 2023-09-30 14:50:01,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_215-282680-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:01,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_306-149449-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:05,524 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0 from training. Duration: 0.97 2023-09-30 14:50:07,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_412-15870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:07,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0 from training. Duration: 0.6 2023-09-30 14:50:10,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _37112_210_2575_1_1532997007673_7268610_347-238993-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:10,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0 from training. Duration: 0.59 2023-09-30 14:50:12,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_228-2961-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:15,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0 from training. Duration: 0.98 2023-09-30 14:50:18,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:50:20,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1038-146421-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:50:22,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:50:23,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_543-341117-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_52-168712-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:23,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:23,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_69-111987-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:23,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:50:26,168 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:50:27,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:50:27,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_294-191196-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:32,349 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=746160.0, ans=0.125 2023-09-30 14:50:35,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_707-342896-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:50:35,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp0.9 from training. Duration: 0.888875 2023-09-30 14:50:35,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:50:35,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_519-326393-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:40,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_52-96808-0 from training. Duration: 0.91 2023-09-30 14:50:40,587 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1079-94525-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:50:45,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_746-3647-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:50:45,767 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_70379_210_7925_1_1535941455_557455_49-101720-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:50:45,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_8-307291-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:50:46,150 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=746226.6666666666, ans=0.125 2023-09-30 14:50:47,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0 from training. Duration: 0.51 2023-09-30 14:50:49,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_944-339840-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:50:49,698 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0515W0043-254911 from training. Duration: 0.976 2023-09-30 14:50:51,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0 from training. Duration: 0.81 2023-09-30 14:50:52,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_269-267275-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:50:54,961 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=14.53 vs. limit=22.5 2023-09-30 14:50:55,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_72-251255-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:50:55,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0 from training. Duration: 0.45 2023-09-30 14:50:57,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.07 vs. limit=15.0 2023-09-30 14:50:58,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_921-209452-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:00,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 14:51:04,063 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_384-58445-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:05,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _44011_210_2046_1_1533722401691_3631310_96-315656-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:05,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68484_210_1794_1_1535849804_7678097_549-170613-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:07,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_214-148058-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:51:09,198 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=746360.0, ans=0.125 2023-09-30 14:51:10,213 INFO [train.py:1039] (3/4) Epoch 22, batch 400, loss[loss=0.1889, simple_loss=0.2712, pruned_loss=0.05329, over 24394.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2487, pruned_loss=0.04904, over 4093182.20 frames. ], batch size: 77, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:51:10,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:51:11,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0_sp1.1 from training. Duration: 0.863625 2023-09-30 14:51:13,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0 from training. Duration: 0.64 2023-09-30 14:51:13,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_484-154786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:15,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_396-319412-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:15,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:51:17,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _45630_210_10556_1_1533949174498_3902049_558-240889-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:20,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_197-307458-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:22,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_163-254843-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:24,326 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0 from training. Duration: 0.69 2023-09-30 14:51:27,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0 from training. Duration: 0.71 2023-09-30 14:51:27,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_899-233658-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:28,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0 from training. Duration: 0.97 2023-09-30 14:51:30,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_424-134619-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:33,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:51:33,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_314-63879-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:33,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0 from training. Duration: 0.82 2023-09-30 14:51:34,940 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.581e+02 1.861e+02 2.105e+02 2.651e+02 3.953e+02, threshold=4.209e+02, percent-clipped=0.0 2023-09-30 14:51:35,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_511-306767-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:51:35,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_565-201775-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:51:35,416 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=746426.6666666666, ans=0.0 2023-09-30 14:51:36,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_778-173151-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:51:36,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_680-51001-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:51:38,932 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0059-334891 from training. Duration: 0.896 2023-09-30 14:51:40,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0 from training. Duration: 0.94 2023-09-30 14:51:43,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=746493.3333333334, ans=0.125 2023-09-30 14:51:45,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_446-240192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:51:46,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _65242_210_18628_1_1535369393717_3823149_38-6732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:51:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0 from training. Duration: 0.81 2023-09-30 14:51:50,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0 from training. Duration: 0.91 2023-09-30 14:51:53,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_329-285801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:51:55,343 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_343-334712-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:02,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0 from training. Duration: 0.83 2023-09-30 14:52:05,366 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp1.1 from training. Duration: 0.6 2023-09-30 14:52:08,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0 from training. Duration: 0.87 2023-09-30 14:52:09,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_570-262983-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:52:12,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:52:12,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0 from training. Duration: 0.83 2023-09-30 14:52:15,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:52:18,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 14:52:19,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_104-81711-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:52:23,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_129-343914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:23,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0 from training. Duration: 0.92 2023-09-30 14:52:25,360 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp1.1 from training. Duration: 0.563625 2023-09-30 14:52:31,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0 from training. Duration: 0.67 2023-09-30 14:52:31,820 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=746626.6666666666, ans=0.125 2023-09-30 14:52:33,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 14:52:33,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:52:35,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0 from training. Duration: 0.89 2023-09-30 14:52:35,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:52:36,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_325-155703-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:52:39,080 INFO [train.py:1039] (3/4) Epoch 22, batch 450, loss[loss=0.1756, simple_loss=0.2639, pruned_loss=0.04365, over 24628.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2495, pruned_loss=0.04909, over 4224675.54 frames. ], batch size: 68, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:52:39,151 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:52:39,461 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff2_skip_rate, batch_count=746693.3333333334, ans=0.0 2023-09-30 14:52:40,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0 from training. Duration: 0.86 2023-09-30 14:52:41,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:52:42,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_821-110852-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:52:43,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:52:43,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0 from training. Duration: 0.99 2023-09-30 14:52:43,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:52:45,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 14:52:48,206 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 14:52:56,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_281-250040-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:52:58,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:52:59,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0 from training. Duration: 0.65 2023-09-30 14:52:59,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0 from training. Duration: 0.99 2023-09-30 14:53:03,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:53:05,254 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer2.min_abs, batch_count=746760.0, ans=0.5 2023-09-30 14:53:06,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_884-29515-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:07,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_474-119995-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:14,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_211-97124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:16,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_666-224463-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:53:16,935 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:53:18,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0 from training. Duration: 0.84 2023-09-30 14:53:19,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0 from training. Duration: 0.97 2023-09-30 14:53:21,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_479-125611-0 from training. Duration: 0.96 2023-09-30 14:53:21,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_427-153963-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:53:23,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_133-170571-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:53:24,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 14:53:25,116 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0121-509521 from training. Duration: 0.896 2023-09-30 14:53:25,129 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0434-509821 from training. Duration: 0.64 2023-09-30 14:53:26,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_289-307034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:53:28,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:53:28,358 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp1.1 from training. Duration: 0.463625 2023-09-30 14:53:30,368 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=3.65 vs. limit=15.0 2023-09-30 14:53:32,835 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:53:32,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp1.1 from training. Duration: 0.62725 2023-09-30 14:53:34,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp1.1 from training. Duration: 0.436375 2023-09-30 14:53:34,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0 from training. Duration: 0.96 2023-09-30 14:53:36,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:38,476 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp0.9 from training. Duration: 0.8 2023-09-30 14:53:38,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 14:53:39,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0 from training. Duration: 0.83 2023-09-30 14:53:45,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp0.9 from training. Duration: 0.92225 2023-09-30 14:53:46,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0 from training. Duration: 0.91 2023-09-30 14:53:46,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0 from training. Duration: 0.61 2023-09-30 14:53:48,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 14:53:48,438 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=746960.0, ans=0.2 2023-09-30 14:53:53,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_187-192817-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:53:56,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_277-204970-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:53:58,565 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:53:58,611 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9144W0121-566446 from training. Duration: 0.896 2023-09-30 14:53:58,799 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer1.prob, batch_count=746960.0, ans=0.125 2023-09-30 14:54:01,623 INFO [train.py:1039] (3/4) Epoch 22, batch 500, loss[loss=0.1853, simple_loss=0.26, pruned_loss=0.05527, over 23295.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2506, pruned_loss=0.04968, over 4343301.76 frames. ], batch size: 105, lr: 4.72e-03, grad_scale: 32.0 2023-09-30 14:54:01,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_351-315519-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:03,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 14:54:03,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_436-198965-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:03,337 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0117-576751 from training. Duration: 0.896 2023-09-30 14:54:04,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0 from training. Duration: 0.73 2023-09-30 14:54:05,022 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_65-167544-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:54:09,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 14:54:10,233 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=747026.6666666666, ans=0.0 2023-09-30 14:54:13,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 14:54:13,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=747026.6666666666, ans=0.125 2023-09-30 14:54:16,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp1.1 from training. Duration: 0.763625 2023-09-30 14:54:17,878 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_365-287106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:54:17,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_300-3747-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:54:19,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_487-303201-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:26,495 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.536e+02 1.831e+02 2.107e+02 2.492e+02 3.806e+02, threshold=4.214e+02, percent-clipped=0.0 2023-09-30 14:54:30,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_668-123026-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:31,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:54:31,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp0.9 from training. Duration: 0.72225 2023-09-30 14:54:31,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_902-168619-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:33,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0 from training. Duration: 0.92 2023-09-30 14:54:33,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 14:54:38,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:54:38,200 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=747160.0, ans=0.125 2023-09-30 14:54:39,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp1.1 from training. Duration: 0.736375 2023-09-30 14:54:39,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0_sp1.1 from training. Duration: 0.87275 2023-09-30 14:54:39,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_76-269320-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:54:41,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0 from training. Duration: 0.82 2023-09-30 14:54:44,356 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.out_balancer.prob, batch_count=747160.0, ans=0.125 2023-09-30 14:54:45,671 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0194-640321 from training. Duration: 0.64 2023-09-30 14:54:47,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_284-312178-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:54:49,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_401-55937-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_716-24918-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:51,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _19637_210_4220_1_1531537167676_3205060_314-305638-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:54:52,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp1.1 from training. Duration: 0.636375 2023-09-30 14:54:52,906 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=747226.6666666666, ans=0.0 2023-09-30 14:54:55,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0 from training. Duration: 0.62 2023-09-30 14:54:59,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_149-67212-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 14:55:01,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_63-63253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:03,284 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=747226.6666666666, ans=0.125 2023-09-30 14:55:04,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_660-203992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:09,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_83-190624-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:55:15,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_154-139635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:17,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0 from training. Duration: 0.7 2023-09-30 14:55:17,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1126-189071-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:17,706 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_310-214789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:55:20,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0 from training. Duration: 0.74 2023-09-30 14:55:22,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp0.9 from training. Duration: 0.77775 2023-09-30 14:55:24,280 INFO [train.py:1039] (3/4) Epoch 22, batch 550, loss[loss=0.1703, simple_loss=0.2427, pruned_loss=0.04898, over 23366.00 frames. ], tot_loss[loss=0.1754, simple_loss=0.2513, pruned_loss=0.0498, over 4428979.36 frames. ], batch size: 93, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 14:55:24,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_537-230640-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:27,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0 from training. Duration: 0.95 2023-09-30 14:55:30,677 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_434-83000-0 from training. Duration: 0.98 2023-09-30 14:55:30,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4405_210_1849_1_1526732205_3593939_493-32464-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:30,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0 from training. Duration: 0.54 2023-09-30 14:55:30,824 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_765-250133-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:55:32,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_81-312245-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:55:32,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1273-247611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,941 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_399-18591-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:32,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp0.9 from training. Duration: 0.9 2023-09-30 14:55:35,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:55:38,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_673-264125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:55:39,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_262-86750-0 from training. Duration: 0.81 2023-09-30 14:55:39,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:55:41,813 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=6.08 vs. limit=15.0 2023-09-30 14:55:43,086 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=747426.6666666666, ans=0.125 2023-09-30 14:55:46,064 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_516-14858-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:55:46,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_54-248218-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:47,781 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_300-119782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:55:49,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_21-240703-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:55:53,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 774-127930-0014-10412-0_sp1.1 from training. Duration: 0.95 2023-09-30 14:55:54,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_20-179346-0 from training. Duration: 0.97 2023-09-30 14:55:55,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:56:01,094 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=747493.3333333334, ans=0.1 2023-09-30 14:56:02,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:56:02,327 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:03,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp0.9 from training. Duration: 0.788875 2023-09-30 14:56:09,063 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40340_210_9024_1_1533433916_4132944_320-270277-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:09,081 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0203W0003-781711 from training. Duration: 0.958 2023-09-30 14:56:10,452 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30434_210_13049_1_1532519384_981746_52-12932-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:56:12,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 14:56:14,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 14:56:15,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 14:56:15,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp1.1 from training. Duration: 0.7 2023-09-30 14:56:17,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_742-267856-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:17,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0 from training. Duration: 0.58 2023-09-30 14:56:19,186 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.48 vs. limit=6.0 2023-09-30 14:56:19,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0 from training. Duration: 0.79 2023-09-30 14:56:21,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_612-56061-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:21,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_412-2403-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:56:21,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_120-196848-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:56:21,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_916-51000-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:56:24,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:56:25,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0_sp1.1 from training. Duration: 0.8 2023-09-30 14:56:28,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_43-325657-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:56:29,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_314-82041-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:30,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 14:56:32,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 14:56:34,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_140-336881-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:35,669 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:56:35,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_607-259685-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:56:38,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp1.1 from training. Duration: 0.67275 2023-09-30 14:56:38,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp1.1 from training. Duration: 0.4 2023-09-30 14:56:44,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0 from training. Duration: 0.59 2023-09-30 14:56:44,608 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=747626.6666666666, ans=0.125 2023-09-30 14:56:46,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.prob, batch_count=747693.3333333334, ans=0.125 2023-09-30 14:56:47,700 INFO [train.py:1039] (3/4) Epoch 22, batch 600, loss[loss=0.1829, simple_loss=0.2477, pruned_loss=0.059, over 23679.00 frames. ], tot_loss[loss=0.1756, simple_loss=0.2514, pruned_loss=0.04989, over 4495762.33 frames. ], batch size: 232, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:56:49,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0 from training. Duration: 0.46 2023-09-30 14:56:50,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_287-130422-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:56:50,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:56:50,837 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_45-342307-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:56:57,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_478-326818-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:56:58,164 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=384, metric=11.96 vs. limit=22.5 2023-09-30 14:57:00,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 14:57:01,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0 from training. Duration: 0.83 2023-09-30 14:57:03,631 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0_sp0.9 from training. Duration: 0.87775 2023-09-30 14:57:07,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_153-153537-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:08,708 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_285-349105-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:11,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0 from training. Duration: 0.87 2023-09-30 14:57:11,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_172-294852-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:57:13,344 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.458e+02 1.858e+02 2.031e+02 2.339e+02 3.248e+02, threshold=4.061e+02, percent-clipped=0.0 2023-09-30 14:57:18,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0 from training. Duration: 0.83 2023-09-30 14:57:22,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1254-44082-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:57:22,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1115-180350-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:23,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 14:57:28,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_517-153649-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:57:28,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_37-266257-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:57:31,021 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_527-276118-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:37,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:57:40,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_478-156360-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:57:40,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0_sp1.1 from training. Duration: 0.82725 2023-09-30 14:57:40,995 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_680-317073-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:57:49,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0 from training. Duration: 0.97 2023-09-30 14:57:54,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp1.1 from training. Duration: 0.57275 2023-09-30 14:57:56,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_555-63066-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:57:59,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0 from training. Duration: 0.89 2023-09-30 14:58:01,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp1.1 from training. Duration: 0.77275 2023-09-30 14:58:03,375 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=747960.0, ans=0.1 2023-09-30 14:58:04,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_120-211630-0 from training. Duration: 0.92 2023-09-30 14:58:04,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 14:58:04,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 14:58:10,797 INFO [train.py:1039] (3/4) Epoch 22, batch 650, loss[loss=0.158, simple_loss=0.2068, pruned_loss=0.05465, over 19457.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2506, pruned_loss=0.05004, over 4531146.71 frames. ], batch size: 389, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:58:10,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 14:58:12,567 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp1.1 from training. Duration: 0.663625 2023-09-30 14:58:12,759 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=748026.6666666666, ans=0.0 2023-09-30 14:58:16,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:58:16,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.bypass_mid.scale_min, batch_count=748026.6666666666, ans=0.2 2023-09-30 14:58:17,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0_sp0.9 from training. Duration: 0.988875 2023-09-30 14:58:19,288 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_570-295255-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:21,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0 from training. Duration: 0.72 2023-09-30 14:58:23,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_649-79655-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:58:26,928 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.38 vs. limit=15.0 2023-09-30 14:58:29,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 14:58:29,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_302-255963-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:35,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_573-301852-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:38,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 6709-74022-0004-86860-0_sp1.1 from training. Duration: 0.9409375 2023-09-30 14:58:40,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_111-71991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:58:41,662 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_169-214186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:58:44,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_20-235313-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:58:44,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 14:58:47,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_444-198752-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:47,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_9-279442-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:48,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 14:58:51,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_567-250221-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:51,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 14:58:54,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 14:58:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0011-61817 from training. Duration: 0.88 2023-09-30 14:58:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_435-154371-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:58:54,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25836_210_16554_1_1532138167_53197_3-9394-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:58:59,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_57-55125-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:58:59,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_267-23560-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:01,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1150-278867-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:01,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp0.9 from training. Duration: 0.911125 2023-09-30 14:59:01,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0 from training. Duration: 0.87 2023-09-30 14:59:04,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0_sp1.1 from training. Duration: 0.9 2023-09-30 14:59:04,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp0.9 from training. Duration: 0.811125 2023-09-30 14:59:06,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp1.1 from training. Duration: 0.536375 2023-09-30 14:59:06,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_349-69727-0_sp1.1 from training. Duration: 0.936375 2023-09-30 14:59:08,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 14:59:09,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0 from training. Duration: 0.83 2023-09-30 14:59:11,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0 from training. Duration: 0.72 2023-09-30 14:59:11,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_225-97100-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:11,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_374-194712-0_sp1.1 from training. Duration: 0.92725 2023-09-30 14:59:12,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp0.9 from training. Duration: 0.97775 2023-09-30 14:59:12,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_389-317194-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 14:59:14,435 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_239-122918-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 14:59:20,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_128-117609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:20,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_160-276178-0_sp1.1 from training. Duration: 0.963625 2023-09-30 14:59:22,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31920_210_10633_1_1532860009_1686094_1-279735-0_sp1.1 from training. Duration: 0.97275 2023-09-30 14:59:23,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_325-262383-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:23,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 14:59:25,745 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51937_210_5014_1_1534573610_4022025_518-299248-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 14:59:32,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 14:59:32,458 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_1087-282939-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:32,516 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:32,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_145-89359-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 14:59:33,871 INFO [train.py:1039] (3/4) Epoch 22, batch 700, loss[loss=0.1612, simple_loss=0.2383, pruned_loss=0.04207, over 24330.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2486, pruned_loss=0.04951, over 4568699.10 frames. ], batch size: 56, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 14:59:37,661 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0 from training. Duration: 0.93 2023-09-30 14:59:38,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0 from training. Duration: 0.97 2023-09-30 14:59:41,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0 from training. Duration: 0.54 2023-09-30 14:59:41,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_879-299350-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:43,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_905-160074-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 14:59:45,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_395-73570-0 from training. Duration: 0.99 2023-09-30 14:59:47,531 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 14:59:50,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 14:59:53,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 14:59:55,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_107-80772-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 14:59:58,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0_sp1.1 from training. Duration: 0.5 2023-09-30 14:59:58,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_156-253482-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:00:00,128 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.529e+02 1.824e+02 1.972e+02 2.211e+02 2.960e+02, threshold=3.944e+02, percent-clipped=0.0 2023-09-30 15:00:01,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_309-125532-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:00:07,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:00:07,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:00:07,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0 from training. Duration: 0.42 2023-09-30 15:00:11,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0 from training. Duration: 0.84 2023-09-30 15:00:15,948 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:00:16,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:00:18,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:00:22,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_676-51014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:00:23,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0 from training. Duration: 0.91 2023-09-30 15:00:26,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_747-114465-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:26,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:00:28,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0 from training. Duration: 0.81 2023-09-30 15:00:30,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:00:32,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1134-288498-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:00:35,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_211-237289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:00:40,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:00:41,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0 from training. Duration: 0.55 2023-09-30 15:00:45,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0 from training. Duration: 0.89 2023-09-30 15:00:45,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0 from training. Duration: 0.9 2023-09-30 15:00:49,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_9-219972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:52,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_656-283823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:00:52,642 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=748626.6666666666, ans=0.2 2023-09-30 15:00:53,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535789601_661049_77-216385-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:00:53,997 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_19952_210_2161_1_1531875469_3851750_210-308919-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:00:54,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0 from training. Duration: 0.95 2023-09-30 15:00:56,881 INFO [train.py:1039] (3/4) Epoch 22, batch 750, loss[loss=0.1877, simple_loss=0.263, pruned_loss=0.05613, over 23937.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2483, pruned_loss=0.04932, over 4585873.28 frames. ], batch size: 86, lr: 4.71e-03, grad_scale: 16.0 2023-09-30 15:00:58,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_224-343295-0 from training. Duration: 0.55 2023-09-30 15:00:58,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0 from training. Duration: 0.74 2023-09-30 15:00:58,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0 from training. Duration: 0.85 2023-09-30 15:01:00,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_160-24408-0 from training. Duration: 0.91 2023-09-30 15:01:00,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0 from training. Duration: 0.64 2023-09-30 15:01:00,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_528-259800-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:01:01,110 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.out_whiten.whitening_limit, batch_count=748693.3333333334, ans=15.0 2023-09-30 15:01:01,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0 from training. Duration: 0.89 2023-09-30 15:01:03,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_400-338274-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:01:03,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:05,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_331-291057-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:06,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5436_210_1794_1_1527331317_2541494_229-211016-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:08,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:01:08,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_96-81499-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:13,436 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:01:14,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:01:16,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_304-327750-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:01:20,347 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_245-226199-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:20,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_102-199499-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:01:20,499 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0 from training. Duration: 0.93 2023-09-30 15:01:24,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:01:24,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_497-311999-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:25,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_91-194765-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:01:27,333 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:01:28,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0 from training. Duration: 0.65 2023-09-30 15:01:28,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_209-154760-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:01:30,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0 from training. Duration: 0.54 2023-09-30 15:01:30,451 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0044-335852 from training. Duration: 0.768 2023-09-30 15:01:31,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0 from training. Duration: 0.92 2023-09-30 15:01:31,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:01:33,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:01:34,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:01:41,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:01:41,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_647-337784-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:01:41,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:01:41,785 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_87-266334-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:01:45,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_526-261338-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:01:45,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0 from training. Duration: 0.88 2023-09-30 15:01:46,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_449-15508-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:01:46,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:01:48,298 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:01:52,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:01:54,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_47-167373-0 from training. Duration: 0.92 2023-09-30 15:01:54,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_628-282179-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:00,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_213-280898-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:01,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:02:03,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_16-78180-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:06,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:02:10,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0 from training. Duration: 0.75 2023-09-30 15:02:10,727 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_255-148539-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:10,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:13,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:14,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_97-151896-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:16,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_207-7026-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:17,388 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=4.87 vs. limit=15.0 2023-09-30 15:02:17,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:02:19,339 INFO [train.py:1039] (3/4) Epoch 22, batch 800, loss[loss=0.1917, simple_loss=0.2604, pruned_loss=0.06152, over 22893.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2493, pruned_loss=0.04961, over 4614921.27 frames. ], batch size: 322, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:02:22,624 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=6.03 vs. limit=15.0 2023-09-30 15:02:26,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_122-178847-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:02:26,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_94-348956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:28,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1018-244089-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:02:28,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_87-118423-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:30,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_504-212292-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:30,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_446-150327-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:33,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_682-245989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:37,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _35723_210_1794_1_1533088341043_628250_8-234524-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:38,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:02:40,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0 from training. Duration: 0.76 2023-09-30 15:02:41,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_743-915-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:41,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _61194_210_18628_1_1535007555628_3750909_353-323468-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:02:42,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:02:42,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=749093.3333333334, ans=0.0 2023-09-30 15:02:43,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _44424_210_18163_1_1533623394848_1213110_79-187603-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:43,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0 from training. Duration: 0.85 2023-09-30 15:02:45,043 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_103-283529-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:45,106 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0 from training. Duration: 0.39 2023-09-30 15:02:47,019 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.535e+02 1.835e+02 2.019e+02 2.284e+02 4.447e+02, threshold=4.039e+02, percent-clipped=1.0 2023-09-30 15:02:48,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_368-68029-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:02:51,731 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_446-339645-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:02:53,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.ff3_skip_rate, batch_count=749160.0, ans=0.0 2023-09-30 15:02:54,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:02:56,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_134-184387-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:02:57,171 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=749160.0, ans=0.04949747468305833 2023-09-30 15:02:59,856 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_550-184707-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:02:59,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_106-341780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:02,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_253-132552-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:03,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_395-310220-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:03:03,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp0.9 from training. Duration: 0.411125 2023-09-30 15:03:06,924 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9002W0003-495962 from training. Duration: 0.946 2023-09-30 15:03:06,979 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0 from training. Duration: 0.57 2023-09-30 15:03:07,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:03:08,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_788-232684-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:09,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_10-39296-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:09,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_341-20819-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:15,314 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0435-509822 from training. Duration: 0.896 2023-09-30 15:03:15,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0 from training. Duration: 0.4 2023-09-30 15:03:16,503 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=192, metric=6.78 vs. limit=15.0 2023-09-30 15:03:16,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:03:18,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:03:18,851 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:03:22,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:03:25,478 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_186-208240-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:03:27,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0 from training. Duration: 0.91 2023-09-30 15:03:28,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526742306_334530_28-238909-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:03:30,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0 from training. Duration: 0.86 2023-09-30 15:03:39,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:42,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_470-276077-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:03:42,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0 from training. Duration: 0.94 2023-09-30 15:03:42,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:03:44,065 INFO [train.py:1039] (3/4) Epoch 22, batch 850, loss[loss=0.2199, simple_loss=0.2817, pruned_loss=0.07909, over 18934.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2503, pruned_loss=0.04961, over 4632743.50 frames. ], batch size: 388, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:03:44,232 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1072-12671-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:45,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0 from training. Duration: 0.94 2023-09-30 15:03:45,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1194-270988-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:03:47,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_289-193637-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:03:49,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_313-263495-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:50,757 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:03:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_646-287209-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:03:52,491 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0 from training. Duration: 0.82 2023-09-30 15:03:53,974 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0 from training. Duration: 0.96 2023-09-30 15:03:53,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0 from training. Duration: 0.76 2023-09-30 15:03:55,595 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:03:55,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_347-232268-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:03:59,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_111-55056-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:03:59,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_812-271646-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:03:59,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:04:04,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_14-16530-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:04,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_568-304512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:04,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0 from training. Duration: 0.87 2023-09-30 15:04:07,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0 from training. Duration: 0.48 2023-09-30 15:04:12,922 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_251-288764-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:04:13,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0 from training. Duration: 0.88 2023-09-30 15:04:14,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0 from training. Duration: 0.62 2023-09-30 15:04:16,426 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0 from training. Duration: 0.8 2023-09-30 15:04:18,257 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=749493.3333333334, ans=0.1 2023-09-30 15:04:19,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9266W0001-627677 from training. Duration: 0.896 2023-09-30 15:04:19,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:19,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_687-237771-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:04:19,582 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:04:24,354 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5129_210_6400_1_1527161005_3579054_107-69738-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:24,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_36-301164-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:24,944 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=749493.3333333334, ans=0.1 2023-09-30 15:04:25,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0 from training. Duration: 0.59 2023-09-30 15:04:26,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:04:28,139 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_147-284024-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:28,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:04:29,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:04:31,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:04:32,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:04:32,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0 from training. Duration: 0.98 2023-09-30 15:04:36,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:04:36,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_415-52611-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:04:38,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:04:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_368-195106-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:04:38,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.skip_rate, batch_count=749560.0, ans=0.07 2023-09-30 15:04:41,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_51-334124-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:04:44,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_317-161769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:04:46,414 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:04:49,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:04:49,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1004-304915-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:04:50,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:05:00,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:05:02,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_530-326202-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:05:02,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0 from training. Duration: 0.87 2023-09-30 15:05:02,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:02,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_762-282614-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:05:05,135 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.45 vs. limit=10.0 2023-09-30 15:05:05,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0 from training. Duration: 0.77 2023-09-30 15:05:07,239 INFO [train.py:1039] (3/4) Epoch 22, batch 900, loss[loss=0.2031, simple_loss=0.2693, pruned_loss=0.06849, over 23586.00 frames. ], tot_loss[loss=0.1757, simple_loss=0.2512, pruned_loss=0.05012, over 4650253.42 frames. ], batch size: 256, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:05:11,179 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=20.36 vs. limit=22.5 2023-09-30 15:05:13,378 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_27-170931-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:05:18,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_266-271628-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:18,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0 from training. Duration: 0.57 2023-09-30 15:05:21,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:05:22,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_147-111137-0 from training. Duration: 0.95 2023-09-30 15:05:23,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:05:25,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:05:25,144 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24677_210_1794_1_1532002693_4341296_149-303793-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:25,223 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:05:25,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.nonlin_attention.balancer.prob, batch_count=749760.0, ans=0.125 2023-09-30 15:05:26,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:05:31,908 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=749760.0, ans=0.2 2023-09-30 15:05:32,806 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.537e+02 1.809e+02 2.095e+02 2.574e+02 4.591e+02, threshold=4.190e+02, percent-clipped=1.0 2023-09-30 15:05:36,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_624-320169-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:05:36,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_127-325326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:05:36,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:05:38,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_120-304588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:05:43,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0 from training. Duration: 0.47 2023-09-30 15:05:44,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_64-54020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:05:48,506 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:05:49,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:05:50,053 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0205W0255-783002 from training. Duration: 0.958 2023-09-30 15:05:51,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0 from training. Duration: 0.83 2023-09-30 15:05:55,527 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:05:59,658 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:05:59,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:05:59,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:06:07,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47449_210_5399_1_1534076445_4006929_410-237806-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:07,429 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:06:09,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_108-21902-0 from training. Duration: 0.99 2023-09-30 15:06:10,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_309-174818-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:06:12,643 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0 from training. Duration: 0.88 2023-09-30 15:06:15,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:06:15,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_442-238692-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:16,080 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=749960.0, ans=0.125 2023-09-30 15:06:17,397 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_226-254446-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:06:17,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_287-299903-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:24,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0 from training. Duration: 0.62 2023-09-30 15:06:25,500 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0305W0008-833402 from training. Duration: 0.91 2023-09-30 15:06:25,722 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:06:25,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0 from training. Duration: 0.64 2023-09-30 15:06:28,575 INFO [train.py:1039] (3/4) Epoch 22, batch 950, loss[loss=0.1572, simple_loss=0.2268, pruned_loss=0.04376, over 23473.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2512, pruned_loss=0.04972, over 4676247.51 frames. ], batch size: 134, lr: 4.71e-03, grad_scale: 32.0 2023-09-30 15:06:28,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_13-205182-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:06:31,278 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=750026.6666666666, ans=0.2 2023-09-30 15:06:33,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0 from training. Duration: 0.68 2023-09-30 15:06:35,868 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_skip_rate, batch_count=750026.6666666666, ans=0.0 2023-09-30 15:06:38,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_9-148921-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:41,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_143-70317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:41,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_1056-236369-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:43,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:06:46,306 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0376W0255-869957 from training. Duration: 0.9510625 2023-09-30 15:06:48,568 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_483-240691-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:06:50,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:06:52,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_367-26344-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:06:52,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:06:52,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_69-138150-0 from training. Duration: 0.94 2023-09-30 15:06:52,438 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:06:55,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_287-191918-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:06:56,819 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0 from training. Duration: 0.97 2023-09-30 15:06:56,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_268-64520-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:00,700 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:07:00,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_828-313097-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:07:00,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0 from training. Duration: 0.88 2023-09-30 15:07:02,236 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn2.whiten, num_groups=1, num_channels=384, metric=12.23 vs. limit=22.5 2023-09-30 15:07:04,562 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:07:06,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:07:07,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:07:12,179 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_792-195353-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:07:12,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_311-54500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:07:13,211 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.94 vs. limit=8.0 2023-09-30 15:07:16,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0 from training. Duration: 0.47 2023-09-30 15:07:20,140 WARNING [train.py:1197] (3/4) Exclude cut with ID 2411-132532-0017-82279-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:07:20,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:07:20,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_320-229006-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:22,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_177-100554-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:22,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_787-294416-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:07:27,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0 from training. Duration: 0.76 2023-09-30 15:07:27,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:07:30,198 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_469-338558-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:07:32,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_367-126897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:32,431 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0 from training. Duration: 0.8 2023-09-30 15:07:32,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_197-79498-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:32,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:07:32,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0 from training. Duration: 0.9 2023-09-30 15:07:37,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:07:38,020 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.skip_rate, batch_count=750293.3333333334, ans=0.04949747468305833 2023-09-30 15:07:40,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_296-24733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:07:42,932 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer1.prob, batch_count=750293.3333333334, ans=0.125 2023-09-30 15:07:45,678 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_335-99364-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:07:47,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0 from training. Duration: 0.91 2023-09-30 15:07:47,160 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0 from training. Duration: 0.83 2023-09-30 15:07:51,489 INFO [train.py:1039] (3/4) Epoch 22, batch 1000, loss[loss=0.1767, simple_loss=0.25, pruned_loss=0.05171, over 23649.00 frames. ], tot_loss[loss=0.1752, simple_loss=0.2503, pruned_loss=0.04999, over 4664175.80 frames. ], batch size: 149, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:07:54,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_189-298172-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:07:58,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0 from training. Duration: 0.76 2023-09-30 15:07:58,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_155-90874-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:03,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_164-216310-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:08:05,329 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0 from training. Duration: 0.86 2023-09-30 15:08:05,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_333-270325-0 from training. Duration: 0.93 2023-09-30 15:08:10,607 WARNING [train.py:1197] (3/4) Exclude cut with ID _5172_210_1883_1_1527246062567_3577219_18-101380-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:10,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_577-236858-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:08:14,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1006-177345-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:15,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0 from training. Duration: 0.78 2023-09-30 15:08:17,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=750426.6666666666, ans=0.0 2023-09-30 15:08:18,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0 from training. Duration: 0.76 2023-09-30 15:08:20,244 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.495e+02 1.832e+02 2.013e+02 2.220e+02 3.757e+02, threshold=4.026e+02, percent-clipped=0.0 2023-09-30 15:08:21,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0 from training. Duration: 0.75 2023-09-30 15:08:21,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_4-248404-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:24,813 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_596-306820-0 from training. Duration: 0.97 2023-09-30 15:08:25,002 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp0.9 from training. Duration: 0.5 2023-09-30 15:08:25,231 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.ff2_skip_rate, batch_count=750493.3333333334, ans=0.0 2023-09-30 15:08:26,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0 from training. Duration: 0.64 2023-09-30 15:08:27,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_143-343575-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:28,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_36-12256-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:36,094 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.self_attn2.whiten, num_groups=1, num_channels=512, metric=11.71 vs. limit=22.5 2023-09-30 15:08:38,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_38-67795-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:39,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_308-327832-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:08:40,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_726-81176-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:40,120 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_47-141521-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:08:40,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0 from training. Duration: 0.8 2023-09-30 15:08:41,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_239-24000-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:08:43,807 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:08:43,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_580-173454-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:08:43,970 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0124W0002-61308 from training. Duration: 0.982 2023-09-30 15:08:49,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0 from training. Duration: 0.89 2023-09-30 15:08:50,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0 from training. Duration: 0.86 2023-09-30 15:08:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0 from training. Duration: 0.82 2023-09-30 15:08:53,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:08:58,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_85-270092-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:58,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:08:59,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _47346_210_9476_1_1533815971553_3536160_201-40644-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:08:59,352 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=750626.6666666666, ans=0.125 2023-09-30 15:09:00,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:09:02,045 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0 from training. Duration: 0.67 2023-09-30 15:09:02,221 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:09:02,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528439452891_970220_68-181805-0 from training. Duration: 0.64 2023-09-30 15:09:04,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0 from training. Duration: 0.66 2023-09-30 15:09:04,369 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_43-171109-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:04,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _40094_210_16380_1_1533294005773_3641159_107-280739-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:09:06,766 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=750626.6666666666, ans=0.125 2023-09-30 15:09:09,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:09:11,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:09:14,071 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_451-82741-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:09:15,503 INFO [train.py:1039] (3/4) Epoch 22, batch 1050, loss[loss=0.1474, simple_loss=0.224, pruned_loss=0.03543, over 24368.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.249, pruned_loss=0.04961, over 4658923.99 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:09:17,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:09:20,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=750693.3333333334, ans=0.125 2023-09-30 15:09:21,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:09:22,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:09:24,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_592-258841-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:24,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:29,009 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:09:30,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:09:32,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _54189_210_16380_1_1534332670039_3911710_606-311481-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:09:33,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:09:33,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:09:34,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:09:35,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0 from training. Duration: 0.98 2023-09-30 15:09:35,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_490-198354-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:37,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0 from training. Duration: 0.84 2023-09-30 15:09:41,260 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_790-199469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:09:41,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0 from training. Duration: 0.82 2023-09-30 15:09:41,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:09:49,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_363-282909-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:09:50,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:09:50,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _24612_210_8913_1_1531998054193_3806359_357-35487-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:09:52,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0 from training. Duration: 0.81 2023-09-30 15:09:54,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0 from training. Duration: 0.6 2023-09-30 15:09:54,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:09:57,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0 from training. Duration: 0.87 2023-09-30 15:09:58,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0 from training. Duration: 0.73 2023-09-30 15:09:59,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_454-339504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:02,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:10:04,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:10:05,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_99-11611-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:06,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:10:11,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:10:14,286 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0 from training. Duration: 0.86 2023-09-30 15:10:16,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_209-54533-0 from training. Duration: 0.96 2023-09-30 15:10:16,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0 from training. Duration: 0.55 2023-09-30 15:10:16,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_319-250868-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:17,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:10:19,503 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0 from training. Duration: 0.88 2023-09-30 15:10:23,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:10:24,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:10:24,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:10:24,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:24,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_224-172517-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _15113_210_8254_1_1530842438579_3716279_76-232507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:10:29,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0 from training. Duration: 0.98 2023-09-30 15:10:32,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:10:32,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0 from training. Duration: 0.92 2023-09-30 15:10:32,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0 from training. Duration: 0.66 2023-09-30 15:10:34,398 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_939-132910-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:10:37,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _49371_210_12071_1_1533952761021_3812190_16-246001-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:10:38,852 INFO [train.py:1039] (3/4) Epoch 22, batch 1100, loss[loss=0.1798, simple_loss=0.248, pruned_loss=0.05586, over 22778.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2486, pruned_loss=0.0494, over 4670830.21 frames. ], batch size: 322, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:10:45,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_680-189165-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:10:50,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:10:53,642 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_540-275685-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:10:53,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_453-106579-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:10:53,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0 from training. Duration: 0.76 2023-09-30 15:10:54,474 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=384, metric=13.54 vs. limit=22.5 2023-09-30 15:10:55,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_279-126205-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:10:56,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_5-55893-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:10:59,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_141-10835-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:11:02,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_201-85738-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:11:02,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0 from training. Duration: 0.75 2023-09-30 15:11:05,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:11:06,748 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_360-63661-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:06,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_449-31567-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:11:08,138 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.614e+02 1.829e+02 2.096e+02 2.478e+02 4.106e+02, threshold=4.191e+02, percent-clipped=1.0 2023-09-30 15:11:09,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:11:10,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:11:15,384 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.31 vs. limit=6.0 2023-09-30 15:11:16,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_256-206070-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:11:19,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_278-104676-0 from training. Duration: 0.98 2023-09-30 15:11:19,932 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0065-331908 from training. Duration: 0.896 2023-09-30 15:11:21,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_244-129917-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_1129-170857-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:23,103 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=751160.0, ans=0.2 2023-09-30 15:11:24,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:11:24,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_250-332999-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:11:26,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0 from training. Duration: 0.74 2023-09-30 15:11:27,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:11:27,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:11:28,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1341-26728-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:11:29,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40222_210_6228_1_1533385298_4481939_375-328051-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:29,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0 from training. Duration: 0.77 2023-09-30 15:11:34,889 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:11:34,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_365-1492-0 from training. Duration: 0.91 2023-09-30 15:11:37,943 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:11:41,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:11:44,288 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0 from training. Duration: 0.99 2023-09-30 15:11:44,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _13942_210_9988_1_1530604906036_3733360_499-55676-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:11:45,880 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_375-65143-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:11:47,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_199-325608-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:11:48,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_443-124391-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:50,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0 from training. Duration: 0.86 2023-09-30 15:11:52,321 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_567-135114-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:11:52,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_462-288299-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:11:52,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0 from training. Duration: 0.86 2023-09-30 15:11:54,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:11:54,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0 from training. Duration: 0.86 2023-09-30 15:11:56,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_23-74512-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:11:56,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:11:57,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:12:00,851 INFO [train.py:1039] (3/4) Epoch 22, batch 1150, loss[loss=0.1584, simple_loss=0.2459, pruned_loss=0.03551, over 24686.00 frames. ], tot_loss[loss=0.1743, simple_loss=0.2494, pruned_loss=0.04955, over 4678045.87 frames. ], batch size: 65, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:12:01,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_176-174327-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:04,525 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_432-96140-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:12:08,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_76-14910-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:08,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_40-19458-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:12:08,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0 from training. Duration: 0.95 2023-09-30 15:12:09,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _21631_210_2253_1_1531907880778_3160339_321-35325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:12,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0 from training. Duration: 0.69 2023-09-30 15:12:15,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_301-93324-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:15,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:12:21,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0 from training. Duration: 0.49 2023-09-30 15:12:21,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=751426.6666666666, ans=0.2 2023-09-30 15:12:22,297 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=384, metric=4.44 vs. limit=15.0 2023-09-30 15:12:23,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_103-77513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:28,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_662-18193-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:12:29,667 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_439-52438-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:29,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0 from training. Duration: 0.54 2023-09-30 15:12:29,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:12:29,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_446-308321-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:12:33,961 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.max_abs, batch_count=751493.3333333334, ans=10.0 2023-09-30 15:12:35,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0 from training. Duration: 0.42 2023-09-30 15:12:36,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_344-337148-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:12:38,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_759-179071-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:12:44,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=751493.3333333334, ans=0.1 2023-09-30 15:12:48,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_293-12188-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:52,484 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751560.0, ans=0.1 2023-09-30 15:12:55,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_235-320304-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:12:56,580 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0 from training. Duration: 0.86 2023-09-30 15:12:56,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_375-218677-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:12:56,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_829-202956-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:01,962 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0436-509823 from training. Duration: 0.768 2023-09-30 15:13:03,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_231-112214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:10,494 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9059W0470-524883 from training. Duration: 0.3415625 2023-09-30 15:13:17,023 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43266_210_1794_1_1533521789_4497218_206-79512-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:19,332 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_294-163793-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:13:19,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6290_210_3273_1_1527595224_1282787_115-87095-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:13:19,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:13:19,624 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=751626.6666666666, ans=0.125 2023-09-30 15:13:21,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=512, metric=13.09 vs. limit=15.0 2023-09-30 15:13:22,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _14621_210_1925_1_1530686901185_3675420_419-234200-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:23,913 INFO [train.py:1039] (3/4) Epoch 22, batch 1200, loss[loss=0.1826, simple_loss=0.2593, pruned_loss=0.05295, over 23366.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2501, pruned_loss=0.04936, over 4681370.34 frames. ], batch size: 93, lr: 4.70e-03, grad_scale: 16.0 2023-09-30 15:13:27,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:13:27,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:13:27,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=256, metric=12.08 vs. limit=15.0 2023-09-30 15:13:28,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_466-3492-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:28,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_199-242167-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:30,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_237-89684-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:13:33,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_70-84121-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:13:35,380 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:13:35,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_40-321822-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:13:35,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_227-83612-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:13:37,459 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=751693.3333333334, ans=0.0 2023-09-30 15:13:38,655 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0015-572508 from training. Duration: 0.896 2023-09-30 15:13:38,938 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer2.prob, batch_count=751760.0, ans=0.125 2023-09-30 15:13:41,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0 from training. Duration: 0.51 2023-09-30 15:13:43,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:13:45,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:13:47,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1102-324568-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:13:51,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_465-75446-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:13:51,389 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9203W0126-595623 from training. Duration: 0.896 2023-09-30 15:13:51,611 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer2.prob, batch_count=751760.0, ans=0.125 2023-09-30 15:13:52,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_139-328410-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:13:55,741 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.455e+02 1.831e+02 1.967e+02 2.216e+02 3.733e+02, threshold=3.933e+02, percent-clipped=0.0 2023-09-30 15:14:00,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:14:00,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _30039_210_13270_1_1532484612900_2725150_239-304872-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:14:00,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0 from training. Duration: 0.94 2023-09-30 15:14:00,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_135-5501-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:14:05,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0 from training. Duration: 0.72 2023-09-30 15:14:10,389 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0 from training. Duration: 0.88 2023-09-30 15:14:10,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_415-288492-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:14:11,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_544-287179-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:14:13,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_122-32606-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:13,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_121-284152-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:14:15,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_350-299477-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:14:15,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:14:15,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:14:15,977 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=384, metric=4.84 vs. limit=10.0 2023-09-30 15:14:16,819 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0 from training. Duration: 0.88 2023-09-30 15:14:18,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:14:18,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_146-162495-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:18,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:14:22,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68717_210_3528_1_1535628683_1556759_95-36722-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:22,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_654-162611-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:14:27,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9302_210_2550_1_1528889962_4655416_638-301620-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:14:29,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:14:32,224 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0 from training. Duration: 0.97 2023-09-30 15:14:35,722 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=751960.0, ans=0.1 2023-09-30 15:14:36,903 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0190-675468 from training. Duration: 0.9935 2023-09-30 15:14:38,918 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=9.38 vs. limit=22.5 2023-09-30 15:14:40,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49730_210_13049_1_1534075294_1830676_214-84436-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:14:40,345 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_skip_rate, batch_count=751960.0, ans=0.0 2023-09-30 15:14:42,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_259-322252-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:14:43,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_599-50220-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:14:45,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_174-109016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:14:46,638 INFO [train.py:1039] (3/4) Epoch 22, batch 1250, loss[loss=0.1481, simple_loss=0.2219, pruned_loss=0.03714, over 24322.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2508, pruned_loss=0.04957, over 4685613.14 frames. ], batch size: 56, lr: 4.70e-03, grad_scale: 4.0 2023-09-30 15:14:48,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0 from training. Duration: 0.67 2023-09-30 15:14:50,353 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=752026.6666666666, ans=0.125 2023-09-30 15:14:52,870 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:14:52,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:14:54,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0 from training. Duration: 0.47 2023-09-30 15:14:54,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_94-126128-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:14:54,882 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.dropout.p, batch_count=752026.6666666666, ans=0.1 2023-09-30 15:14:56,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_356-31216-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:15:00,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_443-30034-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:15:02,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _26907_210_7234_1_1532221740694_3205263_337-2494-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:02,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:15:02,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_553-170128-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:06,408 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:15:10,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:15:10,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:15:10,962 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_613-78769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:12,547 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53861_210_5399_1_1534422156_4272077_150-336899-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:15:14,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1146-56843-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:17,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1194-299928-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:18,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:15:24,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0 from training. Duration: 0.93 2023-09-30 15:15:24,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:15:26,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_229-187367-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:26,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0 from training. Duration: 0.85 2023-09-30 15:15:28,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:15:28,353 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0171W0508-765603 from training. Duration: 0.541 2023-09-30 15:15:28,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_521-219439-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:28,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_741-302616-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:32,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _65293_210_9070_1_1535545549765_4004210_438-154735-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:36,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_564-132950-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:15:37,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_291-270881-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:15:39,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_353-213656-0 from training. Duration: 0.95 2023-09-30 15:15:39,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0 from training. Duration: 0.77 2023-09-30 15:15:40,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0 from training. Duration: 0.78 2023-09-30 15:15:43,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_347-95003-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:15:43,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0 from training. Duration: 0.88 2023-09-30 15:15:43,775 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_370-229434-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:15:48,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:15:48,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_165-123581-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:15:50,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0 from training. Duration: 0.98 2023-09-30 15:15:50,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:15:51,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40036_210_13405_1_1533648647_1315388_48-146016-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:15:51,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:15:53,197 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_37-207114-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:15:54,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0 from training. Duration: 0.85 2023-09-30 15:15:57,808 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36492_210_7927_1_1533261987_4790834_703-343351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:15:57,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:15:59,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:16:01,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:16:06,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_697-171068-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:16:08,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0 from training. Duration: 0.93 2023-09-30 15:16:09,841 INFO [train.py:1039] (3/4) Epoch 22, batch 1300, loss[loss=0.1674, simple_loss=0.255, pruned_loss=0.03993, over 24455.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2512, pruned_loss=0.04965, over 4695493.69 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:16:11,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1186-261957-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:11,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:16:13,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_159-313751-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:14,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_227-203721-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:16:16,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:16:17,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0 from training. Duration: 0.81 2023-09-30 15:16:19,517 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=752360.0, ans=0.1 2023-09-30 15:16:25,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:16:25,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:16:27,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0 from training. Duration: 0.88 2023-09-30 15:16:31,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:16:35,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_449-341946-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:37,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_884-327007-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:16:37,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_308-222998-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:16:39,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_344-238708-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:16:41,054 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:16:41,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp0.9 from training. Duration: 0.52225 2023-09-30 15:16:41,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0 from training. Duration: 0.96 2023-09-30 15:16:43,121 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.575e+02 1.847e+02 1.992e+02 2.255e+02 3.036e+02, threshold=3.984e+02, percent-clipped=0.0 2023-09-30 15:16:47,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:16:47,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:16:50,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0 from training. Duration: 0.62 2023-09-30 15:16:52,459 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:16:54,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_105-63309-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:16:56,782 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.72 vs. limit=15.0 2023-09-30 15:16:57,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_578-177271-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:16:57,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_32-188147-0 from training. Duration: 0.96 2023-09-30 15:16:57,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_300-254337-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:16:59,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0 from training. Duration: 0.94 2023-09-30 15:17:00,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_53-279178-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:01,281 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:17:06,871 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_273-334088-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:06,876 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_128-241008-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:17:10,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0 from training. Duration: 0.87 2023-09-30 15:17:10,107 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0 from training. Duration: 0.84 2023-09-30 15:17:12,328 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0 from training. Duration: 0.66 2023-09-30 15:17:19,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:17:22,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0 from training. Duration: 0.72 2023-09-30 15:17:22,618 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=752626.6666666666, ans=0.125 2023-09-30 15:17:23,763 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_616-16795-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:28,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=752626.6666666666, ans=0.125 2023-09-30 15:17:28,854 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_abs, batch_count=752626.6666666666, ans=0.5 2023-09-30 15:17:28,885 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass_mid.scale_min, batch_count=752626.6666666666, ans=0.2 2023-09-30 15:17:30,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0 from training. Duration: 0.91 2023-09-30 15:17:31,378 INFO [train.py:1039] (3/4) Epoch 22, batch 1350, loss[loss=0.1685, simple_loss=0.2598, pruned_loss=0.03859, over 24296.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2504, pruned_loss=0.04918, over 4693296.34 frames. ], batch size: 74, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:17:35,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_201-205677-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:38,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_43-80483-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:17:39,969 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_240-134124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:17:41,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_907-120940-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:17:41,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:17:43,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:45,089 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=752693.3333333334, ans=0.0 2023-09-30 15:17:48,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:17:50,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0 from training. Duration: 0.79 2023-09-30 15:17:50,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:17:51,604 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=5.08 vs. limit=12.0 2023-09-30 15:17:52,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:17:55,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0 from training. Duration: 0.39 2023-09-30 15:17:57,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_275-293897-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:17:58,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_722-52880-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:17:58,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_128-268029-0 from training. Duration: 0.98 2023-09-30 15:18:00,496 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0 from training. Duration: 0.86 2023-09-30 15:18:02,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0 from training. Duration: 0.99 2023-09-30 15:18:02,373 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:18:03,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_290-212862-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:03,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0 from training. Duration: 0.86 2023-09-30 15:18:15,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_138-36031-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:25,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_300-285878-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:18:26,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_114-340229-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:26,086 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0 from training. Duration: 0.85 2023-09-30 15:18:29,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_322-81243-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:31,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0 from training. Duration: 0.83 2023-09-30 15:18:31,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_166-314607-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:18:31,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_340-343302-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:18:35,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_286-112015-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:18:37,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0 from training. Duration: 0.67 2023-09-30 15:18:38,520 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=192, metric=6.73 vs. limit=15.0 2023-09-30 15:18:39,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:18:45,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0 from training. Duration: 0.69 2023-09-30 15:18:47,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0 from training. Duration: 0.91 2023-09-30 15:18:49,159 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=752960.0, ans=0.0 2023-09-30 15:18:50,724 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=752960.0, ans=0.0 2023-09-30 15:18:53,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0 from training. Duration: 0.87 2023-09-30 15:18:53,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_100-230287-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:18:55,672 INFO [train.py:1039] (3/4) Epoch 22, batch 1400, loss[loss=0.1667, simple_loss=0.255, pruned_loss=0.03914, over 24443.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2491, pruned_loss=0.04887, over 4691642.31 frames. ], batch size: 69, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:18:57,461 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_276-215622-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:18:58,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_698-309430-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:19:00,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module2.balancer1.prob, batch_count=753026.6666666666, ans=0.125 2023-09-30 15:19:04,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0 from training. Duration: 0.63 2023-09-30 15:19:06,233 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0 from training. Duration: 0.86 2023-09-30 15:19:14,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_49-97158-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:19:16,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _61132_210_9969_1_1535021792964_3755419_61-57720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:19,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_345-306832-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:19:19,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:19:25,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:19:25,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 4133-6541-0027-40495-0_sp1.1 from training. Duration: 0.9681875 2023-09-30 15:19:26,428 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.30 vs. limit=6.0 2023-09-30 15:19:29,130 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.512e+02 1.828e+02 2.054e+02 2.330e+02 3.479e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 15:19:36,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer1.prob, batch_count=753160.0, ans=0.125 2023-09-30 15:19:38,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_514-211165-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:38,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41211_210_2544_1_1533348859_3702910_97-212695-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:42,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0 from training. Duration: 0.8 2023-09-30 15:19:44,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:19:44,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:19:45,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _4366_210_1388_1_1526792053633_3613819_231-211402-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:19:47,757 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_98-21576-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:19:47,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:19:47,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_98-329557-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:19:48,005 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6846_210_6753_1_1527850767_4295158_2-315073-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:19:50,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0 from training. Duration: 0.83 2023-09-30 15:19:50,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_133-158308-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:19:55,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_320-11758-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:19:57,404 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=753226.6666666666, ans=0.125 2023-09-30 15:19:58,637 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:20:08,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0 from training. Duration: 0.58 2023-09-30 15:20:09,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 15:20:10,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_444-286390-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:20:12,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 15:20:16,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_118-345621-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:17,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533709215_471063_7-79165-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:20:19,329 INFO [train.py:1039] (3/4) Epoch 22, batch 1450, loss[loss=0.1791, simple_loss=0.2443, pruned_loss=0.05692, over 22837.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.248, pruned_loss=0.04809, over 4695535.79 frames. ], batch size: 322, lr: 4.70e-03, grad_scale: 8.0 2023-09-30 15:20:21,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:20:23,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50188_210_4198_1_1534503621_3646041_353-81682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:20:23,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_175-80565-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:23,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp1.1 from training. Duration: 0.47275 2023-09-30 15:20:29,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_137-267579-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:29,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:20:31,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_48-68507-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:20:32,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_128-328162-0 from training. Duration: 0.9 2023-09-30 15:20:32,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:20:33,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_383-316000-0 from training. Duration: 0.9 2023-09-30 15:20:35,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_185-295153-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:35,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:35,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0 from training. Duration: 0.83 2023-09-30 15:20:37,237 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_164-152777-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:20:38,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_18-130336-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:20:38,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 15:20:38,828 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:40,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:20:40,690 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=753426.6666666666, ans=0.1 2023-09-30 15:20:43,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_32-142780-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:47,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:51,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_348-127789-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:20:51,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_288-285445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:20:54,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_308-181732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:20:54,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_895-117428-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:55,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:20:57,334 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:20:57,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _40551_210_10635_1_1533776424632_3416179_361-3964-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:20:57,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_485-122742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:02,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0 from training. Duration: 0.73 2023-09-30 15:21:04,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _30548_210_16040_1_1532608097130_3146969_371-280610-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:21:07,271 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0081-331924 from training. Duration: 0.896 2023-09-30 15:21:08,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_380-160775-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:10,452 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:21:11,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_853-150237-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:14,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0 from training. Duration: 0.98 2023-09-30 15:21:18,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_836-244045-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:20,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _14880_210_8895_1_1530680426655_3795259_289-212817-0 from training. Duration: 0.69 2023-09-30 15:21:21,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_303-340446-0 from training. Duration: 0.9 2023-09-30 15:21:23,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_418-41736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:25,240 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=753626.6666666666, ans=0.0 2023-09-30 15:21:26,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_371-18169-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:26,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_187-219984-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:21:28,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0 from training. Duration: 0.9 2023-09-30 15:21:31,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0 from training. Duration: 0.98 2023-09-30 15:21:33,492 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0 from training. Duration: 0.55 2023-09-30 15:21:33,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_557-163391-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:21:35,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:21:41,325 INFO [train.py:1039] (3/4) Epoch 22, batch 1500, loss[loss=0.1597, simple_loss=0.2329, pruned_loss=0.04323, over 23302.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2489, pruned_loss=0.04853, over 4707447.40 frames. ], batch size: 119, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:21:46,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0 from training. Duration: 0.77 2023-09-30 15:21:46,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:21:46,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:21:47,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_237-136449-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:21:47,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _3120_210_3525_1_1526036143112_4072980_74-178400-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:47,924 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=753693.3333333334, ans=0.1 2023-09-30 15:21:49,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:21:51,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_172-171750-0 from training. Duration: 0.65 2023-09-30 15:21:52,947 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:21:53,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:21:53,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_818-213552-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:21:54,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:21:56,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_291-309749-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:21:57,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_715-3462-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:21:57,951 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=753760.0, ans=0.125 2023-09-30 15:22:02,760 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_13-331827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:02,791 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0 from training. Duration: 0.87 2023-09-30 15:22:04,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:04,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:22:04,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_480-77655-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:07,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0 from training. Duration: 0.87 2023-09-30 15:22:12,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0 from training. Duration: 0.76 2023-09-30 15:22:14,241 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_645-233480-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:22:14,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0 from training. Duration: 0.58 2023-09-30 15:22:14,535 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=753826.6666666666, ans=0.125 2023-09-30 15:22:15,596 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.832e+02 2.019e+02 2.350e+02 4.853e+02, threshold=4.037e+02, percent-clipped=1.0 2023-09-30 15:22:17,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:22:20,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:22:21,700 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_136-264444-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:22:21,727 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2168_210_2290_1_1525606920_1849400_136-222991-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:22:23,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0 from training. Duration: 0.72 2023-09-30 15:22:23,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:22:24,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _39688_210_19596_1_1533371626725_4154159_551-167834-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:24,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0 from training. Duration: 0.87 2023-09-30 15:22:25,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_113-113534-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:22:30,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_319-349181-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:22:30,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0 from training. Duration: 0.94 2023-09-30 15:22:33,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.convnext.out_whiten, num_groups=1, num_channels=128, metric=4.45 vs. limit=5.0 2023-09-30 15:22:37,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:22:38,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:22:42,280 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0112-501244 from training. Duration: 0.768 2023-09-30 15:22:43,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_177-326952-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:43,730 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9016W0116-502789 from training. Duration: 0.896 2023-09-30 15:22:44,057 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=753893.3333333334, ans=0.0 2023-09-30 15:22:46,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_557-204815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:22:46,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_279-229469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:22:46,972 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=753960.0, ans=0.125 2023-09-30 15:22:48,212 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0375-509764 from training. Duration: 0.896 2023-09-30 15:22:49,657 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:22:52,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0 from training. Duration: 0.65 2023-09-30 15:22:52,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_289-163927-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:54,787 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=753960.0, ans=0.2 2023-09-30 15:22:57,431 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_86-225314-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:57,472 WARNING [train.py:1197] (3/4) Exclude cut with ID _7877_210_6014_1_1528286358099_3750136_618-284064-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:58,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1017-31469-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:22:59,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_617-241446-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:22:59,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:23:01,592 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=754026.6666666666, ans=0.0 2023-09-30 15:23:01,622 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=754026.6666666666, ans=0.125 2023-09-30 15:23:03,218 INFO [train.py:1039] (3/4) Epoch 22, batch 1550, loss[loss=0.176, simple_loss=0.2616, pruned_loss=0.0452, over 24671.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2503, pruned_loss=0.04895, over 4714797.50 frames. ], batch size: 68, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:23:03,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0 from training. Duration: 0.56 2023-09-30 15:23:04,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0 from training. Duration: 0.97 2023-09-30 15:23:05,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_287-18120-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:23:06,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0 from training. Duration: 0.84 2023-09-30 15:23:07,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0 from training. Duration: 0.9 2023-09-30 15:23:08,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_449-65969-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:09,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=10.87 vs. limit=15.0 2023-09-30 15:23:10,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_553-341452-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:10,349 WARNING [train.py:1197] (3/4) Exclude cut with ID _16909_210_5724_1_1531292079912_4118919_300-47574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:11,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:23:11,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_727-28074-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:13,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_357-123664-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:23:16,998 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0004-555964 from training. Duration: 0.896 2023-09-30 15:23:17,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_445-145208-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:17,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:23:18,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:23:21,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_146-282400-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:23:21,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0 from training. Duration: 0.71 2023-09-30 15:23:21,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_128-146364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:23:23,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_224-322394-0 from training. Duration: 0.66 2023-09-30 15:23:24,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0 from training. Duration: 0.88 2023-09-30 15:23:24,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0 from training. Duration: 0.98 2023-09-30 15:23:24,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_523-79838-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:26,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_398-334558-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:30,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_171-36697-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:23:32,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0 from training. Duration: 0.68 2023-09-30 15:23:32,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0 from training. Duration: 0.69 2023-09-30 15:23:40,261 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.hidden_balancer.prob, batch_count=754160.0, ans=0.125 2023-09-30 15:23:41,626 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:23:46,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_1022-83740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:23:48,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:23:48,448 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:23:49,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0 from training. Duration: 0.41 2023-09-30 15:23:55,342 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.whiten, num_groups=1, num_channels=512, metric=4.54 vs. limit=12.0 2023-09-30 15:23:56,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_299-34413-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:23:56,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_664-159457-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:23:59,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_483-291760-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:24:01,004 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_287-255474-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:24:02,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:02,488 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0 from training. Duration: 0.83 2023-09-30 15:24:03,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:05,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:24:05,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_953-80868-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:06,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:24:06,971 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0197-640324 from training. Duration: 0.896 2023-09-30 15:24:10,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_361-30822-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:16,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0 from training. Duration: 0.94 2023-09-30 15:24:22,794 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_468-333827-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:22,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_623-191759-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:24:24,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0 from training. Duration: 0.42 2023-09-30 15:24:26,439 INFO [train.py:1039] (3/4) Epoch 22, batch 1600, loss[loss=0.1486, simple_loss=0.2276, pruned_loss=0.03474, over 24573.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2496, pruned_loss=0.04872, over 4720490.17 frames. ], batch size: 60, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:24:26,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:24:28,101 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_401-346646-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:24:28,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:24:28,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:24:29,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_379-49457-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:24:34,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_200-105218-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:24:34,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0 from training. Duration: 0.67 2023-09-30 15:24:35,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_916-195848-0 from training. Duration: 0.92 2023-09-30 15:24:38,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0 from training. Duration: 0.65 2023-09-30 15:24:40,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_302-180617-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:24:41,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0 from training. Duration: 0.99 2023-09-30 15:24:42,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_265-215428-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:24:42,720 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=384, metric=3.03 vs. limit=15.0 2023-09-30 15:24:45,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_438-198993-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:24:47,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module1.balancer1.prob, batch_count=754426.6666666666, ans=0.125 2023-09-30 15:24:50,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:24:54,030 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0 from training. Duration: 0.37 2023-09-30 15:24:54,374 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=754426.6666666666, ans=0.125 2023-09-30 15:24:57,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_452-256669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:24:59,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0 from training. Duration: 0.83 2023-09-30 15:24:59,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_903-158756-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:00,929 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.593e+02 1.869e+02 2.022e+02 2.307e+02 3.871e+02, threshold=4.044e+02, percent-clipped=0.0 2023-09-30 15:25:01,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_397-216497-0 from training. Duration: 0.96 2023-09-30 15:25:07,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0 from training. Duration: 0.97 2023-09-30 15:25:08,995 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=754493.3333333334, ans=0.0 2023-09-30 15:25:10,536 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=754493.3333333334, ans=0.0 2023-09-30 15:25:12,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.skip_rate, batch_count=754493.3333333334, ans=0.04949747468305833 2023-09-30 15:25:13,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_453-267730-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:15,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0 from training. Duration: 0.56 2023-09-30 15:25:16,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_312-329122-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:25:16,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_97-126471-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:25:16,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:25:18,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp0.9 from training. Duration: 0.511125 2023-09-30 15:25:19,966 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=754560.0, ans=0.1 2023-09-30 15:25:25,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_282-136641-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 15:25:26,635 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_493-160724-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:25:26,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_259-33442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_289-62384-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:28,239 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_379-14182-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:25:31,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:25:32,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:25:33,532 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:25:39,019 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=754626.6666666666, ans=0.2 2023-09-30 15:25:40,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_814-107647-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:40,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:25:43,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0 from training. Duration: 0.89 2023-09-30 15:25:43,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:25:45,390 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=754626.6666666666, ans=0.125 2023-09-30 15:25:46,468 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0 from training. Duration: 0.72 2023-09-30 15:25:47,985 INFO [train.py:1039] (3/4) Epoch 22, batch 1650, loss[loss=0.1732, simple_loss=0.2565, pruned_loss=0.04494, over 24439.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.25, pruned_loss=0.04882, over 4720379.74 frames. ], batch size: 69, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:25:51,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1326-276386-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:25:51,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:25:52,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_631-257029-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:25:52,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0 from training. Duration: 0.87 2023-09-30 15:25:52,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp1.1 from training. Duration: 0.37275 2023-09-30 15:25:52,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_441-91466-0 from training. Duration: 0.97 2023-09-30 15:25:52,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_108-53929-0 from training. Duration: 0.59 2023-09-30 15:25:58,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_260-1942-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:25:58,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_198-61547-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:00,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_412-142122-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:00,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:26:03,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_334-298697-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:06,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5465_210_4211_1_1527227253_55357_8-2499-0_sp1.1 from training. Duration: 0.996375 2023-09-30 15:26:08,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_447-201565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:26:08,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_253-161789-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:26:08,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:26:08,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:26:09,984 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_68-212801-0 from training. Duration: 0.73 2023-09-30 15:26:10,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0 from training. Duration: 0.81 2023-09-30 15:26:16,397 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:26:19,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:26:27,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_125-322172-0 from training. Duration: 0.93 2023-09-30 15:26:27,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_493-60474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:28,979 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_156-302675-0 from training. Duration: 0.55 2023-09-30 15:26:32,961 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_123-267862-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:35,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_201-143747-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:26:35,913 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:26:37,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1347-145675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:26:38,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:26:38,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_400-201896-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:41,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_326-174612-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:26:43,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_390-116233-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:43,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:43,812 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=10.85 vs. limit=15.0 2023-09-30 15:26:44,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:46,138 WARNING [train.py:1197] (3/4) Exclude cut with ID _7857_210_1534_1_1528286515745_6831470_431-338528-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:47,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:26:50,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _24055_210_12651_1_1532310907143_976979_92-59356-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:26:52,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0 from training. Duration: 0.56 2023-09-30 15:26:53,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:26:53,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0 from training. Duration: 0.76 2023-09-30 15:26:55,480 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0 from training. Duration: 0.69 2023-09-30 15:26:56,899 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467760_2130483_170-259044-0 from training. Duration: 0.7 2023-09-30 15:26:56,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_153-216718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:26:57,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:26:57,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_126-347531-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:26:57,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=754960.0, ans=0.1 2023-09-30 15:26:58,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _7614_210_4915_1_1529326764092_4537359_96-162197-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:26:58,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0 from training. Duration: 0.97 2023-09-30 15:27:01,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _64029_210_4381_1_1535178502772_3756459_275-16710-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:27:04,093 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7264_210_4938_1_1527939911_8132095_800-91995-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:04,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_50-193879-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:07,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0 from training. Duration: 0.96 2023-09-30 15:27:10,557 INFO [train.py:1039] (3/4) Epoch 22, batch 1700, loss[loss=0.1751, simple_loss=0.2599, pruned_loss=0.04513, over 24652.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2497, pruned_loss=0.04855, over 4725015.76 frames. ], batch size: 68, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:27:12,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_206-14076-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:27:12,790 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:27:14,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0 from training. Duration: 0.96 2023-09-30 15:27:16,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:16,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:27:16,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_88-23776-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:17,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_245-256666-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:27:19,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_636-251213-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:27:19,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0 from training. Duration: 0.92 2023-09-30 15:27:22,352 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:27:27,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_373-225403-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:27:29,227 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.bypass.scale_min, batch_count=755093.3333333334, ans=0.2 2023-09-30 15:27:30,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _20622_210_3852_1_1531622427080_1093839_57-326027-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:27:37,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_605-257205-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:27:37,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:27:37,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:27:38,267 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=755093.3333333334, ans=0.1 2023-09-30 15:27:39,373 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_306-44736-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:27:42,331 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0 from training. Duration: 0.94 2023-09-30 15:27:43,963 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:27:43,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _10301_210_5886_1_1529222275332_3910620_216-46959-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:45,842 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.451e+02 1.865e+02 2.081e+02 2.403e+02 3.253e+02, threshold=4.162e+02, percent-clipped=0.0 2023-09-30 15:27:46,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_91-277684-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:27:48,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:27:49,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0 from training. Duration: 0.79 2023-09-30 15:27:51,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0 from training. Duration: 0.92 2023-09-30 15:27:51,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49348_210_7927_1_1533954500_3691300_109-276430-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:27:53,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0 from training. Duration: 0.68 2023-09-30 15:27:54,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:27:57,806 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=755160.0, ans=0.1 2023-09-30 15:28:02,582 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module1.balancer2.prob, batch_count=755226.6666666666, ans=0.125 2023-09-30 15:28:03,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1252-235481-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:05,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_813-261554-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:06,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:28:07,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:28:07,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp1.1 from training. Duration: 0.363625 2023-09-30 15:28:07,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _42321_210_7770_1_1533468631352_4366895_104-215915-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:28:10,752 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_216-22824-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:10,753 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0 from training. Duration: 0.91 2023-09-30 15:28:11,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1024-265645-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:28:11,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_412-233904-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:12,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_911-223461-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:13,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_558-148266-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:16,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _58480_210_12392_1_1534811121111_3646160_212-164495-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:16,166 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:28:17,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _24736_210_3279_1_1532332894218_2838700_73-197086-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:19,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:28:19,815 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_345-171438-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:23,602 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_28211_210_6388_1_1532861506_4523224_22-25766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:25,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0 from training. Duration: 0.86 2023-09-30 15:28:28,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _56927_210_9969_1_1534848970840_3695229_409-225129-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:28:29,637 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26192_210_7927_1_1532137269_1040115_21-219331-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:28:32,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0 from training. Duration: 0.57 2023-09-30 15:28:34,071 INFO [train.py:1039] (3/4) Epoch 22, batch 1750, loss[loss=0.1648, simple_loss=0.236, pruned_loss=0.04683, over 23263.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2481, pruned_loss=0.04823, over 4711939.91 frames. ], batch size: 105, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:28:38,892 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_582-191016-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:41,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_270-210032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:41,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:28:42,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=755360.0, ans=0.125 2023-09-30 15:28:43,326 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0 from training. Duration: 0.44 2023-09-30 15:28:43,383 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_573-46532-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:28:46,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_247-102591-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:28:46,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_200-21395-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:28:51,669 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=7.64 vs. limit=15.0 2023-09-30 15:28:52,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0 from training. Duration: 0.43 2023-09-30 15:28:54,662 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_53-75025-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:28:57,609 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0 from training. Duration: 0.64 2023-09-30 15:28:57,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_337-294431-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:28:59,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_110-274654-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:29:02,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:29:02,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0 from training. Duration: 0.43 2023-09-30 15:29:06,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_215-294589-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:29:06,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_548-294316-0 from training. Duration: 0.81 2023-09-30 15:29:13,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:29:16,796 WARNING [train.py:1197] (3/4) Exclude cut with ID _18199_210_6109_1_1531475789864_4288790_295-198247-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:16,829 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_103-313224-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:20,024 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_67-294673-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:20,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_350-101734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:29:22,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_204-153986-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:29:24,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_673-115569-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:28,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:28,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_575-102496-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:29:29,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_669-93163-0 from training. Duration: 0.98 2023-09-30 15:29:32,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_249-281349-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:29:35,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_106-30782-0 from training. Duration: 0.8 2023-09-30 15:29:36,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:38,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_516-151727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:38,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:29:43,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:29:44,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_123-24428-0_sp1.1 from training. Duration: 0.436375 2023-09-30 15:29:44,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1185-56473-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:29:46,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:29:46,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.min_positive, batch_count=755626.6666666666, ans=0.05 2023-09-30 15:29:47,949 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=755626.6666666666, ans=0.125 2023-09-30 15:29:48,093 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=755626.6666666666, ans=0.0 2023-09-30 15:29:50,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_104-72815-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:29:52,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:29:53,991 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6239_210_4211_1_1527765584_4306698_528-102186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:29:54,782 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_351-147136-0 from training. Duration: 0.87 2023-09-30 15:29:56,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_520-51744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:29:56,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:29:56,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_450-165710-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:29:56,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_280-120942-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:29:58,269 INFO [train.py:1039] (3/4) Epoch 22, batch 1800, loss[loss=0.1613, simple_loss=0.2427, pruned_loss=0.04001, over 24531.00 frames. ], tot_loss[loss=0.1721, simple_loss=0.2479, pruned_loss=0.04818, over 4717060.40 frames. ], batch size: 63, lr: 4.69e-03, grad_scale: 16.0 2023-09-30 15:29:58,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _65585_210_12210_1_1535450451584_3764098_112-294301-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:29:58,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:29:58,660 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=755693.3333333334, ans=0.125 2023-09-30 15:30:02,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:30:02,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_336-208232-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:30:03,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:30:05,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_559-29840-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:30:10,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:30:11,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_185-112289-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:30:13,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_155-298797-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:13,990 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:30:16,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_246-206531-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_469-151944-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:18,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_88-276668-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:30:19,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:30:19,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0 from training. Duration: 0.9 2023-09-30 15:30:21,150 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_400-254159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:24,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_158-345341-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:27,898 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0 from training. Duration: 0.83 2023-09-30 15:30:31,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0 from training. Duration: 0.85 2023-09-30 15:30:31,561 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0 from training. Duration: 0.76 2023-09-30 15:30:33,015 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_571-249460-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:30:34,969 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.922e+02 2.224e+02 2.607e+02 3.579e+02, threshold=4.447e+02, percent-clipped=0.0 2023-09-30 15:30:35,128 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_230-325568-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:30:35,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _34415_210_8750_1_1533085213375_3451116_213-267570-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:30:35,524 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=755826.6666666666, ans=0.0 2023-09-30 15:30:37,162 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:30:42,171 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0653W0001-322925 from training. Duration: 0.896 2023-09-30 15:30:43,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:30:45,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_187-204971-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:30:48,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0 from training. Duration: 0.99 2023-09-30 15:30:49,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0 from training. Duration: 0.86 2023-09-30 15:30:49,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:30:51,359 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_488-330913-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:30:52,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:30:57,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0 from training. Duration: 0.67 2023-09-30 15:31:04,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:04,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0 from training. Duration: 0.5 2023-09-30 15:31:05,955 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_428-32114-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:31:05,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_467-263151-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:08,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:31:08,079 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0 from training. Duration: 0.44 2023-09-30 15:31:09,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:31:09,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_56-307796-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:11,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0 from training. Duration: 0.87 2023-09-30 15:31:11,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_558-155750-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:31:12,141 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn2.whiten, num_groups=1, num_channels=512, metric=12.01 vs. limit=22.5 2023-09-30 15:31:12,300 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=256, metric=13.44 vs. limit=22.5 2023-09-30 15:31:15,361 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_142-309370-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:16,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_18-46756-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:31:16,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_279-212159-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_164-2641-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:31:18,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:31:20,007 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1077-184261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:31:20,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1176-204992-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:31:21,272 INFO [train.py:1039] (3/4) Epoch 22, batch 1850, loss[loss=0.1681, simple_loss=0.2567, pruned_loss=0.03974, over 24681.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2485, pruned_loss=0.04831, over 4710242.97 frames. ], batch size: 73, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:31:22,250 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=11.80 vs. limit=15.0 2023-09-30 15:31:24,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:31:24,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:31:32,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:31:32,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0 from training. Duration: 0.89 2023-09-30 15:31:35,366 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0 from training. Duration: 0.82 2023-09-30 15:31:39,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_187-80815-0 from training. Duration: 0.93 2023-09-30 15:31:44,233 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_414-349640-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:31:44,273 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0 from training. Duration: 0.91 2023-09-30 15:31:45,646 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 15:31:54,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:31:55,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0 from training. Duration: 0.98 2023-09-30 15:31:57,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=756160.0, ans=0.0 2023-09-30 15:31:58,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _36399_210_16049_1_1532998771377_3698333_207-259013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:31:58,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_567-111756-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:03,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0 from training. Duration: 0.75 2023-09-30 15:32:04,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_43-305214-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:04,972 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:32:06,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_146-218763-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:32:08,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:32:09,989 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=756226.6666666666, ans=0.125 2023-09-30 15:32:13,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_709-16721-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:32:17,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:32:17,099 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_267-197536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:17,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_658-282610-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:32:17,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_257-67050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:18,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14906_210_7927_1_1530754190_9408633_961-53402-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:20,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_490-280290-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:32:23,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0 from training. Duration: 0.77 2023-09-30 15:32:25,255 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_565-326898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:32:29,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_239-316802-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:32:31,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_745-98391-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:32:31,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0 from training. Duration: 0.8 2023-09-30 15:32:31,248 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0 from training. Duration: 0.57 2023-09-30 15:32:34,187 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0005-507335 from training. Duration: 0.896 2023-09-30 15:32:34,320 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0034-509435 from training. Duration: 0.64 2023-09-30 15:32:35,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:32:35,929 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46639_210_8341_1_1533726088_4495500_66-346325-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:32:35,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_145-137790-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:35,978 WARNING [train.py:1197] (3/4) Exclude cut with ID _36522_210_8887_1_1533177030611_1772478_7-327299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:36,251 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=756293.3333333334, ans=0.125 2023-09-30 15:32:36,619 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=384, metric=12.17 vs. limit=22.5 2023-09-30 15:32:37,474 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0009-515615 from training. Duration: 0.896 2023-09-30 15:32:37,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:32:37,551 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_362-242912-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:39,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:32:40,602 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_272-215563-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:32:41,985 INFO [train.py:1039] (3/4) Epoch 22, batch 1900, loss[loss=0.1766, simple_loss=0.2704, pruned_loss=0.0414, over 24329.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04844, over 4714683.69 frames. ], batch size: 74, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:32:42,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_553-175603-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:32:42,127 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_963-187109-0 from training. Duration: 0.99 2023-09-30 15:32:42,493 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=756360.0, ans=0.0 2023-09-30 15:32:43,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _44973_210_1511_1_1533794381163_3634770_495-211466-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:32:43,815 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9068W0002-529085 from training. Duration: 0.896 2023-09-30 15:32:43,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:32:45,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_149-49689-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:52,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_36-220794-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:32:54,623 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:32:56,156 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9104W0004-547160 from training. Duration: 0.896 2023-09-30 15:32:57,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0 from training. Duration: 0.67 2023-09-30 15:32:59,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:32:59,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_476-322050-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:33:01,269 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9118W0017-553385 from training. Duration: 0.896 2023-09-30 15:33:01,323 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9120W0028-554435 from training. Duration: 0.896 2023-09-30 15:33:01,830 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:33:04,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _56524_210_9642_1_1534572011779_3619170_145-310842-0 from training. Duration: 0.99 2023-09-30 15:33:07,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _51631_210_8254_1_1534129228420_4084794_270-222508-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:33:08,996 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=756426.6666666666, ans=0.125 2023-09-30 15:33:11,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_331-105351-0 from training. Duration: 0.65 2023-09-30 15:33:12,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_40-129756-0 from training. Duration: 0.81 2023-09-30 15:33:17,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=756493.3333333334, ans=0.0 2023-09-30 15:33:18,268 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.485e+02 1.802e+02 1.975e+02 2.316e+02 3.738e+02, threshold=3.949e+02, percent-clipped=0.0 2023-09-30 15:33:23,514 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0 from training. Duration: 0.78 2023-09-30 15:33:27,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_214-291369-0 from training. Duration: 0.63 2023-09-30 15:33:27,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_369-84347-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:33:28,067 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9210W0111-599240 from training. Duration: 0.896 2023-09-30 15:33:28,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_92-246270-0 from training. Duration: 0.85 2023-09-30 15:33:29,388 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0 from training. Duration: 0.98 2023-09-30 15:33:29,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0 from training. Duration: 0.85 2023-09-30 15:33:29,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _65962_210_29927_1_1535436026519_4174930_368-44639-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:33:32,006 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2, num_groups=1, num_channels=512, metric=5.93 vs. limit=15.0 2023-09-30 15:33:32,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0 from training. Duration: 0.97 2023-09-30 15:33:36,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:33:38,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_196-152868-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:33:38,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_140-181176-0 from training. Duration: 0.92 2023-09-30 15:33:40,067 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=756560.0, ans=0.0 2023-09-30 15:33:41,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_168-32906-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:33:43,763 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.41 vs. limit=10.0 2023-09-30 15:33:44,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0 from training. Duration: 0.82 2023-09-30 15:33:45,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:33:49,344 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=756626.6666666666, ans=0.1 2023-09-30 15:33:53,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_63-267585-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:33:53,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:33:53,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_194-143381-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:33:53,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_194-16667-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:33:55,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:33:57,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:33:57,372 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass_mid.scale_min, batch_count=756626.6666666666, ans=0.2 2023-09-30 15:33:59,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:34:00,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_1037-45317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:00,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:03,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:34:03,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_225-180155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:05,180 INFO [train.py:1039] (3/4) Epoch 22, batch 1950, loss[loss=0.1907, simple_loss=0.2619, pruned_loss=0.05975, over 23437.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2505, pruned_loss=0.04868, over 4725558.58 frames. ], batch size: 285, lr: 4.69e-03, grad_scale: 8.0 2023-09-30 15:34:05,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:34:06,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_691-70244-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:34:10,068 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_91-197981-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:13,002 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:34:13,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_5-214617-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:13,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:34:18,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_660-211088-0 from training. Duration: 0.62 2023-09-30 15:34:18,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:34:18,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_362-66225-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:19,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _61118_210_9527_1_1534897668884_3752630_173-258555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:22,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:34:22,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _15140_210_9288_1_1530791388450_4880149_539-153708-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:22,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_253-42544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:25,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_479-296877-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:34:26,572 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.48 vs. limit=15.0 2023-09-30 15:34:29,121 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:34:29,154 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:34:29,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:34:30,993 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_893-296030-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:32,895 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=756760.0, ans=0.2 2023-09-30 15:34:34,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_401-343820-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:38,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_127-566-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:34:38,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_126-272847-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:34:38,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:34:38,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0 from training. Duration: 0.72 2023-09-30 15:34:40,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:34:40,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_391-219682-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:34:41,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_96-137189-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:34:42,046 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module2.balancer1.prob, batch_count=756826.6666666666, ans=0.125 2023-09-30 15:34:43,626 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer2.min_abs, batch_count=756826.6666666666, ans=0.5 2023-09-30 15:34:46,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_15-276301-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:34:47,825 WARNING [train.py:1197] (3/4) Exclude cut with ID _59502_210_12210_1_1535107024698_1835139_110-289074-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:34:52,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:34:56,409 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=756893.3333333334, ans=0.125 2023-09-30 15:34:57,627 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_353-9541-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:34:57,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:34:57,750 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0 from training. Duration: 0.81 2023-09-30 15:34:59,255 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_411-204131-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:03,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_822-289909-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:35:03,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:35:05,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:05,697 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer1.prob, batch_count=756893.3333333334, ans=0.125 2023-09-30 15:35:14,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_439-257766-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:14,996 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1294-63152-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:17,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_319-166512-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:19,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_146-3470-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:20,044 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.balancer2.prob, batch_count=756960.0, ans=0.125 2023-09-30 15:35:22,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_678-159307-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:35:22,830 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_343-48822-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:35:24,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0 from training. Duration: 0.86 2023-09-30 15:35:24,193 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:35:25,654 WARNING [train.py:1197] (3/4) Exclude cut with ID _55644_210_11856_1_1534593599582_4103719_293-248694-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:35:27,168 INFO [train.py:1039] (3/4) Epoch 22, batch 2000, loss[loss=0.1784, simple_loss=0.2574, pruned_loss=0.04971, over 23395.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2511, pruned_loss=0.04913, over 4728034.57 frames. ], batch size: 93, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:35:27,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0 from training. Duration: 0.75 2023-09-30 15:35:29,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:31,940 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.whiten, num_groups=1, num_channels=256, metric=3.77 vs. limit=12.0 2023-09-30 15:35:32,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:35:34,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:35:34,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_52-43530-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:35:37,148 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_96-320736-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:35:38,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_1159-225508-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:35:40,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0 from training. Duration: 0.94 2023-09-30 15:35:42,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:35:44,674 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_357-257062-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:35:46,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0 from training. Duration: 0.73 2023-09-30 15:35:49,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:35:49,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:35:51,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.bypass.skip_rate, batch_count=757093.3333333334, ans=0.04949747468305833 2023-09-30 15:35:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_278-251666-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:35:54,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_115-326778-0 from training. Duration: 0.93 2023-09-30 15:35:54,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_31-188961-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:56,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1073-6264-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_435-259515-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:35:57,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0 from training. Duration: 0.83 2023-09-30 15:35:59,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:36:00,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_395-340852-0 from training. Duration: 0.93 2023-09-30 15:36:00,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_138-187031-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:04,195 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.620e+02 2.020e+02 2.308e+02 2.637e+02 3.987e+02, threshold=4.617e+02, percent-clipped=1.0 2023-09-30 15:36:04,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13757_210_3528_1_1530440682_4107239_48-106347-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:05,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_164-224052-0_sp1.1 from training. Duration: 0.636375 2023-09-30 15:36:05,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_408-322648-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:05,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_827-36359-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:07,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _4680_210_3525_1_1527249757953_893386_75-78979-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:08,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_383-165234-0 from training. Duration: 0.94 2023-09-30 15:36:10,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward2.hidden_balancer.prob, batch_count=757160.0, ans=0.125 2023-09-30 15:36:11,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_94-229927-0 from training. Duration: 0.97 2023-09-30 15:36:11,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6788_210_1388_1_1527898698_4562021_180-75288-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:36:11,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_54-321349-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:17,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _56108_210_16380_1_1534505446159_3887959_265-161493-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:18,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_395-111602-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:36:18,609 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:18,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_381-116041-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:36:20,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_65-104124-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:36:22,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _6536_210_1883_1_1527850783339_3319069_169-23811-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:22,645 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=757226.6666666666, ans=0.0 2023-09-30 15:36:23,846 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:36:23,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_226-242400-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:36:24,042 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_276-216330-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:24,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=757226.6666666666, ans=0.125 2023-09-30 15:36:27,097 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_198-174328-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:36:27,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_338-96520-0 from training. Duration: 0.94 2023-09-30 15:36:27,388 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=757226.6666666666, ans=0.0 2023-09-30 15:36:33,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:36:33,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _36692_210_5665_1_1533608967420_398809_8-16261-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_86-212657-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:38,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_626-298884-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:36:41,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _49755_210_18053_1_1534581005846_3218709_120-257918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:43,592 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_400-113821-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:43,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_387-236773-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:45,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:36:45,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:36:50,066 INFO [train.py:1039] (3/4) Epoch 22, batch 2050, loss[loss=0.1631, simple_loss=0.2272, pruned_loss=0.04949, over 23489.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.251, pruned_loss=0.0494, over 4722422.04 frames. ], batch size: 285, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:36:50,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_175-17485-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:36:51,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1004-183422-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:36:55,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_32-65962-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:36:55,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_40-298664-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:00,724 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=17.41 vs. limit=22.5 2023-09-30 15:37:01,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_380-261863-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:37:03,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:37:03,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_57-82205-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:37:04,601 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:06,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0 from training. Duration: 0.97 2023-09-30 15:37:06,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_286-251333-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:37:06,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_518-80679-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:07,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:37:08,134 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff3_skip_rate, batch_count=757426.6666666666, ans=0.0 2023-09-30 15:37:17,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:17,621 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_154-31455-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:19,836 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0 from training. Duration: 0.72 2023-09-30 15:37:22,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_674-204247-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:37:23,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0 from training. Duration: 0.82 2023-09-30 15:37:23,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _4779_210_2084_1_1527318022944_3914893_500-285013-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:37:26,947 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.conv_module2.whiten, num_groups=1, num_channels=192, metric=9.38 vs. limit=15.0 2023-09-30 15:37:27,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _14421_210_8750_1_1530604795825_3542290_190-313578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:30,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_538-273574-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:32,104 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:37:32,164 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_468-143126-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:37:35,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_258-121681-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:37:36,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:37:36,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:37:39,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_297-86890-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:37:41,448 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_55-301844-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:37:43,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:37:45,110 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_55-19758-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:37:48,382 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:37:55,012 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:37:56,667 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=757626.6666666666, ans=0.0 2023-09-30 15:37:58,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0 from training. Duration: 0.86 2023-09-30 15:37:59,537 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=757626.6666666666, ans=0.125 2023-09-30 15:38:01,741 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.77 vs. limit=6.0 2023-09-30 15:38:02,620 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=757626.6666666666, ans=0.0 2023-09-30 15:38:03,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_230-317241-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:05,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_125-203105-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:38:07,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=757626.6666666666, ans=0.1 2023-09-30 15:38:08,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:38:08,628 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_18-174647-0 from training. Duration: 0.91 2023-09-30 15:38:11,776 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0165W0005-81726 from training. Duration: 0.952 2023-09-30 15:38:11,777 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_31583_210_3852_1_1532595851_4024413_148-171847-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:11,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_116-138769-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:13,224 INFO [train.py:1039] (3/4) Epoch 22, batch 2100, loss[loss=0.1757, simple_loss=0.2555, pruned_loss=0.04796, over 23357.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2494, pruned_loss=0.04845, over 4721080.69 frames. ], batch size: 94, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:38:13,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_489-292631-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:13,462 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_202-27960-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:38:13,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0 from training. Duration: 0.55 2023-09-30 15:38:14,920 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0 from training. Duration: 0.96 2023-09-30 15:38:17,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:38:21,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:38:21,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_234-326497-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:38:24,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_301-77873-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:24,539 WARNING [train.py:1197] (3/4) Exclude cut with ID _45297_210_2721_1_1533715351381_3264949_152-154697-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:38:24,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3371_210_1794_1_1526384462_5580777_396-339812-0 from training. Duration: 0.85 2023-09-30 15:38:26,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:38:28,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0 from training. Duration: 0.76 2023-09-30 15:38:28,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0 from training. Duration: 0.88 2023-09-30 15:38:28,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=757760.0, ans=0.0 2023-09-30 15:38:31,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_519-343462-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:38:31,212 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:38:31,223 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0 from training. Duration: 0.78 2023-09-30 15:38:31,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 15:38:38,289 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_113-157651-0 from training. Duration: 0.68 2023-09-30 15:38:38,291 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:38:38,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=757760.0, ans=0.125 2023-09-30 15:38:41,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_195-64967-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:38:41,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_365-215065-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:38:42,338 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=9.51 vs. limit=15.0 2023-09-30 15:38:44,991 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module1.balancer2.prob, batch_count=757826.6666666666, ans=0.125 2023-09-30 15:38:46,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:38:46,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0 from training. Duration: 0.69 2023-09-30 15:38:47,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _30918_210_6400_1_1532754078600_3125920_407-223394-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:38:47,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0_sp0.9 from training. Duration: 0.5333125 2023-09-30 15:38:49,197 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.528e+02 1.822e+02 2.007e+02 2.255e+02 3.053e+02, threshold=4.015e+02, percent-clipped=0.0 2023-09-30 15:38:49,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0 from training. Duration: 0.87 2023-09-30 15:38:49,483 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_222-131118-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:38:49,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0 from training. Duration: 0.84 2023-09-30 15:38:50,932 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0 from training. Duration: 0.96 2023-09-30 15:38:50,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0 from training. Duration: 0.94 2023-09-30 15:38:52,648 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:38:52,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=757826.6666666666, ans=0.0 2023-09-30 15:38:54,364 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=757826.6666666666, ans=0.0 2023-09-30 15:38:56,290 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:38:57,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:38:59,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 15:39:01,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1166-90654-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:01,833 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass_mid.scale_min, batch_count=757893.3333333334, ans=0.2 2023-09-30 15:39:05,046 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_218-255858-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0 from training. Duration: 0.84 2023-09-30 15:39:05,071 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_205-286261-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:05,095 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_6-154119-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:05,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _22982_210_1883_1_1531882790447_5449409_565-199393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:06,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0 from training. Duration: 0.76 2023-09-30 15:39:08,906 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0 from training. Duration: 0.53 2023-09-30 15:39:09,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0 from training. Duration: 0.8 2023-09-30 15:39:13,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:39:15,250 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=757893.3333333334, ans=0.07 2023-09-30 15:39:15,808 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=4.58 vs. limit=15.0 2023-09-30 15:39:18,011 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:39:18,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0 from training. Duration: 0.62 2023-09-30 15:39:22,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_528-88083-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:24,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_32-31837-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:39:25,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_744-86168-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:39:25,946 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:39:25,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_124-345840-0_sp0.9 from training. Duration: 0.57775 2023-09-30 15:39:26,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:39:27,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_469-166615-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:39:27,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:39:31,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:39:31,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_524-188242-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:32,812 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_938-205135-0 from training. Duration: 0.87 2023-09-30 15:39:34,444 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0 from training. Duration: 0.67 2023-09-30 15:39:34,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_445-269521-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:36,523 INFO [train.py:1039] (3/4) Epoch 22, batch 2150, loss[loss=0.1599, simple_loss=0.2414, pruned_loss=0.03917, over 24269.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.2488, pruned_loss=0.04824, over 4713067.84 frames. ], batch size: 56, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:39:36,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_3-33116-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:39:36,735 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4633_210_2040_1_1526824163_4145343_416-76117-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:39:36,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:39:38,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:39:44,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_321-215742-0_sp0.9 from training. Duration: 0.5555625 2023-09-30 15:39:45,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_272-190950-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:39:47,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_31-299371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:49,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:39:49,041 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_887-9547-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:50,469 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:39:53,666 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_258-250196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:39:53,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1186-72702-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:39:53,780 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14831_210_7927_1_1530700200_5583610_181-27467-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:39:56,906 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_278-132109-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:39:58,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0 from training. Duration: 0.99 2023-09-30 15:40:01,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_313-265903-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:05,183 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_105-114797-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:40:07,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _53755_210_8254_1_1534577312941_4707360_9-65843-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_361-224549-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:07,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_403-310975-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:07,331 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:40:08,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_464-154904-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:08,735 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:40:10,248 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37839_210_7234_1_1533542462_3689303_158-181212-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:40:10,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_398-320049-0 from training. Duration: 0.88 2023-09-30 15:40:12,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_718-250907-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:40:14,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_641-126395-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:14,694 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_166-241493-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:16,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:40:16,368 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_439-275083-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:40:19,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66907_210_21581_1_1535615661_3590177_493-229032-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:40:20,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:40:22,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_273-226195-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:40:22,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_404-75889-0 from training. Duration: 0.89 2023-09-30 15:40:22,346 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0_sp0.9 from training. Duration: 0.87775 2023-09-30 15:40:24,044 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_156-339685-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:25,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_425-231927-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:25,752 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.attention_skip_rate, batch_count=758226.6666666666, ans=0.0 2023-09-30 15:40:26,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_102-288670-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:40:28,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:40:28,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_86-1699-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:30,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _2891_210_2550_1_1525874398521_4427710_327-121590-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:30,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0 from training. Duration: 0.71 2023-09-30 15:40:31,746 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer1.prob, batch_count=758226.6666666666, ans=0.125 2023-09-30 15:40:32,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_762-109886-0 from training. Duration: 0.91 2023-09-30 15:40:33,010 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_477-255782-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:40:34,543 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0061-330906 from training. Duration: 0.768 2023-09-30 15:40:34,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_347-213071-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:34,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:40:36,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0 from training. Duration: 0.64 2023-09-30 15:40:36,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:40:36,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0 from training. Duration: 0.98 2023-09-30 15:40:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0064-335871 from training. Duration: 0.768 2023-09-30 15:40:37,574 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0079-335886 from training. Duration: 0.896 2023-09-30 15:40:37,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0 from training. Duration: 0.61 2023-09-30 15:40:37,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=758226.6666666666, ans=0.0 2023-09-30 15:40:39,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_411-323967-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:39,394 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_350-231748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:40:39,411 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:40:39,619 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=758226.6666666666, ans=0.125 2023-09-30 15:40:41,524 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_314-144055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:42,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 15:40:44,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _23038_210_9570_1_1532001584158_3652419_82-334733-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:40:44,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_225-22526-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:40:52,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:40:52,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0 from training. Duration: 0.83 2023-09-30 15:40:58,098 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_258-211605-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:40:58,448 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=758360.0, ans=0.125 2023-09-30 15:40:59,527 INFO [train.py:1039] (3/4) Epoch 22, batch 2200, loss[loss=0.1916, simple_loss=0.2746, pruned_loss=0.05428, over 23966.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2497, pruned_loss=0.04865, over 4702490.39 frames. ], batch size: 86, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:41:01,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_550-335198-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:01,498 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.attention_skip_rate, batch_count=758360.0, ans=0.0 2023-09-30 15:41:02,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_98-741-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:41:02,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_251-238988-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:04,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0_sp0.9 from training. Duration: 0.82225 2023-09-30 15:41:07,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1060-180633-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:41:08,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _8755_210_1876_1_1528628559357_3258209_211-37473-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:41:08,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_528-206771-0 from training. Duration: 0.88 2023-09-30 15:41:14,219 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_64-248359-0 from training. Duration: 0.69 2023-09-30 15:41:17,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_402-253027-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:41:24,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_625-102955-0 from training. Duration: 0.77 2023-09-30 15:41:26,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_611-219324-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:26,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:27,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:41:30,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5960_210_1794_1_1527403865_3990999_515-2613-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:41:31,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_342-265572-0 from training. Duration: 0.94 2023-09-30 15:41:34,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:41:34,500 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.bypass.scale_min, batch_count=758493.3333333334, ans=0.2 2023-09-30 15:41:36,961 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.488e+02 1.820e+02 1.960e+02 2.258e+02 2.788e+02, threshold=3.920e+02, percent-clipped=0.0 2023-09-30 15:41:37,084 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15396_210_6890_1_1531308473_1938936_213-312506-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:41:38,513 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_521-214276-0_sp1.1 from training. Duration: 0.4 2023-09-30 15:41:41,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _15026_210_8750_1_1530777602316_3291850_211-195043-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:41:42,000 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=758493.3333333334, ans=0.125 2023-09-30 15:41:43,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_108-257494-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:45,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_403-58151-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:41:46,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_109-177787-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:48,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_498-40153-0 from training. Duration: 0.63 2023-09-30 15:41:50,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _56447_210_1389_1_1535074245910_3697189_161-334523-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:53,074 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_238-245655-0 from training. Duration: 0.81 2023-09-30 15:41:56,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_108-43865-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:56,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0_sp1.1 from training. Duration: 0.663625 2023-09-30 15:41:56,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_594-132160-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:41:59,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:41:59,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_137-100595-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:41:59,629 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_12-271669-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:41:59,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_206-80421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:42:01,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_113-199933-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:42:01,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _55440_210_12651_1_1534496713364_2980520_31-59289-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:42:02,054 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=4.82 vs. limit=12.0 2023-09-30 15:42:04,367 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:42:04,572 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.ff2_skip_rate, batch_count=758626.6666666666, ans=0.0 2023-09-30 15:42:07,404 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 15:42:08,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_324-94843-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:10,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:42:12,037 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0006-501141 from training. Duration: 0.896 2023-09-30 15:42:13,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:42:15,390 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0008-506301 from training. Duration: 0.896 2023-09-30 15:42:15,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_60-191556-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:42:16,882 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0331-509721 from training. Duration: 0.896 2023-09-30 15:42:18,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _35823_210_6917_1_1533187881914_7297830_567-108596-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:19,857 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528544010_1110402_25-183860-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:42:20,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_495-260035-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:42:20,282 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=758693.3333333334, ans=0.0 2023-09-30 15:42:21,877 INFO [train.py:1039] (3/4) Epoch 22, batch 2250, loss[loss=0.1838, simple_loss=0.2737, pruned_loss=0.04697, over 24370.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.25, pruned_loss=0.04847, over 4695180.38 frames. ], batch size: 77, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:42:22,051 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9051W0015-520281 from training. Duration: 0.9045625 2023-09-30 15:42:25,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_707-66193-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:42:27,020 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:34,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:42:35,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:42:38,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_317-179212-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:39,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:40,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:42:42,304 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0 from training. Duration: 0.93 2023-09-30 15:42:42,321 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_344-179642-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:42:42,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:42:43,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0 from training. Duration: 0.71 2023-09-30 15:42:45,536 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_78-116075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:42:45,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _64086_210_7994_1_1535270417019_7435949_52-235756-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:42:47,177 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 15:42:51,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_287-27950-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:42:51,840 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward2.hidden_balancer.prob, batch_count=758760.0, ans=0.125 2023-09-30 15:42:53,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 15:42:55,074 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:42:56,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_220-191080-0 from training. Duration: 0.98 2023-09-30 15:42:58,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1081-137641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:43:03,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_278-235459-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:43:07,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1181-77470-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:09,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _60860_210_8254_1_1534896015875_3557169_198-5531-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:43:10,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_17-289781-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:43:10,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_146-199752-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:43:13,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_969-213354-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:43:15,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _70519_210_4163_1_1535961595994_7458230_66-344156-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:43:19,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:43:21,208 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp0.9 from training. Duration: 0.7 2023-09-30 15:43:25,827 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_51-1049-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:43:25,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:43:27,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:43:29,663 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=758960.0, ans=10.0 2023-09-30 15:43:33,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:43:35,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:43:35,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0 from training. Duration: 0.98 2023-09-30 15:43:35,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_97-266746-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:37,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:43:41,115 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0 from training. Duration: 0.78 2023-09-30 15:43:44,066 INFO [train.py:1039] (3/4) Epoch 22, batch 2300, loss[loss=0.1703, simple_loss=0.2573, pruned_loss=0.04165, over 24563.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2502, pruned_loss=0.04869, over 4702219.26 frames. ], batch size: 71, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:43:44,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:43:44,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_498-186993-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_54-77610-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:43:50,574 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:43:53,675 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0193-675471 from training. Duration: 0.9656875 2023-09-30 15:43:55,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_243-240536-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:02,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_32360_210_4198_1_1532689160_3648436_60-51908-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:44:03,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:44:03,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_392-173500-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:05,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_71-329150-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:05,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_866-173440-0 from training. Duration: 0.9 2023-09-30 15:44:05,178 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:44:08,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:08,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_356-45054-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:44:10,390 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.44 vs. limit=12.0 2023-09-30 15:44:11,863 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:44:14,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:44:18,766 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_667-145945-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:19,136 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=759160.0, ans=0.125 2023-09-30 15:44:21,556 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.502e+02 1.881e+02 2.122e+02 2.530e+02 4.417e+02, threshold=4.245e+02, percent-clipped=2.0 2023-09-30 15:44:23,526 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.scale_min, batch_count=759160.0, ans=0.2 2023-09-30 15:44:24,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:44:24,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_416-333249-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:44:27,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:44:31,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_965-176824-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:44:35,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:44:35,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:44:37,921 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:44:37,953 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0 from training. Duration: 0.47 2023-09-30 15:44:38,273 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=759226.6666666666, ans=0.125 2023-09-30 15:44:38,293 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.balancer.prob, batch_count=759226.6666666666, ans=0.125 2023-09-30 15:44:38,324 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=759226.6666666666, ans=0.1 2023-09-30 15:44:42,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:44:42,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _49748_210_24116_1_1533986971737_3158910_263-52113-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:43,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_340-24580-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:44:44,015 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_479-114941-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:44:45,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_506-119659-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:44:45,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_314-126819-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 15:44:45,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp0.9 from training. Duration: 0.9 2023-09-30 15:44:47,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0 from training. Duration: 0.97 2023-09-30 15:44:47,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_791-45149-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:44:47,127 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534227054_1261425_10-118295-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:44:47,410 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=759293.3333333334, ans=10.0 2023-09-30 15:44:49,240 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0 from training. Duration: 0.41 2023-09-30 15:44:55,398 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34726_210_4525_1_1533019474_5030395_687-291305-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:44:57,266 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.balancer.prob, batch_count=759293.3333333334, ans=0.125 2023-09-30 15:44:59,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_264-324537-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:45:02,057 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=16.30 vs. limit=22.5 2023-09-30 15:45:04,678 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_591-46386-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:04,733 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_86-19601-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:45:04,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp0.9 from training. Duration: 0.77775 2023-09-30 15:45:06,188 INFO [train.py:1039] (3/4) Epoch 22, batch 2350, loss[loss=0.1941, simple_loss=0.2589, pruned_loss=0.06467, over 23771.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2511, pruned_loss=0.04894, over 4713291.23 frames. ], batch size: 212, lr: 4.68e-03, grad_scale: 8.0 2023-09-30 15:45:06,905 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module1.whiten, num_groups=1, num_channels=256, metric=6.75 vs. limit=15.0 2023-09-30 15:45:07,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _8696_210_1876_1_1528545632440_3345058_209-54069-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:45:07,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_369-325184-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:07,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _14687_210_5854_1_1530703838259_3664889_115-239046-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:45:09,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_168-50337-0 from training. Duration: 0.84 2023-09-30 15:45:14,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:45:14,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_597-154640-0 from training. Duration: 0.9 2023-09-30 15:45:15,083 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=759360.0, ans=0.125 2023-09-30 15:45:22,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0 from training. Duration: 0.98 2023-09-30 15:45:25,973 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_344-269786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:45:29,480 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_322-253154-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_221-15823-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:45:29,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_9-21737-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:29,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_522-341588-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:31,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_363-185703-0 from training. Duration: 0.95 2023-09-30 15:45:32,913 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer1.min_positive, batch_count=759426.6666666666, ans=0.025 2023-09-30 15:45:34,177 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_265-76216-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:45:38,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0 from training. Duration: 0.58 2023-09-30 15:45:40,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_97-335084-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:45:41,279 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=3.14 vs. limit=6.0 2023-09-30 15:45:43,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_690-126306-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:45:43,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_102-92751-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:45:46,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:45:48,588 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0 from training. Duration: 0.82 2023-09-30 15:45:48,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:45:51,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_194-146486-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:45:51,687 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_171-277668-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:45:51,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _34423_210_8750_1_1533430798456_3330199_251-332788-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:45:54,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:45:56,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0 from training. Duration: 0.74 2023-09-30 15:45:58,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:46:02,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_139-337989-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:46:03,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_460-263670-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:46:05,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_186-309161-0 from training. Duration: 0.54 2023-09-30 15:46:05,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_87-278686-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:46:08,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_314-106499-0 from training. Duration: 0.92 2023-09-30 15:46:08,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_412-287526-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:46:13,470 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0 from training. Duration: 0.86 2023-09-30 15:46:17,981 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_381-317791-0 from training. Duration: 0.73 2023-09-30 15:46:19,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_560-310411-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:46:19,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp0.9 from training. Duration: 0.611125 2023-09-30 15:46:19,954 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0473W0002-918951 from training. Duration: 0.924 2023-09-30 15:46:21,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1083-220122-0 from training. Duration: 0.51 2023-09-30 15:46:22,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0 from training. Duration: 0.78 2023-09-30 15:46:26,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_331-19420-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:46:27,431 INFO [train.py:1039] (3/4) Epoch 22, batch 2400, loss[loss=0.1819, simple_loss=0.261, pruned_loss=0.05141, over 23314.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2509, pruned_loss=0.04878, over 4711076.59 frames. ], batch size: 105, lr: 4.68e-03, grad_scale: 16.0 2023-09-30 15:46:30,805 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:46:34,927 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_32-185135-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:46:35,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_46-93547-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:46:37,099 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7931_210_1794_1_1528594728_5564059_280-279560-0 from training. Duration: 0.98 2023-09-30 15:46:37,188 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_937-208281-0 from training. Duration: 0.85 2023-09-30 15:46:37,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module2.balancer2.prob, batch_count=759693.3333333334, ans=0.125 2023-09-30 15:46:40,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass_mid.scale_min, batch_count=759693.3333333334, ans=0.2 2023-09-30 15:46:43,795 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer1.prob, batch_count=759760.0, ans=0.125 2023-09-30 15:46:44,928 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14928_210_5632_1_1530773461_4714000_454-112198-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:46:44,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_944-153333-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:46:45,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=759760.0, ans=0.125 2023-09-30 15:46:46,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0 from training. Duration: 0.83 2023-09-30 15:46:47,965 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:46:49,464 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7837_210_1792_1_1528453445_5841305_654-54195-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:49,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0 from training. Duration: 0.53 2023-09-30 15:46:51,385 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=759760.0, ans=0.125 2023-09-30 15:46:56,271 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=759760.0, ans=0.1 2023-09-30 15:46:57,376 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30167_210_3528_1_1532428012_4772375_480-142230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:46:59,041 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_160-218721-0 from training. Duration: 0.96 2023-09-30 15:47:02,329 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_65-88242-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:47:04,729 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module1.balancer2.prob, batch_count=759826.6666666666, ans=0.125 2023-09-30 15:47:05,701 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.547e+02 1.822e+02 2.005e+02 2.211e+02 3.199e+02, threshold=4.010e+02, percent-clipped=0.0 2023-09-30 15:47:08,019 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0 from training. Duration: 0.87 2023-09-30 15:47:11,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_188-38388-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:12,219 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=256, metric=4.72 vs. limit=15.0 2023-09-30 15:47:13,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_81-289335-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:16,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_110-8778-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:17,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0 from training. Duration: 0.73 2023-09-30 15:47:19,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_453-133983-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 15:47:24,279 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8054_210_7927_1_1528627721_4984094_553-17379-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:27,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_341-171072-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:47:30,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_265-257286-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:47:32,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:47:32,413 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_464-33931-0_sp0.9 from training. Duration: 0.6 2023-09-30 15:47:32,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _60804_210_9642_1_1534831199859_7268299_632-143863-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:47:32,487 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_431-310997-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:33,991 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_114-263860-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:47:34,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 15:47:35,803 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=759960.0, ans=0.125 2023-09-30 15:47:38,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_338-325870-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:47:40,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 15:47:40,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_259-34038-0 from training. Duration: 0.74 2023-09-30 15:47:41,762 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff3_skip_rate, batch_count=759960.0, ans=0.0 2023-09-30 15:47:42,853 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_388-129552-0 from training. Duration: 0.96 2023-09-30 15:47:44,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_342-251455-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:47:44,758 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_8-191390-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:47:46,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_613-232856-0 from training. Duration: 0.54 2023-09-30 15:47:46,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_557-349534-0 from training. Duration: 0.91 2023-09-30 15:47:47,721 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0 from training. Duration: 0.85 2023-09-30 15:47:47,729 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0001-61807 from training. Duration: 0.966 2023-09-30 15:47:47,892 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0 from training. Duration: 0.57 2023-09-30 15:47:48,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.out_combiner.scale_min, batch_count=759960.0, ans=0.2 2023-09-30 15:47:49,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1069-24743-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:47:50,804 INFO [train.py:1039] (3/4) Epoch 22, batch 2450, loss[loss=0.1569, simple_loss=0.2453, pruned_loss=0.0342, over 24490.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2503, pruned_loss=0.04838, over 4717107.83 frames. ], batch size: 66, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:47:50,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_323-172873-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:50,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_802-226709-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:47:52,413 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0500-71767 from training. Duration: 0.931875 2023-09-30 15:47:52,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533605310265_2115212_233-129699-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:47:52,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:47:57,253 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:47:57,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_34-26343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:00,982 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_512-256238-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:00,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_196-55502-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:02,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0 from training. Duration: 0.75 2023-09-30 15:48:05,075 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=256, metric=12.79 vs. limit=22.5 2023-09-30 15:48:07,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_345-230416-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:07,575 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=760093.3333333334, ans=0.125 2023-09-30 15:48:08,617 WARNING [train.py:1197] (3/4) Exclude cut with ID _51078_210_9070_1_1534161557424_3805250_225-327809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:10,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_18-189198-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:48:10,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:48:10,720 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=760093.3333333334, ans=0.125 2023-09-30 15:48:11,955 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_726-270780-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:48:12,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_654-52713-0 from training. Duration: 0.77 2023-09-30 15:48:17,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_191-16006-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:20,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_66-260583-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:48:20,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_63-129006-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:48:23,635 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_393-57888-0_sp0.9 from training. Duration: 0.711125 2023-09-30 15:48:23,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_857-298942-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:25,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_424-227947-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:48:28,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_95-1896-0 from training. Duration: 0.98 2023-09-30 15:48:29,741 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_96-244299-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:48:34,680 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:48:38,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_562-319825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:39,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_284-173474-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:48:41,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_497-100133-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:41,272 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:48:41,369 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_116-303447-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:48:42,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_44-79427-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:48:43,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0 from training. Duration: 0.77 2023-09-30 15:48:46,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 15:48:48,304 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14058_210_4163_1_1530690953_6327180_49-210687-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:48:50,935 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=760226.6666666666, ans=0.125 2023-09-30 15:48:52,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_351-127903-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:48:53,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_184-206939-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:48:58,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:48:58,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0 from training. Duration: 0.92 2023-09-30 15:48:58,320 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:48:59,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_106-43145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:48:59,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0 from training. Duration: 0.87 2023-09-30 15:49:01,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_801-102362-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:01,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_408-40740-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:49:06,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_448-212135-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:49:08,343 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=760293.3333333334, ans=0.0 2023-09-30 15:49:10,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_320-124047-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:10,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:49:13,148 INFO [train.py:1039] (3/4) Epoch 22, batch 2500, loss[loss=0.1764, simple_loss=0.2582, pruned_loss=0.04724, over 23351.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.249, pruned_loss=0.04815, over 4700583.34 frames. ], batch size: 93, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:49:13,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0 from training. Duration: 0.83 2023-09-30 15:49:14,896 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_335-163562-0_sp1.1 from training. Duration: 0.77275 2023-09-30 15:49:21,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_171-275004-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:29,036 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=760426.6666666666, ans=0.2 2023-09-30 15:49:31,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:49:31,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_164-118421-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:49:33,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_380-24170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:49:33,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp1.1 from training. Duration: 0.3454375 2023-09-30 15:49:40,935 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:49:42,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_85-326916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:49:42,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:49:42,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 15:49:44,702 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_403-39619-0 from training. Duration: 0.84 2023-09-30 15:49:44,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_427-87841-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:46,268 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_286-68181-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:46,336 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0 from training. Duration: 0.83 2023-09-30 15:49:46,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_106-203713-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:49:47,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0 from training. Duration: 0.95 2023-09-30 15:49:47,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_586-93054-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:49:50,873 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.545e+02 1.767e+02 1.934e+02 2.176e+02 2.965e+02, threshold=3.869e+02, percent-clipped=0.0 2023-09-30 15:49:52,728 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_262-10571-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:49:53,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_432-67885-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:49:56,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:49:56,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0 from training. Duration: 0.86 2023-09-30 15:49:56,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _69051_210_1531_1_1535885566901_3829538_332-92090-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:49:58,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_104-324125-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:03,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41764_210_2521_1_1533686110_4089313_501-2119-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:05,789 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.out_whiten.whitening_limit, batch_count=760560.0, ans=15.0 2023-09-30 15:50:08,146 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_56-100988-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:09,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:14,432 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_74-48717-0_sp1.1 from training. Duration: 0.52725 2023-09-30 15:50:16,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_177-49092-0 from training. Duration: 0.91 2023-09-30 15:50:18,083 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_44-176353-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:18,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_110-130961-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:19,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_37-189262-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 15:50:19,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3172_210_2087_1_1526292929_3987865_197-97762-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 15:50:21,247 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0065-334897 from training. Duration: 0.896 2023-09-30 15:50:21,248 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0677W0080-334912 from training. Duration: 0.768 2023-09-30 15:50:21,267 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0 from training. Duration: 0.7 2023-09-30 15:50:24,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_460-205287-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:25,900 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_102-204477-0 from training. Duration: 0.97 2023-09-30 15:50:26,030 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_skip_rate, batch_count=760626.6666666666, ans=0.0 2023-09-30 15:50:27,950 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34815_210_6388_1_1533381320_3571903_279-305565-0 from training. Duration: 0.92 2023-09-30 15:50:28,075 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:50:30,141 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0 from training. Duration: 0.76 2023-09-30 15:50:33,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_493-102745-0 from training. Duration: 0.98 2023-09-30 15:50:36,289 INFO [train.py:1039] (3/4) Epoch 22, batch 2550, loss[loss=0.168, simple_loss=0.2431, pruned_loss=0.04648, over 20577.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2488, pruned_loss=0.0479, over 4708701.31 frames. ], batch size: 45, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:50:36,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _61133_210_9969_1_1535196777227_3696409_40-93442-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:37,907 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_45-258852-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:50:37,989 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_130-36076-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:50:41,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8694_210_6400_1_1529065754_3260756_332-13553-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:50:41,119 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0 from training. Duration: 0.51 2023-09-30 15:50:42,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_927-43827-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:50:45,770 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_397-147196-0 from training. Duration: 0.8 2023-09-30 15:50:47,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:50:48,900 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47438_210_5399_1_1533797471_4513356_626-127722-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:52,572 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_256-323883-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:50:52,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_839-173017-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 15:50:53,997 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:50:54,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_119-224384-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:50:54,135 WARNING [train.py:1197] (3/4) Exclude cut with ID _57523_210_1856_1_1534766294545_3732989_262-267933-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:50:57,139 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35361_210_4238_1_1533262810_7793476_675-87012-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:50:57,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0 from training. Duration: 0.94 2023-09-30 15:50:58,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_535-14410-0_sp1.1 from training. Duration: 0.42725 2023-09-30 15:50:58,613 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24673_210_6400_1_1531968633_778659_94-154511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:50:58,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_212-10501-0 from training. Duration: 0.94 2023-09-30 15:51:10,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 15:51:16,866 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_238-47020-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:16,881 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21500_210_2054_1_1532149310_2716758_278-18053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:16,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_453-9626-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:51:17,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 15:51:23,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_431-120895-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:51:26,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 15:51:26,922 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:51:26,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 15:51:28,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp1.1 from training. Duration: 0.536375 2023-09-30 15:51:28,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_153-97441-0_sp0.9 from training. Duration: 0.62225 2023-09-30 15:51:31,477 WARNING [train.py:1197] (3/4) Exclude cut with ID _12733_210_8341_1_1530167424556_3646380_523-85621-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:51:31,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _64048_210_10626_1_1535198382802_3835179_352-179834-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:38,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_43-242364-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:51:38,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0 from training. Duration: 0.93 2023-09-30 15:51:38,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_325-237800-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:51:40,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_385-171459-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:51:41,913 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp1.1 from training. Duration: 0.62725 2023-09-30 15:51:43,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:51:45,079 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_309-33162-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:50,036 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:51:51,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1185-22038-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:51:51,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward2.out_whiten, num_groups=1, num_channels=384, metric=8.93 vs. limit=15.0 2023-09-30 15:51:54,230 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1197-69460-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:51:57,752 INFO [train.py:1039] (3/4) Epoch 22, batch 2600, loss[loss=0.1712, simple_loss=0.2612, pruned_loss=0.0406, over 24548.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2499, pruned_loss=0.04819, over 4714555.30 frames. ], batch size: 71, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:51:57,841 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0022-501157 from training. Duration: 0.896 2023-09-30 15:51:59,471 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9023W0009-506302 from training. Duration: 0.896 2023-09-30 15:51:59,502 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2821_210_2715_1_1526108385_3763560_73-296673-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:51:59,558 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9025W0113-507442 from training. Duration: 0.896 2023-09-30 15:51:59,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0 from training. Duration: 0.92 2023-09-30 15:52:01,062 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0332-509722 from training. Duration: 0.768 2023-09-30 15:52:02,725 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_68-101953-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:52:02,764 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9042W0011-515617 from training. Duration: 0.768 2023-09-30 15:52:03,314 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=11.37 vs. limit=15.0 2023-09-30 15:52:04,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0 from training. Duration: 0.97 2023-09-30 15:52:05,864 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9052W0006-520792 from training. Duration: 0.896 2023-09-30 15:52:10,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:52:12,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0 from training. Duration: 0.82 2023-09-30 15:52:14,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0 from training. Duration: 0.77 2023-09-30 15:52:15,887 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_246-125000-0_sp0.9 from training. Duration: 0.688875 2023-09-30 15:52:15,948 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_243-42116-0 from training. Duration: 0.73 2023-09-30 15:52:18,902 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0121-538492 from training. Duration: 0.896 2023-09-30 15:52:18,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0 from training. Duration: 0.55 2023-09-30 15:52:26,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_850-304621-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:26,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_154-285415-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:28,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_538-278194-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:28,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0 from training. Duration: 0.71 2023-09-30 15:52:29,755 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0_sp1.1 from training. Duration: 0.863625 2023-09-30 15:52:34,443 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.586e+02 1.852e+02 2.092e+02 2.401e+02 4.337e+02, threshold=4.185e+02, percent-clipped=2.0 2023-09-30 15:52:37,597 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9147W0019-567847 from training. Duration: 0.768 2023-09-30 15:52:43,792 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_228-189439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:52:45,342 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_196-77172-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:52:45,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_625-338546-0 from training. Duration: 0.88 2023-09-30 15:52:47,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_193-327630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:52:47,502 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_391-24006-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:52:47,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0 from training. Duration: 0.8 2023-09-30 15:52:49,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8395_210_1531_1_1528597871523_4411579_431-132470-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:52:51,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_465-248326-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:52:51,460 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.balancer2.prob, batch_count=761226.6666666666, ans=0.125 2023-09-30 15:52:52,804 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_101-141922-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:57,208 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9212W0005-600172 from training. Duration: 0.896 2023-09-30 15:52:57,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _63186_210_2084_1_1535414479705_3908649_378-166820-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:52:58,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_52-211683-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 15:53:04,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1147-237729-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:53:05,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp0.9 from training. Duration: 0.911125 2023-09-30 15:53:05,021 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_454-276429-0 from training. Duration: 0.87 2023-09-30 15:53:06,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _53713_210_24116_1_1534314487762_7416751_755-165313-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:53:08,674 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_436-317072-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:09,228 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.42 vs. limit=15.0 2023-09-30 15:53:10,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:13,767 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=761293.3333333334, ans=0.2 2023-09-30 15:53:16,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_326-165900-0 from training. Duration: 0.69 2023-09-30 15:53:17,594 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=12.73 vs. limit=15.0 2023-09-30 15:53:18,065 INFO [train.py:1039] (3/4) Epoch 22, batch 2650, loss[loss=0.188, simple_loss=0.2626, pruned_loss=0.05673, over 23323.00 frames. ], tot_loss[loss=0.1748, simple_loss=0.2516, pruned_loss=0.04903, over 4716410.13 frames. ], batch size: 119, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:53:18,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_96-30246-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:20,563 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:53:24,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0 from training. Duration: 0.84 2023-09-30 15:53:24,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_54-112623-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:24,715 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass_mid.scale_min, batch_count=761360.0, ans=0.2 2023-09-30 15:53:26,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_328-264776-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 15:53:27,947 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9325W0001-648427 from training. Duration: 0.768 2023-09-30 15:53:27,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _56480_210_4220_1_1534679942079_3075409_49-74717-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:53:29,534 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_6-241869-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:53:31,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 15:53:32,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_17-90541-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:53:35,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_525-89447-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:53:36,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0 from training. Duration: 0.92 2023-09-30 15:53:36,086 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:53:37,481 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_179-45206-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:53:39,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff3_skip_rate, batch_count=761426.6666666666, ans=0.0 2023-09-30 15:53:40,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _40137_210_12210_1_1533290484046_3795180_249-187059-0 from training. Duration: 0.88 2023-09-30 15:53:42,578 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9653W0194-675472 from training. Duration: 0.9658125 2023-09-30 15:53:44,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _51332_210_9988_1_1534244371836_3801270_336-255604-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:53:48,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0 from training. Duration: 0.87 2023-09-30 15:53:49,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1167-20437-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:53:49,986 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0 from training. Duration: 0.76 2023-09-30 15:53:55,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_470-172418-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:55,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_581-315411-0_sp1.1 from training. Duration: 0.563625 2023-09-30 15:53:55,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34450_210_9547_1_1533099443_4109050_519-342439-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:53:56,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_50-175961-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:53:59,865 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0 from training. Duration: 0.44 2023-09-30 15:53:59,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0 from training. Duration: 0.76 2023-09-30 15:54:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:09,623 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0 from training. Duration: 0.8 2023-09-30 15:54:09,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_466-252727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:09,783 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15972_210_6400_1_1530877934_3405692_316-35478-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:11,222 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_809-17156-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:11,296 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_594-263474-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:11,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_190-228204-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:12,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_117-21193-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:54:14,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_190-21315-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:14,705 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_184-302050-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:54:16,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:54:18,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:54:19,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_79-46864-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:19,932 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_358-215558-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:54:21,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_621-66764-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:22,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_338-156851-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:54:22,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_392-261988-0_sp0.9 from training. Duration: 0.72225 2023-09-30 15:54:26,158 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_409-84682-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:27,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_30-326681-0_sp0.9 from training. Duration: 0.92225 2023-09-30 15:54:27,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_389-184695-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:29,681 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_442-320317-0 from training. Duration: 0.82 2023-09-30 15:54:31,451 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_756-29911-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:54:34,856 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_401-112036-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:35,300 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=761626.6666666666, ans=0.0 2023-09-30 15:54:36,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_181-308827-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:54:36,758 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=761626.6666666666, ans=0.125 2023-09-30 15:54:37,154 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=512, metric=2.72 vs. limit=15.0 2023-09-30 15:54:37,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_332-258725-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:39,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp0.9 from training. Duration: 0.811125 2023-09-30 15:54:39,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _49039_210_6315_1_1533967537541_3846118_99-302239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:41,029 INFO [train.py:1039] (3/4) Epoch 22, batch 2700, loss[loss=0.16, simple_loss=0.2432, pruned_loss=0.0384, over 24313.00 frames. ], tot_loss[loss=0.1755, simple_loss=0.2522, pruned_loss=0.04942, over 4713812.52 frames. ], batch size: 61, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:54:42,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _4122_210_2084_1_1526713164417_4353280_312-294828-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:54:42,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0 from training. Duration: 0.84 2023-09-30 15:54:44,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:54:47,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_125-144174-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 15:54:50,323 WARNING [train.py:1197] (3/4) Exclude cut with ID _53565_210_10635_1_1534337218335_3820259_351-54570-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:54:50,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_410-40065-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,419 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_167-339442-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:54:50,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:54:52,466 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_725-182484-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:54:52,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 15:54:52,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_605-133395-0_sp1.1 from training. Duration: 0.6 2023-09-30 15:54:52,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_351-294650-0 from training. Duration: 0.63 2023-09-30 15:54:52,683 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:54:54,279 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:54:54,452 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass_mid.scale_min, batch_count=761693.3333333334, ans=0.2 2023-09-30 15:54:55,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:54:56,469 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.15 vs. limit=12.0 2023-09-30 15:54:57,226 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_743-327651-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:01,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0_sp0.9 from training. Duration: 0.8 2023-09-30 15:55:01,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0 from training. Duration: 0.98 2023-09-30 15:55:02,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:06,671 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.bypass.skip_rate, batch_count=761760.0, ans=0.07 2023-09-30 15:55:08,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 15:55:08,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_191-251819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:14,912 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:55:14,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_412-3442-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:55:14,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:55:16,497 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_154-135696-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:55:19,397 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.664e+02 1.905e+02 2.101e+02 2.408e+02 3.392e+02, threshold=4.202e+02, percent-clipped=0.0 2023-09-30 15:55:19,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_148-186805-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:22,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_605-165641-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:22,729 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:55:22,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:55:29,157 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_438-105429-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:29,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp0.9 from training. Duration: 0.988875 2023-09-30 15:55:35,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:55:37,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7113_210_1849_1_1527942685_3448768_194-311626-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:55:41,541 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 15:55:41,544 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_123-213457-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:43,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_456-244861-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:45,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_242-349170-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:55:46,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_471-281351-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:55:47,047 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_572-126176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:55:48,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1507-263032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:55:48,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:55:52,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _29345_210_4381_1_1532689168263_3860169_149-257268-0_sp1.1 from training. Duration: 0.736375 2023-09-30 15:55:54,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_381-252014-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:54,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_512-2717-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:55:57,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0_sp0.9 from training. Duration: 0.42225 2023-09-30 15:55:57,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_731-20117-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:55:57,954 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.balancer2.prob, batch_count=761960.0, ans=0.125 2023-09-30 15:55:59,256 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0_sp1.1 from training. Duration: 0.8 2023-09-30 15:55:59,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_313-134930-0 from training. Duration: 0.97 2023-09-30 15:55:59,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=761960.0, ans=0.2 2023-09-30 15:56:01,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_374-138971-0 from training. Duration: 0.94 2023-09-30 15:56:02,859 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1227-287886-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:04,254 INFO [train.py:1039] (3/4) Epoch 22, batch 2750, loss[loss=0.1846, simple_loss=0.2493, pruned_loss=0.05994, over 23860.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2522, pruned_loss=0.04976, over 4704492.99 frames. ], batch size: 195, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 15:56:05,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _59493_210_17282_1_1535078419077_3225879_332-217544-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:05,944 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_361-119640-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:08,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_142-118058-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:09,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_178-177582-0_sp1.1 from training. Duration: 0.67275 2023-09-30 15:56:09,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _25303_210_17519_1_1532050207576_3635970_41-195572-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:09,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=762026.6666666666, ans=0.125 2023-09-30 15:56:14,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_329-341322-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:14,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _5608_210_2106_1_1527415439089_6819768_909-82767-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 15:56:15,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_261-297503-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:56:16,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_459-291706-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:16,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0 from training. Duration: 0.73 2023-09-30 15:56:16,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_250-285937-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:56:16,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_332-253745-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:56:22,558 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0 from training. Duration: 0.46 2023-09-30 15:56:24,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_226-267957-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:56:24,236 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_608-254567-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:25,614 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_124-113562-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:56:25,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_117-290916-0_sp1.1 from training. Duration: 0.463625 2023-09-30 15:56:27,260 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_102-66533-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:56:28,711 WARNING [train.py:1197] (3/4) Exclude cut with ID _55383_210_10635_1_1534468426172_2231669_134-128246-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:56:30,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _45871_210_5665_1_1533864614245_3296060_49-308316-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:30,291 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_176-199205-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:30,635 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=762093.3333333334, ans=0.125 2023-09-30 15:56:36,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 15:56:36,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _15150_210_8341_1_1530752411803_7256384_469-239509-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 15:56:36,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 15:56:38,441 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_302-47302-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:40,017 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:56:47,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_559-120504-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:56:50,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_345-252877-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 15:56:50,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_902-233003-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:56:55,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_204-261385-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:56:55,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_189-297451-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:56:55,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 15:57:01,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_273-149841-0_sp1.1 from training. Duration: 0.836375 2023-09-30 15:57:01,939 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 15:57:01,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0 from training. Duration: 0.75 2023-09-30 15:57:07,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _49097_210_16049_1_1533952780207_4360979_276-87053-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:09,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0 from training. Duration: 0.67 2023-09-30 15:57:16,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_128-202104-0_sp1.1 from training. Duration: 0.57275 2023-09-30 15:57:18,214 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:57:18,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_159-328157-0 from training. Duration: 0.52 2023-09-30 15:57:19,748 WARNING [train.py:1197] (3/4) Exclude cut with ID _14069_210_5632_1_1530512692033_4803390_98-21934-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:57:19,964 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 15:57:21,842 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6163_210_6400_1_1527506746_4263068_283-137811-0 from training. Duration: 0.77 2023-09-30 15:57:21,909 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0_sp1.1 from training. Duration: 0.7 2023-09-30 15:57:25,582 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp0.9 from training. Duration: 0.47775 2023-09-30 15:57:26,894 INFO [train.py:1039] (3/4) Epoch 22, batch 2800, loss[loss=0.1663, simple_loss=0.232, pruned_loss=0.05027, over 23638.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.2501, pruned_loss=0.04938, over 4705076.79 frames. ], batch size: 256, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:57:26,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_201-78811-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:27,022 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_537-164630-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:57:28,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0 from training. Duration: 0.83 2023-09-30 15:57:28,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_308-154450-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:28,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_214-240574-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:30,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_615-305517-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:57:31,570 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0002-61808 from training. Duration: 0.708 2023-09-30 15:57:31,571 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0125W0017-61823 from training. Duration: 0.759 2023-09-30 15:57:34,691 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_4-145291-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:57:35,026 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=762360.0, ans=0.125 2023-09-30 15:57:37,618 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:57:37,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_635-193046-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:57:40,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_321-258866-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:57:42,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8558_210_1410_1_1528505225_7613406_131-329524-0 from training. Duration: 0.79 2023-09-30 15:57:44,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_847-41334-0_sp0.9 from training. Duration: 0.488875 2023-09-30 15:57:45,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _14227_210_4381_1_1530701804286_3940259_125-275642-0 from training. Duration: 0.99 2023-09-30 15:57:49,141 WARNING [train.py:1197] (3/4) Exclude cut with ID _25248_210_8750_1_1532062825941_5554180_248-177255-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:49,212 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:57:49,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _23109_210_4381_1_1532150828777_901949_29-258672-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:57:54,439 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_2-227151-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:57:54,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_565-53982-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:57:54,511 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_33-346031-0_sp0.9 from training. Duration: 0.67775 2023-09-30 15:57:56,666 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_95-61782-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:04,782 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.609e+02 1.836e+02 2.011e+02 2.397e+02 3.435e+02, threshold=4.022e+02, percent-clipped=0.0 2023-09-30 15:58:06,486 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_686-54383-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 15:58:07,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_505-276823-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:09,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_745-128443-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:11,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _3844_210_1390_1_1526727803582_2871209_20-251664-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 15:58:11,117 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_487-164628-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:15,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_309-222545-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:16,008 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3531_210_3528_1_1526294203_5216456_309-227302-0 from training. Duration: 0.69 2023-09-30 15:58:17,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_42-192460-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:17,698 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=762560.0, ans=0.1 2023-09-30 15:58:19,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:19,038 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_427-78097-0_sp0.9 from training. Duration: 0.888875 2023-09-30 15:58:19,399 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.prob, batch_count=762560.0, ans=0.125 2023-09-30 15:58:24,173 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_378-205254-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:58:26,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_672-314830-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:26,542 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=762560.0, ans=0.1 2023-09-30 15:58:29,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp1.1 from training. Duration: 0.763625 2023-09-30 15:58:31,205 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0_sp1.1 from training. Duration: 0.9 2023-09-30 15:58:31,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _21632_210_2253_1_1531994001765_3744140_154-84090-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:58:31,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 15:58:33,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_43-71989-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 15:58:33,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 15:58:33,654 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:34,261 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.52 vs. limit=10.0 2023-09-30 15:58:35,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_629-179454-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:58:35,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_49-222917-0 from training. Duration: 0.92 2023-09-30 15:58:36,974 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_602-331772-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_292-134978-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 15:58:38,484 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38793_210_2544_1_1533176114_3849320_200-299292-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:58:38,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=762626.6666666666, ans=0.125 2023-09-30 15:58:39,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0 from training. Duration: 0.86 2023-09-30 15:58:41,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_386-225511-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:58:41,577 WARNING [train.py:1197] (3/4) Exclude cut with ID _38323_210_12108_1_1533121233369_3303049_416-11919-0_sp1.1 from training. Duration: 0.87275 2023-09-30 15:58:43,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_256-306830-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 15:58:43,550 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0 from training. Duration: 0.67 2023-09-30 15:58:49,602 INFO [train.py:1039] (3/4) Epoch 22, batch 2850, loss[loss=0.1604, simple_loss=0.2302, pruned_loss=0.04526, over 23551.00 frames. ], tot_loss[loss=0.1738, simple_loss=0.2496, pruned_loss=0.04901, over 4716203.88 frames. ], batch size: 285, lr: 4.67e-03, grad_scale: 32.0 2023-09-30 15:58:49,816 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 15:58:49,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 15:58:50,158 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=762693.3333333334, ans=0.125 2023-09-30 15:58:51,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 15:58:52,899 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=762693.3333333334, ans=0.5 2023-09-30 15:58:54,037 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_144-135833-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:58:54,948 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=9.11 vs. limit=15.0 2023-09-30 15:58:57,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_380-343670-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:58:59,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_213-265174-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:58:59,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_86-314283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 15:59:00,201 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=762693.3333333334, ans=0.1 2023-09-30 15:59:02,952 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_68-223813-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:03,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_24-198429-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 15:59:04,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_119-207162-0_sp0.9 from training. Duration: 0.788875 2023-09-30 15:59:05,964 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0 from training. Duration: 0.5 2023-09-30 15:59:07,915 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.3.encoder.layers.3.self_attn_weights, loss-sum=0.000e+00 2023-09-30 15:59:11,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0 from training. Duration: 0.43 2023-09-30 15:59:11,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_288-189949-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:13,443 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_529-242298-0 from training. Duration: 0.95 2023-09-30 15:59:14,919 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_503-110289-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:15,463 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.24 vs. limit=15.0 2023-09-30 15:59:16,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0 from training. Duration: 0.86 2023-09-30 15:59:18,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_456-22141-0 from training. Duration: 0.88 2023-09-30 15:59:19,855 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module2.balancer2.prob, batch_count=762760.0, ans=0.125 2023-09-30 15:59:21,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_284-117840-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:33,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_75-201625-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:33,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_255-312654-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:34,025 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=762826.6666666666, ans=0.0 2023-09-30 15:59:34,915 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0_sp0.9 from training. Duration: 0.97775 2023-09-30 15:59:36,463 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_227-213972-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 15:59:36,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 15:59:36,550 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp1.1 from training. Duration: 0.72725 2023-09-30 15:59:38,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 15:59:39,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_39-248870-0 from training. Duration: 0.69 2023-09-30 15:59:41,243 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp1.1 from training. Duration: 0.5 2023-09-30 15:59:41,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_413-56154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 15:59:41,355 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_201-84544-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 15:59:42,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_105-95775-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:43,190 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=762893.3333333334, ans=0.1 2023-09-30 15:59:46,319 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_131-191669-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:47,730 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_695-4323-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 15:59:49,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_991-130565-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:52,225 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0_sp1.1 from training. Duration: 0.82725 2023-09-30 15:59:53,760 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_54-347670-0_sp1.1 from training. Duration: 0.92725 2023-09-30 15:59:53,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_200-153445-0_sp1.1 from training. Duration: 0.936375 2023-09-30 15:59:55,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_279-302911-0_sp1.1 from training. Duration: 0.97275 2023-09-30 15:59:58,401 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:00:01,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _53379_210_20034_1_1534222027363_3854654_23-222715-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:00:03,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_449-267986-0 from training. Duration: 0.83 2023-09-30 16:00:05,209 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0 from training. Duration: 0.84 2023-09-30 16:00:06,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:00:06,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_244-19107-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:06,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0 from training. Duration: 0.56 2023-09-30 16:00:07,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:00:09,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_645-145235-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:09,184 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25767_210_2161_1_1532307483_3867748_506-171777-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:00:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0669W0063-330908 from training. Duration: 0.896 2023-09-30 16:00:10,611 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0070-331913 from training. Duration: 0.896 2023-09-30 16:00:10,616 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:10,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_162-177507-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:11,987 INFO [train.py:1039] (3/4) Epoch 22, batch 2900, loss[loss=0.1774, simple_loss=0.2441, pruned_loss=0.05534, over 23789.00 frames. ], tot_loss[loss=0.1736, simple_loss=0.2499, pruned_loss=0.04868, over 4726380.90 frames. ], batch size: 164, lr: 4.67e-03, grad_scale: 16.0 2023-09-30 16:00:15,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:16,649 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_7-150481-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:00:16,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_72-273262-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:00:18,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0 from training. Duration: 0.75 2023-09-30 16:00:23,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1337-11141-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:23,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_101-293367-0 from training. Duration: 0.63 2023-09-30 16:00:25,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_10-100367-0 from training. Duration: 0.98 2023-09-30 16:00:26,780 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:00:26,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_844-96989-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:00:28,306 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_301-144595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:28,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_115-147343-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:00:32,976 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:00:33,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_769-272914-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:00:36,639 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:00:38,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_526-62079-0 from training. Duration: 0.67 2023-09-30 16:00:38,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:00:39,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _57997_210_4163_1_1534766425889_3961049_360-279140-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:43,330 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_133-1696-0 from training. Duration: 0.99 2023-09-30 16:00:43,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _5190_210_4557_1_1527244368799_373439_28-197030-0 from training. Duration: 0.72 2023-09-30 16:00:46,490 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_53-39003-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:00:46,495 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_351-167954-0 from training. Duration: 0.48 2023-09-30 16:00:46,531 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:00:50,817 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.551e+02 1.778e+02 1.944e+02 2.291e+02 4.038e+02, threshold=3.888e+02, percent-clipped=1.0 2023-09-30 16:00:50,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_328-264892-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:00:50,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0_sp0.9 from training. Duration: 0.67775 2023-09-30 16:00:51,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.balancer2.prob, batch_count=763160.0, ans=0.125 2023-09-30 16:00:52,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_465-55448-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:00:54,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_311-53106-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:00:58,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_389-277915-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:01:01,654 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535791094_677837_63-277139-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:03,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0 from training. Duration: 0.71 2023-09-30 16:01:03,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_686-247600-0 from training. Duration: 0.92 2023-09-30 16:01:03,320 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_108-255430-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:01:07,784 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_114-201399-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:01:09,563 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_506-42614-0 from training. Duration: 0.79 2023-09-30 16:01:11,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:01:16,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_141-36243-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:01:26,035 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_358-287492-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:01:26,077 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:01:27,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0 from training. Duration: 0.49 2023-09-30 16:01:32,531 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_428-181925-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:32,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0 from training. Duration: 0.89 2023-09-30 16:01:32,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1104-271009-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:32,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:01:34,060 INFO [train.py:1039] (3/4) Epoch 22, batch 2950, loss[loss=0.1822, simple_loss=0.2511, pruned_loss=0.05666, over 23761.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2505, pruned_loss=0.0492, over 4713863.05 frames. ], batch size: 164, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:01:37,669 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_19-62511-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:01:40,626 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_805-71497-0 from training. Duration: 0.71 2023-09-30 16:01:40,826 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.ff3_skip_rate, batch_count=763360.0, ans=0.0 2023-09-30 16:01:42,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_573-152077-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:42,091 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_393-177816-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:01:43,644 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_278-35159-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:01:43,792 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward1.out_proj.dropout_p, batch_count=763360.0, ans=0.1 2023-09-30 16:01:43,830 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=763360.0, ans=0.125 2023-09-30 16:01:47,046 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:01:47,220 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0 from training. Duration: 0.94 2023-09-30 16:01:47,265 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=763360.0, ans=0.125 2023-09-30 16:01:49,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0 from training. Duration: 0.84 2023-09-30 16:01:49,591 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_436-318113-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:01:49,593 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_148-152225-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:01:55,703 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2541_210_1794_1_1525590269_3084765_156-173713-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:01:58,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_531-24157-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:00,235 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_382-217492-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:02:01,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_270-158013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:05,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_311-206227-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:05,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_282-322611-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:02:09,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_562-32295-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_408-87330-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:02:09,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_137-318710-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:02:11,215 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=763493.3333333334, ans=0.0 2023-09-30 16:02:12,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0 from training. Duration: 0.86 2023-09-30 16:02:14,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=763493.3333333334, ans=0.125 2023-09-30 16:02:15,898 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.bypass.skip_rate, batch_count=763493.3333333334, ans=0.09899494936611666 2023-09-30 16:02:17,094 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_178-56269-0_sp1.1 from training. Duration: 0.95275 2023-09-30 16:02:17,131 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9109W0012-549758 from training. Duration: 0.512 2023-09-30 16:02:17,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:02:18,843 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9121W0012-554933 from training. Duration: 0.896 2023-09-30 16:02:21,746 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_476-272757-0 from training. Duration: 0.93 2023-09-30 16:02:21,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1085-171718-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:02:22,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:02:22,542 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9130W0113-559703 from training. Duration: 0.896 2023-09-30 16:02:22,548 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:02:25,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_96-12756-0 from training. Duration: 0.98 2023-09-30 16:02:25,667 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_725-193477-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:02:25,731 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:02:28,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_206-51075-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:29,626 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.whiten, num_groups=1, num_channels=512, metric=6.52 vs. limit=12.0 2023-09-30 16:02:30,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_176-56120-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:02:30,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _40284_210_2084_1_1533513679814_3704279_220-201864-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:30,559 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9165W0004-576638 from training. Duration: 0.896 2023-09-30 16:02:32,032 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_218-343336-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:02:32,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_318-162017-0 from training. Duration: 0.91 2023-09-30 16:02:40,103 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_49544_210_1794_1_1534379770_5262083_483-349298-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:42,048 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_197-28164-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:02:42,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0 from training. Duration: 0.57 2023-09-30 16:02:42,201 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_38965_210_4852_1_1533174906_3484735_403-332963-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:02:43,820 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_340-174176-0 from training. Duration: 0.49 2023-09-30 16:02:48,170 WARNING [train.py:1197] (3/4) Exclude cut with ID _56708_210_13177_1_1535252351282_3422899_363-917-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:02:49,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_525-159063-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:02:51,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:02:51,401 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_86-329250-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:02:51,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:02:53,043 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_1083-147536-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:02:53,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=763626.6666666666, ans=0.2 2023-09-30 16:02:55,884 INFO [train.py:1039] (3/4) Epoch 22, batch 3000, loss[loss=0.1719, simple_loss=0.2423, pruned_loss=0.05069, over 23499.00 frames. ], tot_loss[loss=0.1751, simple_loss=0.2512, pruned_loss=0.04953, over 4715335.92 frames. ], batch size: 120, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:02:55,885 INFO [train.py:1062] (3/4) Computing validation loss 2023-09-30 16:03:10,446 INFO [train.py:1071] (3/4) Epoch 22, validation: loss=0.3133, simple_loss=0.2748, pruned_loss=0.1759, over 1125622.00 frames. 2023-09-30 16:03:10,446 INFO [train.py:1072] (3/4) Maximum memory allocated so far is 21511MB 2023-09-30 16:03:10,526 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_54-311741-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:10,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_237-58433-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:03:10,599 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_98-51676-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:03:10,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_386-270472-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:03:12,690 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:03:12,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_314-93656-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:12,862 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0 from training. Duration: 0.9 2023-09-30 16:03:14,977 WARNING [train.py:1197] (3/4) Exclude cut with ID _36686_210_5665_1_1533778225913_3646410_65-221120-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:03:18,106 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_271-68178-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:03:18,190 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:03:21,342 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9303W0114-637133 from training. Duration: 0.896 2023-09-30 16:03:22,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_490-207861-0 from training. Duration: 0.98 2023-09-30 16:03:24,437 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_173-256601-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:03:25,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530841782272_3503520_327-69677-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:03:25,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_508-335369-0 from training. Duration: 0.48 2023-09-30 16:03:27,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_24-46567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:30,563 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=13.09 vs. limit=15.0 2023-09-30 16:03:34,538 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:03:43,893 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:03:46,062 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten.whitening_limit, batch_count=763826.6666666666, ans=15.0 2023-09-30 16:03:50,839 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.527e+02 1.854e+02 2.053e+02 2.240e+02 3.574e+02, threshold=4.107e+02, percent-clipped=0.0 2023-09-30 16:03:51,826 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=6.52 vs. limit=15.0 2023-09-30 16:03:52,479 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0 from training. Duration: 0.72 2023-09-30 16:03:52,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_356-167816-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:03:55,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:03:57,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_358-172940-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:03:57,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_668-344147-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:00,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _62337_210_12392_1_1535009273105_3412229_255-157667-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:00,006 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0 from training. Duration: 0.85 2023-09-30 16:04:00,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53638_210_21581_1_1534297319_5808449_885-80172-0 from training. Duration: 0.96 2023-09-30 16:04:02,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_105-183067-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:04:03,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_345-169156-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:04:06,061 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:04:07,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:07,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_332-121925-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:07,375 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_495-65515-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:04:10,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:04:11,897 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_708-71345-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:04:11,897 WARNING [train.py:1197] (3/4) Exclude cut with ID 3033-130750-0096-55598-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:04:15,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_219-224445-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:04:15,364 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.0.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:04:16,712 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0 from training. Duration: 0.71 2023-09-30 16:04:18,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _45021_210_9994_1_1533619751094_3330288_96-239010-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:04:18,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_489-305116-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:18,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_269-154726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:04:18,681 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer2.prob, batch_count=763960.0, ans=0.125 2023-09-30 16:04:23,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_126-54418-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,770 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_182-224159-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:24,935 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0_sp0.9 from training. Duration: 0.488875 2023-09-30 16:04:25,003 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0 from training. Duration: 0.58 2023-09-30 16:04:27,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_637-279094-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:04:27,642 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_578-262107-0 from training. Duration: 0.99 2023-09-30 16:04:27,948 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.balancer1.prob, batch_count=763960.0, ans=0.125 2023-09-30 16:04:29,054 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_82-221196-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:04:30,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0 from training. Duration: 0.67 2023-09-30 16:04:32,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:04:32,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:04:32,553 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_55-31904-0 from training. Duration: 0.88 2023-09-30 16:04:32,670 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_25-332138-0 from training. Duration: 0.91 2023-09-30 16:04:32,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:04:34,039 INFO [train.py:1039] (3/4) Epoch 22, batch 3050, loss[loss=0.1696, simple_loss=0.2558, pruned_loss=0.04174, over 24428.00 frames. ], tot_loss[loss=0.1759, simple_loss=0.2518, pruned_loss=0.04998, over 4714165.82 frames. ], batch size: 66, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:04:34,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_458-299435-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:04:36,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_275-88403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:04:36,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_841-341679-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:04:36,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _41391_210_7994_1_1533603771872_3638201_1-54521-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:38,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_495-186609-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:04:40,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0 from training. Duration: 0.81 2023-09-30 16:04:44,679 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_127-135615-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:04:46,277 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_915-21561-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:04:46,340 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_6-214815-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:04:46,523 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_module2.balancer1.prob, batch_count=764026.6666666666, ans=0.125 2023-09-30 16:04:49,621 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45399_210_4852_1_1533624725_7579737_190-137494-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:04:53,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0 from training. Duration: 0.94 2023-09-30 16:04:59,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_90-31383-0 from training. Duration: 0.87 2023-09-30 16:04:59,287 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0 from training. Duration: 0.94 2023-09-30 16:05:01,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_684-85342-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:04,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_97-325053-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:05:05,328 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.whiten1, num_groups=1, num_channels=288, metric=7.96 vs. limit=10.0 2023-09-30 16:05:07,640 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_168-25150-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:07,653 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_111-27984-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:07,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_276-226230-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:11,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:12,776 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_158-250116-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:05:12,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_279-332556-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:12,882 WARNING [train.py:1197] (3/4) Exclude cut with ID _42863_210_9070_1_1533902398788_3696920_488-230146-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:05:12,883 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_387-247159-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:13,586 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn2.whiten, num_groups=1, num_channels=512, metric=15.97 vs. limit=22.5 2023-09-30 16:05:16,403 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6729_210_2550_1_1528032481_1788725_261-42619-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:18,061 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_831-44724-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:18,298 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module2.balancer2.prob, batch_count=764160.0, ans=0.125 2023-09-30 16:05:21,064 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_280-285944-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:21,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0 from training. Duration: 0.96 2023-09-30 16:05:22,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _29678_210_7893_1_1532421042787_3906118_456-202548-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:05:22,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_107-332398-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:05:25,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:05:25,923 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:05:27,433 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_385-234809-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:05:27,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_292-215346-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:33,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_196-124268-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:05:35,441 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:35,670 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=764226.6666666666, ans=0.2 2023-09-30 16:05:40,807 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_349-6928-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:40,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _60621_210_17071_1_1535013053530_3671739_226-255202-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:05:40,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_448-130302-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:05:43,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:44,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_912-47473-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:05:44,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:05:46,290 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_1-99680-0 from training. Duration: 0.67 2023-09-30 16:05:49,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_199-86432-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:05:49,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _61148_210_6897_1_1535094022726_4217610_362-182430-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:05:49,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_301-154763-0 from training. Duration: 0.98 2023-09-30 16:05:51,159 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:56,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:05:56,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=4.20 vs. limit=15.0 2023-09-30 16:05:57,366 INFO [train.py:1039] (3/4) Epoch 22, batch 3100, loss[loss=0.2123, simple_loss=0.2589, pruned_loss=0.08292, over 19643.00 frames. ], tot_loss[loss=0.1764, simple_loss=0.2521, pruned_loss=0.05037, over 4694706.14 frames. ], batch size: 388, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:05:58,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_111-6376-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:05:59,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.bypass.scale_min, batch_count=764360.0, ans=0.2 2023-09-30 16:06:00,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:06:03,551 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0 from training. Duration: 0.89 2023-09-30 16:06:07,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_77-311404-0 from training. Duration: 0.94 2023-09-30 16:06:07,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0 from training. Duration: 0.83 2023-09-30 16:06:10,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:06:13,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_79-277189-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:06:13,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_195-295268-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:17,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:06:20,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_777-71446-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:25,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0 from training. Duration: 0.7 2023-09-30 16:06:31,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _13684_210_7497_1_1530619354863_3598359_312-235606-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:06:33,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_283-349402-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:33,303 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_29-117932-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:06:33,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.prob, batch_count=764493.3333333334, ans=0.125 2023-09-30 16:06:34,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_721-283351-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:06:34,788 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_175-229073-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:06:35,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_70-268830-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:06:35,040 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_234-83104-0 from training. Duration: 0.62 2023-09-30 16:06:35,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_317-199546-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:06:36,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_547-196022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:37,163 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff2_skip_rate, batch_count=764493.3333333334, ans=0.0 2023-09-30 16:06:38,095 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.583e+02 1.888e+02 2.068e+02 2.233e+02 3.046e+02, threshold=4.136e+02, percent-clipped=0.0 2023-09-30 16:06:38,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0 from training. Duration: 0.64 2023-09-30 16:06:40,499 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_426-326246-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:06:45,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_445-117398-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:06:45,545 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_563-347832-0 from training. Duration: 0.89 2023-09-30 16:06:47,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0 from training. Duration: 0.54 2023-09-30 16:06:47,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _59611_210_8895_1_1534741198240_3650361_187-212779-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:47,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_282-159367-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:06:47,595 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.2.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:06:50,902 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68702_210_23689_1_1535799577_3420068_33-136883-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:06:50,939 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47228_210_4983_1_1533780021_3672027_184-293706-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:52,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_656-188361-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:06:52,589 WARNING [train.py:1197] (3/4) Exclude cut with ID _4866_210_2577_1_1527404412284_4364659_352-157930-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:06:52,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_210-106047-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:06:55,412 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:06:55,479 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3910_210_1794_1_1526777667_3747260_178-41186-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:06:55,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_24-172263-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:06:55,501 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_318-203153-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:07:00,720 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_222-51982-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:07:02,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_87-131427-0 from training. Duration: 0.78 2023-09-30 16:07:03,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_372-298093-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:07:05,267 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0 from training. Duration: 0.8 2023-09-30 16:07:06,660 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_429-177469-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:06,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_724-172651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:06,791 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0 from training. Duration: 0.51 2023-09-30 16:07:18,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _48677_210_5663_1_1533902405919_3892989_111-11439-0 from training. Duration: 0.96 2023-09-30 16:07:20,450 INFO [train.py:1039] (3/4) Epoch 22, batch 3150, loss[loss=0.1585, simple_loss=0.2425, pruned_loss=0.03726, over 24453.00 frames. ], tot_loss[loss=0.1745, simple_loss=0.2504, pruned_loss=0.04927, over 4707562.83 frames. ], batch size: 63, lr: 4.66e-03, grad_scale: 8.0 2023-09-30 16:07:22,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _8085_210_4557_1_1528455633493_3716100_95-103398-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:22,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_1041-110837-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:07:23,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_1155-121757-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:07:23,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_602-275260-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:07:24,754 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_265-164486-0 from training. Duration: 0.96 2023-09-30 16:07:25,016 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=764693.3333333334, ans=0.125 2023-09-30 16:07:26,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_142-229375-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:26,111 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_310-26186-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:07:27,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_136-270644-0 from training. Duration: 0.93 2023-09-30 16:07:30,923 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_438-286372-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:32,439 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0128W0004-63309 from training. Duration: 0.831 2023-09-30 16:07:34,411 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.balancer1.prob, batch_count=764693.3333333334, ans=0.125 2023-09-30 16:07:35,442 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0 from training. Duration: 0.53 2023-09-30 16:07:35,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _41326_210_5854_1_1533778076509_3492080_277-59511-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:07:35,875 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.ff3_skip_rate, batch_count=764760.0, ans=0.0 2023-09-30 16:07:37,089 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0144W0112-71379 from training. Duration: 0.992 2023-09-30 16:07:38,587 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp0.9 from training. Duration: 0.47775 2023-09-30 16:07:38,917 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=764760.0, ans=0.125 2023-09-30 16:07:40,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_237-332027-0 from training. Duration: 0.95 2023-09-30 16:07:41,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _7970_210_1883_1_1528455616296_3472230_25-178407-0 from training. Duration: 0.71 2023-09-30 16:07:41,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0 from training. Duration: 0.81 2023-09-30 16:07:41,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _39571_210_13449_1_1533801584143_3651077_319-268877-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:41,671 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_155-246880-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:07:43,299 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_566-122599-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:07:44,847 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530701932_530275_18-76322-0 from training. Duration: 0.87 2023-09-30 16:07:46,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_159-106035-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:46,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_98-276013-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:07:48,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_330-216124-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:07:50,586 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:07:55,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_343-139898-0 from training. Duration: 0.96 2023-09-30 16:07:55,188 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:07:58,874 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_184-229630-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:07:59,226 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:08:00,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_153-18895-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:08:00,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_598-161926-0 from training. Duration: 0.9 2023-09-30 16:08:03,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0 from training. Duration: 0.59 2023-09-30 16:08:05,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_718-68505-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:08:05,351 WARNING [train.py:1197] (3/4) Exclude cut with ID _4068_210_2629_1_1526697350395_1546259_224-288693-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:08:05,375 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:08:06,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _15458_210_8341_1_1530858614263_3834410_49-235472-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:06,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:08:09,672 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:08:09,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:08:09,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0 from training. Duration: 0.54 2023-09-30 16:08:10,460 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.93 vs. limit=15.0 2023-09-30 16:08:11,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_272-249985-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:08:11,244 WARNING [train.py:1197] (3/4) Exclude cut with ID _50376_210_13401_1_1534031502101_4217740_518-187390-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:12,810 WARNING [train.py:1197] (3/4) Exclude cut with ID _59738_210_15379_1_1534849512562_3941470_165-248260-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:08:12,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _23818_210_6228_1_1531917063884_3676920_400-343925-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:08:12,968 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0 from training. Duration: 0.77 2023-09-30 16:08:14,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_289-207931-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:17,446 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0 from training. Duration: 0.9 2023-09-30 16:08:17,465 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_203-232438-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:17,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0 from training. Duration: 0.86 2023-09-30 16:08:18,244 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn1.whiten, num_groups=1, num_channels=384, metric=11.65 vs. limit=22.5 2023-09-30 16:08:18,445 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=384, metric=9.40 vs. limit=15.0 2023-09-30 16:08:19,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0 from training. Duration: 0.95 2023-09-30 16:08:19,463 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module1.balancer2.prob, batch_count=764893.3333333334, ans=0.125 2023-09-30 16:08:20,663 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_94-195801-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:08:20,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_269-103204-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:22,803 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_748-36150-0 from training. Duration: 0.91 2023-09-30 16:08:24,253 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_148-172861-0_sp1.1 from training. Duration: 0.4545625 2023-09-30 16:08:24,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_460-77730-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:08:26,523 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_374-20753-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:08:28,016 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_374-289983-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:29,424 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34886_210_10633_1_1533203882_1755661_76-156096-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:08:34,530 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:08:36,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _46683_210_9642_1_1533730913492_4591116_489-146727-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:36,400 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=764960.0, ans=0.125 2023-09-30 16:08:38,089 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp0.9 from training. Duration: 0.511125 2023-09-30 16:08:42,515 INFO [train.py:1039] (3/4) Epoch 22, batch 3200, loss[loss=0.1592, simple_loss=0.237, pruned_loss=0.04073, over 24280.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2489, pruned_loss=0.04856, over 4711214.95 frames. ], batch size: 61, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:08:44,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_243-46549-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:08:44,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp1.1 from training. Duration: 0.536375 2023-09-30 16:08:48,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_121-321334-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:08:48,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_254-102149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:08:50,292 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_96-152158-0 from training. Duration: 0.85 2023-09-30 16:08:53,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_161-102979-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:08:57,231 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3787_210_2040_1_1526477544_5478586_603-120324-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:08:59,356 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.whiten, num_groups=1, num_channels=256, metric=2.31 vs. limit=12.0 2023-09-30 16:09:00,387 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_247-213456-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:09:09,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_skip_rate, batch_count=765093.3333333334, ans=0.0 2023-09-30 16:09:11,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:09:20,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_215-202973-0 from training. Duration: 0.88 2023-09-30 16:09:22,234 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_17-320491-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:09:23,572 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.510e+02 1.875e+02 2.082e+02 2.411e+02 3.393e+02, threshold=4.163e+02, percent-clipped=0.0 2023-09-30 16:09:25,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_310-114452-0 from training. Duration: 0.59 2023-09-30 16:09:25,504 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:09:30,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_159-281310-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:09:30,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_195-167011-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:09:32,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_156-257607-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:09:37,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2295_210_2087_1_1525500993_2407863_116-238894-0 from training. Duration: 0.7 2023-09-30 16:09:38,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp0.9 from training. Duration: 0.5 2023-09-30 16:09:40,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0 from training. Duration: 0.82 2023-09-30 16:09:44,353 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0 from training. Duration: 0.91 2023-09-30 16:09:47,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:09:53,882 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_651-95702-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:53,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:09:55,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_381-125325-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:09:55,430 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0622W0021-307659 from training. Duration: 0.896 2023-09-30 16:09:55,434 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:09:57,143 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=765293.3333333334, ans=0.1 2023-09-30 16:10:00,230 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_607-3416-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:01,832 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0 from training. Duration: 0.64 2023-09-30 16:10:01,918 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0 from training. Duration: 0.6 2023-09-30 16:10:03,377 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0 from training. Duration: 0.55 2023-09-30 16:10:04,740 INFO [train.py:1039] (3/4) Epoch 22, batch 3250, loss[loss=0.1797, simple_loss=0.27, pruned_loss=0.04472, over 24272.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2492, pruned_loss=0.04848, over 4709114.20 frames. ], batch size: 74, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:10:04,863 WARNING [train.py:1197] (3/4) Exclude cut with ID _14860_210_4915_1_1530702020340_7343240_836-328489-0 from training. Duration: 0.9 2023-09-30 16:10:07,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_1033-274376-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:10:09,088 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=765360.0, ans=0.0 2023-09-30 16:10:09,138 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=765360.0, ans=0.125 2023-09-30 16:10:10,201 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_475-127455-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:10:10,212 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0071-331914 from training. Duration: 0.896 2023-09-30 16:10:10,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_2-1523-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:10,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_460-208279-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:12,435 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0052-335859 from training. Duration: 0.64 2023-09-30 16:10:17,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_3-237434-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:10:19,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_405-185283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:27,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_927-83684-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:10:27,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_356-169233-0 from training. Duration: 0.99 2023-09-30 16:10:28,877 WARNING [train.py:1197] (3/4) Exclude cut with ID _19896_210_1883_1_1531709022662_6649569_562-233001-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:30,282 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_354-78629-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:10:30,284 WARNING [train.py:1197] (3/4) Exclude cut with ID _60135_210_13427_1_1534935557922_3651319_96-168198-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:31,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _31602_210_5854_1_1532677021456_4458250_249-96604-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:32,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_258-224599-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:10:35,147 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_215-160135-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_113-18163-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:10:35,295 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_552-337340-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:35,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_10-12887-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,360 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_401-284840-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:35,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _44659_210_2721_1_1534036120061_6614860_483-141285-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:10:40,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_358-332734-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:10:43,417 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:10:45,049 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_354-267196-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:45,083 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_192-309792-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:10:46,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_290-151255-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:10:46,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5575_210_1410_1_1527301305_2570170_249-93769-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:10:46,733 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_463-52378-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:10:52,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_580-93798-0 from training. Duration: 0.56 2023-09-30 16:10:52,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_385-13762-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:10:52,189 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_458-67321-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:10:54,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_188-154791-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:10:56,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:11:02,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_661-264857-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:11:09,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:09,982 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_241-278329-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:09,983 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11349_210_6753_1_1529800861_1101207_26-65602-0 from training. Duration: 0.96 2023-09-30 16:11:09,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_448-4048-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:11:10,012 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:11:11,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_256-54454-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:15,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0 from training. Duration: 0.9 2023-09-30 16:11:15,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0 from training. Duration: 0.94 2023-09-30 16:11:15,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:11:16,763 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_316-294364-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:16,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_762-231063-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:18,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_68-219807-0_sp0.9 from training. Duration: 0.52225 2023-09-30 16:11:18,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_479-158053-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:11:22,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _66112_210_10635_1_1535590236305_3801484_83-38181-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:22,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_255-146346-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:25,057 WARNING [train.py:1197] (3/4) Exclude cut with ID _48836_210_12210_1_1534075848293_3904150_429-35475-0 from training. Duration: 0.89 2023-09-30 16:11:25,076 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50154_210_13731_1_1534750862_1080961_119-187163-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:28,031 INFO [train.py:1039] (3/4) Epoch 22, batch 3300, loss[loss=0.1744, simple_loss=0.257, pruned_loss=0.04592, over 23412.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2496, pruned_loss=0.04822, over 4716733.96 frames. ], batch size: 93, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:11:28,088 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:11:28,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _14884_210_2252_1_1530707459689_5561250_591-173406-0 from training. Duration: 0.96 2023-09-30 16:11:30,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_54-198193-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:11:32,471 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6545_210_2040_1_1527774670_4245258_407-48070-0 from training. Duration: 0.76 2023-09-30 16:11:34,060 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0 from training. Duration: 0.84 2023-09-30 16:11:35,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_507-303370-0 from training. Duration: 0.96 2023-09-30 16:11:36,976 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_164-282366-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:11:40,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_312-158727-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:11:40,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_174-205058-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:11:41,677 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_166-237358-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:43,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:11:43,314 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_96-290039-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:11:46,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _53219_210_4525_1_1534834836008_4064825_261-101691-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:48,454 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_96-293390-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:11:51,747 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer_na.min_abs, batch_count=765760.0, ans=0.02 2023-09-30 16:11:51,831 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer2.prob, batch_count=765760.0, ans=0.125 2023-09-30 16:11:53,030 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_271-234962-0 from training. Duration: 0.79 2023-09-30 16:11:53,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_136-30660-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:11:53,163 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_625-281046-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:11:56,077 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_333-168193-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:11:57,493 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0269-509664 from training. Duration: 0.896 2023-09-30 16:11:59,037 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_450-147393-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:01,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_643-52007-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:12:01,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_330-327861-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:12:01,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_260-317651-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:01,200 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9044W0010-516654 from training. Duration: 0.896 2023-09-30 16:12:05,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward1.hidden_balancer.prob, batch_count=765826.6666666666, ans=0.125 2023-09-30 16:12:06,252 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_265-118378-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:06,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_321-148641-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:12:08,585 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_101-169176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:08,589 WARNING [train.py:1197] (3/4) Exclude cut with ID 497-129325-0061-62254-0_sp1.1 from training. Duration: 0.97725 2023-09-30 16:12:09,731 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.582e+02 1.926e+02 2.080e+02 2.384e+02 3.230e+02, threshold=4.160e+02, percent-clipped=0.0 2023-09-30 16:12:09,962 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_795-33252-0_sp1.1 from training. Duration: 0.336375 2023-09-30 16:12:09,998 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_168-246176-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:11,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_120-150943-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:12:13,186 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0005-536844 from training. Duration: 0.768 2023-09-30 16:12:13,382 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=765826.6666666666, ans=0.1 2023-09-30 16:12:14,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0 from training. Duration: 0.84 2023-09-30 16:12:14,769 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:12:16,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0 from training. Duration: 0.39 2023-09-30 16:12:19,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_379-184223-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:23,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:12:23,198 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:12:26,507 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_module1.balancer1.prob, batch_count=765893.3333333334, ans=0.125 2023-09-30 16:12:27,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _31088_210_16446_1_1532520012284_3440919_20-260902-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:27,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_856-344574-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:27,789 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_36756_210_7234_1_1533026991_966276_4-164118-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:12:29,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:12:32,166 WARNING [train.py:1197] (3/4) Exclude cut with ID _5910_210_1389_1_1527337972454_5050100_555-159815-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:12:32,195 WARNING [train.py:1197] (3/4) Exclude cut with ID _37086_210_9038_1_1532999540891_7021250_469-147941-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:33,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:12:36,591 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0006-572499 from training. Duration: 0.896 2023-09-30 16:12:36,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_277-210798-0 from training. Duration: 0.55 2023-09-30 16:12:38,905 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_103-336064-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:12:38,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3414_210_1988_1_1526212718_4813195_331-290945-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:12:39,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_365-29276-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:39,249 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.balancer.max_positive, batch_count=765960.0, ans=0.95 2023-09-30 16:12:41,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _14821_210_8750_1_1530680503991_6829359_69-250847-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:12:41,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1102-61301-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:41,802 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=4.16 vs. limit=15.0 2023-09-30 16:12:42,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:12:44,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _15351_210_10626_1_1530856635653_3576210_214-129918-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:44,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_120-41668-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:12:46,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_27-72278-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:12:46,318 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_521-258868-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:12:49,353 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0 from training. Duration: 0.77 2023-09-30 16:12:49,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_465-157433-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:12:49,786 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_skip_rate, batch_count=766026.6666666666, ans=0.0 2023-09-30 16:12:50,909 INFO [train.py:1039] (3/4) Epoch 22, batch 3350, loss[loss=0.1697, simple_loss=0.2477, pruned_loss=0.04586, over 24635.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2503, pruned_loss=0.04797, over 4726872.77 frames. ], batch size: 65, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:12:51,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_248-319353-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:12:53,942 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_102-77064-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:12:53,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_379-128794-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:12:55,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_236-260835-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:12:59,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _24271_210_12658_1_1531987270022_7190519_184-342363-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:12:59,052 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_67-221663-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:00,687 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:13:02,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_458-42232-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:03,795 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:13:05,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _58602_210_2346_1_1534903210085_6908830_792-102241-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:05,779 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=766093.3333333334, ans=0.125 2023-09-30 16:13:08,835 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:13:10,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_239-98429-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:10,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_538-32291-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:13:12,540 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_419-317062-0 from training. Duration: 0.91 2023-09-30 16:13:12,857 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=766093.3333333334, ans=0.1 2023-09-30 16:13:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0202-640329 from training. Duration: 0.896 2023-09-30 16:13:14,058 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_419-313291-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:13:15,849 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=766093.3333333334, ans=0.125 2023-09-30 16:13:18,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0 from training. Duration: 0.63 2023-09-30 16:13:18,561 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0 from training. Duration: 0.73 2023-09-30 16:13:18,709 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_103-84010-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:13:20,652 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_497-97218-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:13:22,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_262-84613-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:22,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0 from training. Duration: 0.81 2023-09-30 16:13:22,271 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_224-300489-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:22,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_170-336541-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:13:25,143 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47069_210_15368_1_1533781267_4431320_180-31746-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:25,332 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_447-7041-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:25,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_2-55283-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:26,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _12555_210_8750_1_1530075594402_3780239_70-181911-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:13:31,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_311-328022-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:35,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _14528_210_8254_1_1530669700514_4306939_330-2987-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:35,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_385-184858-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:39,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_348-333360-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:13:40,506 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=3.73 vs. limit=15.0 2023-09-30 16:13:41,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_753-214751-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:13:42,850 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_202-208013-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:42,873 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_610-129300-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:44,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_321-23852-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:46,663 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_173-238248-0 from training. Duration: 0.83 2023-09-30 16:13:46,675 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_405-339986-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:13:46,718 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0 from training. Duration: 0.87 2023-09-30 16:13:48,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:13:48,274 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0 from training. Duration: 0.62 2023-09-30 16:13:49,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_146-343734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:13:51,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _3845_210_1390_1_1526731326141_3523610_72-61441-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:13:58,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_991-125028-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:13:59,891 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0 from training. Duration: 0.66 2023-09-30 16:13:59,967 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_416-257730-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:00,085 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:14:01,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_242-76943-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:14:04,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_190-49648-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:08,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_214-88076-0 from training. Duration: 0.88 2023-09-30 16:14:08,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _5585_210_2667_1_1527418813324_8007340_89-332072-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:14:08,688 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_555-293448-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:14:10,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_286-331314-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:11,655 WARNING [train.py:1197] (3/4) Exclude cut with ID _13921_210_9994_1_1530838702800_3003429_46-149444-0 from training. Duration: 0.94 2023-09-30 16:14:13,631 INFO [train.py:1039] (3/4) Epoch 22, batch 3400, loss[loss=0.1627, simple_loss=0.2462, pruned_loss=0.03961, over 24465.00 frames. ], tot_loss[loss=0.1742, simple_loss=0.2511, pruned_loss=0.04866, over 4725329.38 frames. ], batch size: 66, lr: 4.66e-03, grad_scale: 16.0 2023-09-30 16:14:13,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_768-43595-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:14:13,726 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0 from training. Duration: 0.8 2023-09-30 16:14:15,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_1007-316380-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_150-283437-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:14:17,266 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3474_210_2028_1_1526188513_4825246_145-309131-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:14:18,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:14:18,799 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_238-256668-0 from training. Duration: 0.69 2023-09-30 16:14:19,384 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=4.31 vs. limit=15.0 2023-09-30 16:14:25,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0 from training. Duration: 0.6 2023-09-30 16:14:25,524 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0174W0006-766659 from training. Duration: 0.953 2023-09-30 16:14:25,552 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_668-23019-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:14:28,612 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_322-74328-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:14:28,622 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:14:30,195 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53378_210_17282_1_1534221071_5769509_641-286201-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:31,698 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:14:37,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _5250_210_5219_1_1527249561806_8692110_1270-317971-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:14:39,137 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_784-213099-0 from training. Duration: 0.93 2023-09-30 16:14:39,370 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_skip_rate, batch_count=766426.6666666666, ans=0.0 2023-09-30 16:14:44,339 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_115-233929-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:14:44,544 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff2_skip_rate, batch_count=766493.3333333334, ans=0.0 2023-09-30 16:14:47,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_256-223809-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:14:47,402 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_285-87781-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:14:48,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp1.1 from training. Duration: 0.57275 2023-09-30 16:14:56,559 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:14:57,965 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.457e+02 1.853e+02 2.034e+02 2.228e+02 2.939e+02, threshold=4.068e+02, percent-clipped=0.0 2023-09-30 16:14:58,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_333-110399-0 from training. Duration: 0.96 2023-09-30 16:15:00,351 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2.whitening_limit, batch_count=766493.3333333334, ans=15.0 2023-09-30 16:15:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50423_210_13270_1_1534128598_5021734_759-90833-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_151-254798-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:15:06,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0 from training. Duration: 0.92 2023-09-30 16:15:07,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_673-36456-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:07,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _36538_210_9994_1_1533204020363_3795620_131-8685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:09,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _12556_210_8341_1_1530081024100_7327690_713-55626-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:15:09,400 WARNING [train.py:1197] (3/4) Exclude cut with ID _2322_210_2667_1_1525604548971_3656299_440-80494-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:15:11,094 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_210-325081-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:15:11,415 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=766560.0, ans=0.125 2023-09-30 16:15:16,156 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_375-244976-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:15:16,167 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530952293_1664795_119-31001-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:15:21,615 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_319-42326-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:23,308 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_372-173012-0 from training. Duration: 0.95 2023-09-30 16:15:28,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_41-31157-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:15:33,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0 from training. Duration: 0.68 2023-09-30 16:15:36,889 INFO [train.py:1039] (3/4) Epoch 22, batch 3450, loss[loss=0.1734, simple_loss=0.2334, pruned_loss=0.05669, over 23750.00 frames. ], tot_loss[loss=0.1753, simple_loss=0.2521, pruned_loss=0.04928, over 4721250.58 frames. ], batch size: 232, lr: 4.65e-03, grad_scale: 4.0 2023-09-30 16:15:37,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0 from training. Duration: 0.73 2023-09-30 16:15:37,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_152-67046-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:15:40,013 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:15:40,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0 from training. Duration: 0.64 2023-09-30 16:15:40,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_335-137972-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:15:43,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _5166_210_2084_1_1527073197160_4087993_331-117829-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:15:51,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_125-125818-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:15:53,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527854471671_2870519_48-294897-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:15:54,613 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_116-109398-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:15:54,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_222-172685-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:15:56,895 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_320-132275-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:03,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_76-311963-0 from training. Duration: 0.7 2023-09-30 16:16:10,282 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_32-205288-0 from training. Duration: 0.78 2023-09-30 16:16:10,324 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_86-16881-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:16:10,393 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:16:13,346 WARNING [train.py:1197] (3/4) Exclude cut with ID _57572_210_5517_1_1534921219624_3809484_187-295293-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:13,692 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=766826.6666666666, ans=0.1 2023-09-30 16:16:18,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_563-329943-0 from training. Duration: 0.99 2023-09-30 16:16:19,632 WARNING [train.py:1197] (3/4) Exclude cut with ID _7160_210_1876_1_1527940923231_3703799_302-150728-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:16:22,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_926-7526-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:22,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _62574_210_2655_1_1535180407058_3933249_467-22348-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:16:24,460 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:16:26,774 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_385-193052-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:16:28,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_444-28041-0 from training. Duration: 0.85 2023-09-30 16:16:28,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_341-249237-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:16:30,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_425-269826-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:16:33,676 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_124-185597-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:16:36,714 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_380-204756-0 from training. Duration: 0.86 2023-09-30 16:16:41,705 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:16:43,556 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.balancer2.prob, batch_count=766960.0, ans=0.125 2023-09-30 16:16:48,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_372-131163-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:16:50,080 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_447-233513-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:53,200 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_54-118333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:16:56,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_388-230239-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:16:57,884 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_141-54639-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:16:57,988 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:16:59,271 INFO [train.py:1039] (3/4) Epoch 22, batch 3500, loss[loss=0.1653, simple_loss=0.2268, pruned_loss=0.05192, over 22812.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2499, pruned_loss=0.04818, over 4722523.41 frames. ], batch size: 322, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:16:59,363 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_532-137847-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:17:04,699 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_155-283231-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:07,716 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:17:09,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_185-174805-0 from training. Duration: 0.68 2023-09-30 16:17:11,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_662-210469-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:17:13,466 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_426-313557-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:17:16,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_407-204343-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:17:16,688 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0 from training. Duration: 0.82 2023-09-30 16:17:21,800 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4737_210_2040_1_1526998689_3293008_378-178650-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:17:21,940 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_432-327451-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:17:23,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_188-114464-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:17:23,651 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_798-106179-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:24,970 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:17:25,056 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_361-78786-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:26,482 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:26,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_387-159008-0 from training. Duration: 0.86 2023-09-30 16:17:29,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_959-18820-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:30,917 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_133-564-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:17:32,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_12-29287-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:36,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_489-185052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:37,647 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_634-194174-0 from training. Duration: 0.97 2023-09-30 16:17:37,701 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_91-26919-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:17:39,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_516-82303-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:17:42,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:17:42,878 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.conv_module2.balancer1.max_abs, batch_count=767160.0, ans=10.0 2023-09-30 16:17:44,403 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.347e+02 1.810e+02 1.991e+02 2.339e+02 3.631e+02, threshold=3.981e+02, percent-clipped=0.0 2023-09-30 16:17:44,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_495-202963-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:46,096 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_332-105904-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:17:46,123 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40764_210_1925_1_1533882359_3752345_493-167886-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:47,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0 from training. Duration: 0.75 2023-09-30 16:17:47,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_26-175553-0 from training. Duration: 0.61 2023-09-30 16:17:48,195 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.out_proj.dropout_p, batch_count=767226.6666666666, ans=0.1 2023-09-30 16:17:50,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_890-5117-0 from training. Duration: 0.78 2023-09-30 16:17:52,033 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_206-306860-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:17:53,600 WARNING [train.py:1197] (3/4) Exclude cut with ID _25882_210_3036_1_1532775665815_6479510_570-79824-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:17:53,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_731-2428-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:17:53,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:17:58,313 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_178-39722-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:17:58,410 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:18:03,344 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41428_210_2161_1_1533774184_3992532_65-271323-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:04,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_308-161067-0 from training. Duration: 0.66 2023-09-30 16:18:04,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_459-173867-0 from training. Duration: 0.88 2023-09-30 16:18:04,896 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:07,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_354-192266-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:10,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_258-310570-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:10,326 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass_mid.scale_min, batch_count=767293.3333333334, ans=0.2 2023-09-30 16:18:11,625 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_548-141039-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:13,250 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15101_210_7927_1_1530786415_5033274_320-327527-0 from training. Duration: 0.96 2023-09-30 16:18:13,371 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:18:14,855 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543318_4552_1-85072-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:18:16,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _64639_210_5816_1_1535367619437_3528000_461-329156-0 from training. Duration: 0.96 2023-09-30 16:18:18,460 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0 from training. Duration: 0.45 2023-09-30 16:18:20,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _20455_210_5219_1_1531565996775_8098100_402-112739-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:18:22,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _53489_210_6228_1_1534298357169_3835970_256-115565-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:18:22,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26326_210_7925_1_1533002493_3592406_360-305260-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:22,751 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_18-204383-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:24,016 INFO [train.py:1039] (3/4) Epoch 22, batch 3550, loss[loss=0.1692, simple_loss=0.2363, pruned_loss=0.05106, over 23476.00 frames. ], tot_loss[loss=0.1717, simple_loss=0.2478, pruned_loss=0.04785, over 4709768.78 frames. ], batch size: 285, lr: 4.65e-03, grad_scale: 8.0 2023-09-30 16:18:25,805 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:18:26,555 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.whiten, num_groups=1, num_channels=512, metric=5.40 vs. limit=12.0 2023-09-30 16:18:29,326 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.39 vs. limit=6.0 2023-09-30 16:18:33,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.out_combiner.scale_min, batch_count=767360.0, ans=0.2 2023-09-30 16:18:34,818 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_116-299186-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:35,011 WARNING [train.py:1197] (3/4) Exclude cut with ID _13913_210_9994_1_1530579615187_3383897_473-277682-0_sp1.1 from training. Duration: 0.3545625 2023-09-30 16:18:39,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_49-225702-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:18:39,535 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3080_210_2071_1_1526086102_4365166_312-8726-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:18:43,183 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_489-190175-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:18:43,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_345-95717-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:18:44,702 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_85-138257-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:18:47,879 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:18:49,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_450-203637-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:18:49,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_416-132893-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:18:50,761 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_437-337690-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:18:50,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_700-183549-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:18:51,230 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer1.max_abs, batch_count=767426.6666666666, ans=10.0 2023-09-30 16:18:58,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_191-59784-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:18:59,841 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_218-295097-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:19:00,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_310-109461-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:00,072 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_131-321149-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:01,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_133-188266-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:19:01,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_505-205270-0 from training. Duration: 0.38 2023-09-30 16:19:01,523 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67223_210_4238_1_1535612120_2537431_102-88248-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:03,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_60-235433-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:04,630 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_199-53526-0_sp1.1 from training. Duration: 0.3818125 2023-09-30 16:19:10,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_439-90979-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:10,826 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1162-132880-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:19:12,263 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_220-22317-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:15,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_909-64151-0 from training. Duration: 0.94 2023-09-30 16:19:15,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:19:17,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_302-22084-0 from training. Duration: 0.86 2023-09-30 16:19:17,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_663-166973-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:19:19,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:19:19,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_362-230377-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:19:23,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0 from training. Duration: 0.56 2023-09-30 16:19:25,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_281-1313-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:30,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_64-215118-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:19:32,650 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0 from training. Duration: 0.89 2023-09-30 16:19:34,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_402-208830-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:36,130 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.skip_rate, batch_count=767626.6666666666, ans=0.07 2023-09-30 16:19:37,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_98-193494-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:19:38,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_383-285196-0 from training. Duration: 0.97 2023-09-30 16:19:40,770 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=767626.6666666666, ans=0.0 2023-09-30 16:19:44,955 INFO [train.py:1039] (3/4) Epoch 22, batch 3600, loss[loss=0.1816, simple_loss=0.2569, pruned_loss=0.05322, over 23738.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2479, pruned_loss=0.04802, over 4704717.97 frames. ], batch size: 232, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:19:46,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6767_210_3273_1_1527855783_1718932_142-127132-0 from training. Duration: 0.95 2023-09-30 16:19:46,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _39867_210_9826_1_1533553222030_9344259_1018-339635-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:19:46,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _14603_210_4220_1_1530615688441_3097319_274-274798-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:19:48,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_376-263393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:50,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_363-344675-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:19:51,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_58-68956-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:19:55,085 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_709-311571-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:19:56,708 WARNING [train.py:1197] (3/4) Exclude cut with ID _46067_210_4220_1_1534125571066_3307450_136-257407-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:58,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_324-323854-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:19:58,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_234-260682-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:19:58,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _36634_210_1794_1_1533296352450_1399884_55-119451-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:19:59,751 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0 from training. Duration: 0.89 2023-09-30 16:20:02,734 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5522_210_3273_1_1527249606_3909096_366-172508-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:20:04,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_187-10504-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:06,516 WARNING [train.py:1197] (3/4) Exclude cut with ID _51308_210_14209_1_1534119783354_7762226_282-257791-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:09,572 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_467-288776-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:11,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_138-154812-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:20:11,157 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_1154-185930-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:20:11,199 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_52-279899-0 from training. Duration: 0.69 2023-09-30 16:20:12,726 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_141-89113-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:20:14,437 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=767760.0, ans=0.125 2023-09-30 16:20:15,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_297-47310-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:20:17,242 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_486-151131-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:20:18,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_21-232980-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:22,014 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6165_210_3528_1_1527590982_4379841_338-106380-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:20:24,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _55162_210_17071_1_1534408248583_3753480_355-257218-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:25,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _2895_210_2477_1_1525956785794_4063750_289-11475-0 from training. Duration: 0.82 2023-09-30 16:20:30,132 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.490e+02 1.855e+02 2.093e+02 2.539e+02 3.867e+02, threshold=4.186e+02, percent-clipped=0.0 2023-09-30 16:20:31,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _16756_210_4915_1_1531047803763_7104259_839-183431-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:20:33,403 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_427-208901-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:20:33,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _36512_210_6987_1_1533033143998_3126069_181-306500-0 from training. Duration: 0.95 2023-09-30 16:20:39,276 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:20:44,771 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.attention_skip_rate, batch_count=767893.3333333334, ans=0.0 2023-09-30 16:20:45,954 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_590-58996-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:47,592 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_108-47561-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:20:49,836 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_skip_rate, batch_count=767893.3333333334, ans=0.0 2023-09-30 16:20:51,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _60854_210_26984_1_1534903208829_4595140_401-158239-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:20:51,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:20:51,213 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0 from training. Duration: 0.78 2023-09-30 16:20:52,844 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0 from training. Duration: 0.9 2023-09-30 16:20:53,175 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.0.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:20:54,428 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_47896_210_20508_1_1534492518_5958868_558-314572-0 from training. Duration: 0.97 2023-09-30 16:20:57,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _66070_210_7640_1_1535441393096_7805310_290-40284-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:20:57,839 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:20:58,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _7719_210_3525_1_1528456153163_4296960_88-250259-0 from training. Duration: 0.84 2023-09-30 16:20:59,510 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_1157-114102-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:20:59,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:20:59,570 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_254-347254-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:20:59,760 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.attention_skip_rate, batch_count=767960.0, ans=0.0 2023-09-30 16:21:01,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_28-70987-0 from training. Duration: 0.87 2023-09-30 16:21:04,140 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0 from training. Duration: 0.45 2023-09-30 16:21:05,937 WARNING [train.py:1197] (3/4) Exclude cut with ID _40901_210_5665_1_1533363789958_3856130_356-107330-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:21:06,059 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3780_210_2087_1_1526467389_2145572_176-107140-0 from training. Duration: 0.69 2023-09-30 16:21:09,570 INFO [train.py:1039] (3/4) Epoch 22, batch 3650, loss[loss=0.1631, simple_loss=0.2357, pruned_loss=0.04526, over 24441.00 frames. ], tot_loss[loss=0.1726, simple_loss=0.249, pruned_loss=0.0481, over 4717148.29 frames. ], batch size: 58, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:21:14,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0 from training. Duration: 0.79 2023-09-30 16:21:14,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_85-28583-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:21:17,838 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.self_attn1.whiten, num_groups=1, num_channels=512, metric=13.77 vs. limit=22.5 2023-09-30 16:21:19,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_29-293896-0 from training. Duration: 0.61 2023-09-30 16:21:21,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0 from training. Duration: 0.97 2023-09-30 16:21:26,307 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_360-52446-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:21:26,309 WARNING [train.py:1197] (3/4) Exclude cut with ID _9389_210_1664_1_1529217337301_5289269_289-213417-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:21:27,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:21:31,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp0.9 from training. Duration: 0.77775 2023-09-30 16:21:31,272 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_76-308663-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:21:32,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0 from training. Duration: 0.67 2023-09-30 16:21:34,214 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_387-131538-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:21:34,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_365-182054-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:21:34,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0 from training. Duration: 0.93 2023-09-30 16:21:36,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _5409_210_1388_1_1527292850189_5865191_178-127970-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:21:36,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_352-12734-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:21:36,554 WARNING [train.py:1197] (3/4) Exclude cut with ID _65134_210_14763_1_1535335393985_3505368_121-129373-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:39,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:21:43,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _48678_210_5663_1_1533988830947_3769199_306-255886-0 from training. Duration: 0.77 2023-09-30 16:21:44,682 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0 from training. Duration: 0.87 2023-09-30 16:21:46,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _2941_210_2577_1_1526199673150_2489759_208-236402-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:21:47,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0 from training. Duration: 0.97 2023-09-30 16:21:49,836 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_306-328175-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:21:49,864 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:21:56,778 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:21:58,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _44010_210_4525_1_1533884262964_4013620_483-69110-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:21:58,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:21:58,458 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.ff2_skip_rate, batch_count=768226.6666666666, ans=0.0 2023-09-30 16:21:59,854 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:22:00,006 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward1.out_proj.dropout_p, batch_count=768226.6666666666, ans=0.1 2023-09-30 16:22:01,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_268-87909-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:22:04,364 WARNING [train.py:1197] (3/4) Exclude cut with ID _21878_210_3525_1_1531738772002_7264330_559-184862-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:22:06,080 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46032_210_14656_1_1533781412_4507969_237-120766-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:07,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_330-170906-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:07,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_401-51578-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:22:07,872 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.pos_emb_skip_rate, batch_count=768226.6666666666, ans=0.0 2023-09-30 16:22:07,910 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=768226.6666666666, ans=0.1 2023-09-30 16:22:10,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_209-205365-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:22:12,384 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_57-242170-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:22:12,482 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_127-37371-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:18,703 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9171W0019-578905 from training. Duration: 0.896 2023-09-30 16:22:19,132 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.ff3_skip_rate, batch_count=768293.3333333334, ans=0.0 2023-09-30 16:22:21,342 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=768293.3333333334, ans=0.125 2023-09-30 16:22:22,357 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24605_210_1794_1_1532047863_4149755_349-49056-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:22,385 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_897-21237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:22:25,742 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:22:25,820 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_348-152548-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:27,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:22:29,596 WARNING [train.py:1197] (3/4) Exclude cut with ID _61008_210_10633_1_1534852769198_6974370_345-91705-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:31,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0 from training. Duration: 0.99 2023-09-30 16:22:31,191 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_359-345972-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:32,631 INFO [train.py:1039] (3/4) Epoch 22, batch 3700, loss[loss=0.161, simple_loss=0.2458, pruned_loss=0.03813, over 24676.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2501, pruned_loss=0.04796, over 4723345.02 frames. ], batch size: 73, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:22:32,851 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:22:34,568 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.balancer2.prob, batch_count=768360.0, ans=0.125 2023-09-30 16:22:35,840 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1256-120974-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:22:37,305 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_372-30302-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:22:40,405 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_42012_210_7927_1_1533439936_3871556_451-344262-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:40,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_28-324729-0 from training. Duration: 0.94 2023-09-30 16:22:40,424 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_313-18394-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:22:40,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_620-276793-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:22:42,031 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_172-348777-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:22:45,482 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_proj.dropout_p, batch_count=768360.0, ans=0.1 2023-09-30 16:22:46,560 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:22:47,203 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=768426.6666666666, ans=0.125 2023-09-30 16:22:48,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_5-300035-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:22:48,986 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_343-309023-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:50,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:22:51,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_41-182910-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:22:52,018 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_229-45300-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:22:55,042 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_40-214716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:22:57,119 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0203-640330 from training. Duration: 0.896 2023-09-30 16:23:01,665 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=4.56 vs. limit=15.0 2023-09-30 16:23:05,802 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_264-279573-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:23:07,232 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:23:07,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_359-118926-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:23:07,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_85-65056-0 from training. Duration: 0.96 2023-09-30 16:23:07,467 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:12,051 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_128-49444-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:12,200 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0 from training. Duration: 0.86 2023-09-30 16:23:13,695 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41822_210_13270_1_1533955976_4379284_260-256391-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:14,066 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=768493.3333333334, ans=0.125 2023-09-30 16:23:15,174 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_599-247317-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:23:16,431 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.461e+02 1.838e+02 2.020e+02 2.456e+02 4.416e+02, threshold=4.040e+02, percent-clipped=2.0 2023-09-30 16:23:18,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_209-239267-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:18,199 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_46-207599-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:23:21,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_912-282412-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:23:26,381 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:23:26,388 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_296-192597-0 from training. Duration: 0.81 2023-09-30 16:23:27,887 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_571-220562-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:23:27,916 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0 from training. Duration: 0.98 2023-09-30 16:23:32,453 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_149-85602-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:23:32,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:23:37,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1099-33689-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:37,423 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_301-10732-0 from training. Duration: 0.81 2023-09-30 16:23:40,528 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_403-34698-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:23:40,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:23:40,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_527-238990-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:41,963 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_426-7328-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:23:45,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_189-317158-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:23:45,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.attention_skip_rate, batch_count=768626.6666666666, ans=0.0 2023-09-30 16:23:46,604 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_162-103620-0 from training. Duration: 0.93 2023-09-30 16:23:46,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_49-234546-0 from training. Duration: 0.83 2023-09-30 16:23:48,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_370-331600-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:23:48,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_166-59042-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:23:48,512 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_138-332773-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:23:49,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:23:53,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _27967_210_12140_1_1532588391859_8419130_737-41898-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:23:54,770 INFO [train.py:1039] (3/4) Epoch 22, batch 3750, loss[loss=0.2549, simple_loss=0.3159, pruned_loss=0.09689, over 19472.00 frames. ], tot_loss[loss=0.1747, simple_loss=0.2515, pruned_loss=0.04896, over 4714408.92 frames. ], batch size: 388, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:23:54,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:23:56,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _41817_210_18835_1_1533434477160_7423049_352-42537-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:23:57,107 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.nonlin_attention.whiten2.whitening_limit, batch_count=768693.3333333334, ans=15.0 2023-09-30 16:23:59,327 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_431-294102-0 from training. Duration: 0.89 2023-09-30 16:24:00,885 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_454-314901-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:24:02,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:24:04,039 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0 from training. Duration: 0.78 2023-09-30 16:24:04,130 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:24:05,659 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_96-177176-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:05,808 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_246-348742-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:24:06,378 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=10.95 vs. limit=15.0 2023-09-30 16:24:06,636 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.03 vs. limit=15.0 2023-09-30 16:24:09,162 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_65-169397-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:11,481 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward3.hidden_balancer.prob, batch_count=768760.0, ans=0.125 2023-09-30 16:24:14,325 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_1003-99642-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:17,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_320-14078-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:24:19,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_367-61330-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:24:20,675 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_515-298514-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:24:24,966 WARNING [train.py:1197] (3/4) Exclude cut with ID _53731_210_7994_1_1534553979244_3757214_471-320815-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:25,060 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_306-232480-0 from training. Duration: 0.96 2023-09-30 16:24:26,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _6570_210_4557_1_1527850889148_7204330_478-162247-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:28,125 WARNING [train.py:1197] (3/4) Exclude cut with ID _56423_210_5854_1_1534896061049_3165969_345-284696-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:28,179 WARNING [train.py:1197] (3/4) Exclude cut with ID _44778_210_2054_1_1533798127203_7443090_600-122253-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:24:31,894 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_199-295611-0 from training. Duration: 0.55 2023-09-30 16:24:35,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_734-129761-0 from training. Duration: 0.82 2023-09-30 16:24:36,509 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_349-169257-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:24:37,869 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:24:40,910 WARNING [train.py:1197] (3/4) Exclude cut with ID _16908_210_5724_1_1531119157248_3772700_61-258978-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:24:46,683 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_172-156931-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:48,163 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0_sp1.1 from training. Duration: 0.42725 2023-09-30 16:24:51,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_116-332008-0 from training. Duration: 0.98 2023-09-30 16:24:53,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_358-114789-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:24:56,294 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_152-212533-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:24:56,396 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_267-214116-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:25:00,958 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:25:06,151 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_577-332600-0_sp0.9 from training. Duration: 0.6333125 2023-09-30 16:25:07,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:25:09,348 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_141-89727-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:25:09,442 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.2.prob, batch_count=768960.0, ans=0.125 2023-09-30 16:25:10,898 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:25:14,005 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_401-53423-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:25:16,063 INFO [train.py:1039] (3/4) Epoch 22, batch 3800, loss[loss=0.1485, simple_loss=0.2319, pruned_loss=0.03258, over 24572.00 frames. ], tot_loss[loss=0.1749, simple_loss=0.2515, pruned_loss=0.0491, over 4718961.68 frames. ], batch size: 60, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:25:23,519 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_422-227034-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:25:26,742 WARNING [train.py:1197] (3/4) Exclude cut with ID _4184_210_2550_1_1526730192013_2219380_102-250299-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:28,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp0.9 from training. Duration: 0.5444375 2023-09-30 16:25:29,644 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_271-297505-0 from training. Duration: 0.99 2023-09-30 16:25:31,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _35941_210_15379_1_1532995294515_4023089_213-142973-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:31,368 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_162-63018-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:32,984 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_365-217119-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:25:35,304 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.balancer2.prob, batch_count=769093.3333333334, ans=0.125 2023-09-30 16:25:36,503 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_662-275621-0_sp0.9 from training. Duration: 0.4666875 2023-09-30 16:25:36,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_374-187555-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:36,639 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_289-180904-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:25:38,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_189-47206-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:25:38,261 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:25:39,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_576-76621-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:41,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_253-246386-0 from training. Duration: 0.9 2023-09-30 16:25:44,167 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_372-6869-0_sp1.1 from training. Duration: 0.4 2023-09-30 16:25:44,263 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_21434_210_4238_1_1531916339_1222252_22-54954-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:25:47,358 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_492-170361-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:25:47,414 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.conv.8.prob, batch_count=769160.0, ans=0.125 2023-09-30 16:25:50,959 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:25:51,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _8365_210_2064_1_1528592311070_4677690_371-122295-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:25:53,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:25:53,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_372-5678-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:25:55,228 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.hidden_balancer.prob, batch_count=769160.0, ans=0.125 2023-09-30 16:25:56,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _52892_210_6721_1_1534378941259_4124129_90-165108-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:25:56,536 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_143-339198-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:26:00,876 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.554e+02 1.761e+02 1.975e+02 2.225e+02 3.481e+02, threshold=3.951e+02, percent-clipped=0.0 2023-09-30 16:26:02,517 WARNING [train.py:1197] (3/4) Exclude cut with ID _14268_210_9206_1_1530622771285_4714099_637-86630-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:26:02,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0 from training. Duration: 0.7 2023-09-30 16:26:02,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_178-6946-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:11,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_460-163885-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:15,930 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_270-26053-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:26:17,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0 from training. Duration: 0.75 2023-09-30 16:26:19,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _5289_210_2084_1_1527678202736_3779299_142-103317-0 from training. Duration: 0.84 2023-09-30 16:26:19,309 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_150-143438-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:22,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _27894_210_12392_1_1532415496078_6902439_668-269779-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:26:23,180 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward3.hidden_balancer.prob, batch_count=769293.3333333334, ans=0.125 2023-09-30 16:26:24,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_56-222075-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:24,564 WARNING [train.py:1197] (3/4) Exclude cut with ID _4092_210_2151_1_1526643044449_3135759_356-278694-0 from training. Duration: 0.47 2023-09-30 16:26:29,719 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0 from training. Duration: 0.99 2023-09-30 16:26:29,738 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0 from training. Duration: 0.65 2023-09-30 16:26:29,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_239-232545-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:31,881 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=512, metric=8.42 vs. limit=15.0 2023-09-30 16:26:32,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_153-27432-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:26:39,665 INFO [train.py:1039] (3/4) Epoch 22, batch 3850, loss[loss=0.1735, simple_loss=0.2658, pruned_loss=0.04061, over 24554.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2509, pruned_loss=0.04861, over 4717808.24 frames. ], batch size: 71, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:26:39,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_37-41919-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:26:39,852 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_196-289637-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:26:44,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:26:44,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_557-324258-0 from training. Duration: 0.81 2023-09-30 16:26:46,204 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_1012-279473-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:26:46,362 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_66916_210_14656_1_1535725525_326566_7-92649-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:26:49,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_480-260269-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:26:53,254 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_121-155001-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:26:54,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0_sp1.1 from training. Duration: 0.52725 2023-09-30 16:26:56,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _47790_210_16187_1_1533862814066_4207519_2-39653-0 from training. Duration: 0.98 2023-09-30 16:26:59,075 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=769426.6666666666, ans=0.0 2023-09-30 16:27:03,977 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_112-229012-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:07,034 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_626-79528-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:27:09,258 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_856-90052-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:10,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_122-109023-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:27:12,362 WARNING [train.py:1197] (3/4) Exclude cut with ID _64414_210_17301_1_1535243252048_3757759_37-70328-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:12,470 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_622-344907-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:27:12,555 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_410-321956-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:12,575 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:27:14,216 WARNING [train.py:1197] (3/4) Exclude cut with ID _37537_210_2253_1_1533171758568_3421949_127-285192-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:14,555 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.conv_module1.balancer2.prob, batch_count=769493.3333333334, ans=0.125 2023-09-30 16:27:15,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_723-228168-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:15,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_426-246984-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:16,248 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.self_attn_weights.pos_emb_skip_rate, batch_count=769493.3333333334, ans=0.0 2023-09-30 16:27:17,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_399-156791-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:27:17,513 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_391-22148-0 from training. Duration: 0.77 2023-09-30 16:27:17,554 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_64-297072-0 from training. Duration: 0.91 2023-09-30 16:27:19,093 WARNING [train.py:1197] (3/4) Exclude cut with ID _17875_210_2087_1_1531224012907_1821339_225-280015-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:19,144 WARNING [train.py:1197] (3/4) Exclude cut with ID _50655_210_18628_1_1534229942193_3772154_325-246169-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:22,211 WARNING [train.py:1197] (3/4) Exclude cut with ID _29529_210_7234_1_1532393961455_3977460_597-257348-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:22,278 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_77-173485-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:23,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _6270_210_2084_1_1527850911573_3856420_26-277343-0 from training. Duration: 0.82 2023-09-30 16:27:25,779 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0 from training. Duration: 0.67 2023-09-30 16:27:28,851 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_293-145278-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:30,994 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_257-335162-0 from training. Duration: 0.95 2023-09-30 16:27:32,656 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_619-344499-0_sp0.9 from training. Duration: 0.7 2023-09-30 16:27:39,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_456-315379-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:40,873 WARNING [train.py:1197] (3/4) Exclude cut with ID _55857_210_2346_1_1534644007831_6898209_631-221692-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:27:44,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _64631_210_27767_1_1535349580068_4165160_324-3682-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:27:46,105 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_317-211107-0 from training. Duration: 0.92 2023-09-30 16:27:47,801 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_311-61262-0 from training. Duration: 0.65 2023-09-30 16:27:50,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_365-68882-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:52,149 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_816-181825-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:27:53,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:27:53,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:27:55,312 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68095_210_20185_1_1535718226_8888855_1009-164755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,475 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_400-103813-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:55,476 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_491-261923-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:27:55,485 WARNING [train.py:1197] (3/4) Exclude cut with ID _31745_210_12742_1_1532739458238_7670330_712-342563-0 from training. Duration: 0.97 2023-09-30 16:27:56,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _41161_210_20034_1_1533431249657_3555314_175-190032-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:27:58,473 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0 from training. Duration: 0.89 2023-09-30 16:27:58,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_225-70961-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:27:58,530 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_235-175994-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:02,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_907-182949-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:28:03,544 INFO [train.py:1039] (3/4) Epoch 22, batch 3900, loss[loss=0.1821, simple_loss=0.2512, pruned_loss=0.05655, over 23775.00 frames. ], tot_loss[loss=0.1727, simple_loss=0.2492, pruned_loss=0.04808, over 4718882.98 frames. ], batch size: 164, lr: 4.65e-03, grad_scale: 16.0 2023-09-30 16:28:03,605 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_94-193692-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:03,955 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=769693.3333333334, ans=0.0 2023-09-30 16:28:05,769 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_342-168068-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:28:05,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_368-74239-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:28:05,878 WARNING [train.py:1197] (3/4) Exclude cut with ID _44831_210_13427_1_1533816011549_3689035_291-295928-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:28:05,992 WARNING [train.py:1197] (3/4) Exclude cut with ID _56406_210_9288_1_1534899618664_3496149_455-131997-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:07,251 WARNING [train.py:1197] (3/4) Exclude cut with ID _6473_210_1741_1_1527901307396_3815380_108-17335-0 from training. Duration: 0.65 2023-09-30 16:28:07,356 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_286-135600-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:11,822 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_928-321675-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:12,070 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.ff3_skip_rate, batch_count=769693.3333333334, ans=0.0 2023-09-30 16:28:13,874 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_81-300212-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:13,946 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:28:15,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _61079_210_4525_1_1534921008198_4011419_228-176447-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:28:16,248 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.98 vs. limit=15.0 2023-09-30 16:28:17,145 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_372-198878-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:28:19,078 WARNING [train.py:1197] (3/4) Exclude cut with ID _64521_210_14699_1_1535243427259_3815328_244-208725-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:20,766 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8275_210_4198_1_1528455320_3892571_491-17707-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:28:22,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0 from training. Duration: 0.81 2023-09-30 16:28:22,350 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_690-286055-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:23,949 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_89-321955-0 from training. Duration: 0.8 2023-09-30 16:28:23,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_213-98721-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:28:25,426 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0 from training. Duration: 0.97 2023-09-30 16:28:26,161 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward3.out_whiten, num_groups=1, num_channels=512, metric=12.89 vs. limit=15.0 2023-09-30 16:28:27,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_498-5039-0 from training. Duration: 0.78 2023-09-30 16:28:30,256 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_146-276944-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:31,044 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.feed_forward1.out_whiten, num_groups=1, num_channels=384, metric=9.41 vs. limit=15.0 2023-09-30 16:28:31,722 WARNING [train.py:1197] (3/4) Exclude cut with ID _63171_210_15488_1_1535104792794_3295110_155-34488-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:28:31,744 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:28:33,215 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_63-101996-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:28:39,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_271-269084-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:28:41,888 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:28:46,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:28:46,404 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_209-65306-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:28:48,312 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.523e+02 1.859e+02 2.113e+02 2.353e+02 3.355e+02, threshold=4.226e+02, percent-clipped=0.0 2023-09-30 16:28:48,425 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26433_210_12108_1_1532157158_4056480_317-335807-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:28:55,176 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_238-94127-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:28:55,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_207-132038-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:29:00,892 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.feed_forward3.out_whiten, num_groups=1, num_channels=192, metric=11.43 vs. limit=15.0 2023-09-30 16:29:02,850 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_167-137255-0_sp1.1 from training. Duration: 0.5818125 2023-09-30 16:29:03,028 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:29:12,329 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.balancer2.prob, batch_count=769960.0, ans=0.125 2023-09-30 16:29:13,509 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_253-110617-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:15,764 WARNING [train.py:1197] (3/4) Exclude cut with ID _39290_210_4220_1_1533862778760_3075118_92-311600-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:17,725 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_42-142473-0 from training. Duration: 0.72 2023-09-30 16:29:17,792 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2296_210_2087_1_1525503448_3397335_57-185430-0 from training. Duration: 0.6 2023-09-30 16:29:19,261 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:29:19,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_222-198127-0 from training. Duration: 0.84 2023-09-30 16:29:21,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_423-202699-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:29:22,433 WARNING [train.py:1197] (3/4) Exclude cut with ID _15037_210_2253_1_1530752294104_3267650_13-130361-0 from training. Duration: 0.99 2023-09-30 16:29:25,801 INFO [train.py:1039] (3/4) Epoch 22, batch 3950, loss[loss=0.1538, simple_loss=0.2437, pruned_loss=0.03194, over 24631.00 frames. ], tot_loss[loss=0.1725, simple_loss=0.2491, pruned_loss=0.04799, over 4712447.88 frames. ], batch size: 68, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:29:28,576 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer1.prob, batch_count=770026.6666666666, ans=0.125 2023-09-30 16:29:29,693 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_628-198080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:29:31,145 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3171_210_2087_1_1526109084_2330007_37-64334-0 from training. Duration: 0.82 2023-09-30 16:29:31,238 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_12-221186-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:29:34,264 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_1103-56445-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:29:37,182 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_169-140068-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:29:43,231 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0118-331961 from training. Duration: 0.896 2023-09-30 16:29:43,336 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_402-102311-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:43,386 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_202-293523-0 from training. Duration: 0.72 2023-09-30 16:29:44,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0679W0038-335846 from training. Duration: 0.768 2023-09-30 16:29:44,826 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37357_210_17024_1_1533089504_2316121_79-198866-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:29:48,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_292-271285-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:48,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_377-296839-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:29:48,124 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_45886_210_17024_1_1533704917_2435333_58-131069-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:29:52,373 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_395-185060-0 from training. Duration: 0.95 2023-09-30 16:29:54,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_304-266479-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:29:55,409 WARNING [train.py:1197] (3/4) Exclude cut with ID _3867_210_1883_1_1526641200575_4475830_319-349850-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:29:55,436 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_304-59227-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:29:55,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _8021_210_7994_1_1528376451358_3653069_100-140231-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:29:56,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_329-125156-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:29:57,456 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module1.balancer2.min_abs, batch_count=770160.0, ans=0.5 2023-09-30 16:30:01,348 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=770160.0, ans=0.125 2023-09-30 16:30:01,644 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn1.whiten, num_groups=1, num_channels=384, metric=13.83 vs. limit=22.5 2023-09-30 16:30:07,505 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_792-287234-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:30:08,884 WARNING [train.py:1197] (3/4) Exclude cut with ID _19708_210_9206_1_1532136650733_4238364_347-65240-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:30:13,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_70-27732-0 from training. Duration: 0.94 2023-09-30 16:30:18,393 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=770226.6666666666, ans=0.0 2023-09-30 16:30:19,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_397-132565-0 from training. Duration: 0.99 2023-09-30 16:30:19,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_624-195301-0 from training. Duration: 0.85 2023-09-30 16:30:19,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:30:21,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_354-344695-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:30:29,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:30:29,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _15012_210_1968_1_1530856820429_5095350_688-178849-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:30:31,543 WARNING [train.py:1197] (3/4) Exclude cut with ID _25565_210_12392_1_1532423009382_3952098_372-69023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:30:31,584 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_61-327062-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:30:31,641 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0 from training. Duration: 0.62 2023-09-30 16:30:33,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer1.prob, batch_count=770293.3333333334, ans=0.125 2023-09-30 16:30:37,765 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_6-226750-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:30:39,415 WARNING [train.py:1197] (3/4) Exclude cut with ID _14348_210_8341_1_1530599434996_3778640_240-232370-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:30:43,903 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_468-189058-0 from training. Duration: 0.96 2023-09-30 16:30:48,426 INFO [train.py:1039] (3/4) Epoch 22, batch 4000, loss[loss=0.1735, simple_loss=0.2408, pruned_loss=0.05312, over 23458.00 frames. ], tot_loss[loss=0.1734, simple_loss=0.2499, pruned_loss=0.04849, over 4714699.15 frames. ], batch size: 285, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:30:53,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _21987_210_5854_1_1531821500482_3637038_247-93340-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:30:58,902 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer2.prob, batch_count=770360.0, ans=0.125 2023-09-30 16:31:02,185 WARNING [train.py:1197] (3/4) Exclude cut with ID _66868_210_9527_1_1535509287954_4561140_436-191602-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:06,917 WARNING [train.py:1197] (3/4) Exclude cut with ID _18895_210_6228_1_1531357274234_3784930_394-58203-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:07,024 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_258-281498-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:31:08,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33348_210_5632_1_1533086912_3324030_245-292752-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:31:08,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0 from training. Duration: 0.93 2023-09-30 16:31:08,574 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:31:10,686 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_245-174553-0 from training. Duration: 0.88 2023-09-30 16:31:10,695 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_231-104736-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:31:10,716 WARNING [train.py:1197] (3/4) Exclude cut with ID _44585_210_18628_1_1533605350387_4518199_280-73534-0 from training. Duration: 0.89 2023-09-30 16:31:14,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_22-201476-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:31:17,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_325-62802-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:31:17,484 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_199-274062-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:31:17,489 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_8-319957-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:31:17,529 WARNING [train.py:1197] (3/4) Exclude cut with ID _29343_210_4381_1_1532602757752_3674109_224-349726-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:17,537 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_520-126729-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:31:19,142 WARNING [train.py:1197] (3/4) Exclude cut with ID _44360_210_4381_1_1533725927656_4014058_239-22076-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:31:20,242 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.0.conv_module2.whiten, num_groups=1, num_channels=192, metric=8.78 vs. limit=15.0 2023-09-30 16:31:20,724 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9012W0011-501146 from training. Duration: 0.896 2023-09-30 16:31:20,875 WARNING [train.py:1197] (3/4) Exclude cut with ID _29857_210_20034_1_1532395886753_3798160_279-185801-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:31:22,285 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_261-223933-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:23,925 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9029W0025-509426 from training. Duration: 0.896 2023-09-30 16:31:24,122 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.feed_forward2.hidden_balancer.prob, batch_count=770493.3333333334, ans=0.125 2023-09-30 16:31:25,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:31:25,380 WARNING [train.py:1197] (3/4) Exclude cut with ID _68635_210_9317_1_1535770813410_4315870_159-332599-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:27,104 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=770493.3333333334, ans=0.0 2023-09-30 16:31:32,069 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_161-276996-0 from training. Duration: 0.95 2023-09-30 16:31:33,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7088_210_1874_1_1527939776_1101113_127-68870-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:31:34,782 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.602e+02 1.863e+02 2.026e+02 2.291e+02 3.253e+02, threshold=4.053e+02, percent-clipped=0.0 2023-09-30 16:31:38,316 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_322-217220-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:31:39,740 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0002-530636 from training. Duration: 0.768 2023-09-30 16:31:41,348 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_340-336408-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:31:42,848 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_395-37222-0 from training. Duration: 0.76 2023-09-30 16:31:42,855 WARNING [train.py:1197] (3/4) Exclude cut with ID _64099_210_17301_1_1535526977656_1578188_159-209156-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:31:44,370 WARNING [train.py:1197] (3/4) Exclude cut with ID _53950_210_5816_1_1534417264919_3862951_28-29681-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:44,495 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527250689464_120889_15-126760-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:31:46,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:31:46,872 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_436-186311-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:31:48,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_452-172147-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:31:49,186 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_80-147301-0 from training. Duration: 0.84 2023-09-30 16:31:49,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_351-107588-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:31:50,784 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9111W0002-550781 from training. Duration: 0.896 2023-09-30 16:31:52,704 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=770560.0, ans=0.125 2023-09-30 16:31:55,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:32:00,029 WARNING [train.py:1197] (3/4) Exclude cut with ID _13938_210_9988_1_1530518718978_3693960_304-112784-0_sp1.1 from training. Duration: 0.4181875 2023-09-30 16:32:00,244 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder_embed.convnext.hidden_balancer.prob, batch_count=770626.6666666666, ans=0.125 2023-09-30 16:32:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_202-67017-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:32:01,750 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_448-4358-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:03,161 WARNING [train.py:1197] (3/4) Exclude cut with ID _66840_210_5219_1_1535452168426_4196349_203-243234-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:32:03,311 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_201-213513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:08,814 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_117-187560-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:32:09,028 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.nonlin_attention.balancer.prob, batch_count=770626.6666666666, ans=0.125 2023-09-30 16:32:11,942 INFO [train.py:1039] (3/4) Epoch 22, batch 4050, loss[loss=0.1973, simple_loss=0.2621, pruned_loss=0.06631, over 22750.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2506, pruned_loss=0.04863, over 4714546.63 frames. ], batch size: 322, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:32:13,447 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_135-174080-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:32:14,912 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0 from training. Duration: 0.72 2023-09-30 16:32:15,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:32:15,821 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=7.88 vs. limit=15.0 2023-09-30 16:32:16,532 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_235-307337-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:32:16,673 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7762_210_1794_1_1528199508_4057408_419-125009-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:32:16,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.balancer1.prob, batch_count=770693.3333333334, ans=0.125 2023-09-30 16:32:18,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_215-141059-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:18,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _27013_210_1389_1_1532437173240_3252649_222-125573-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:24,269 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_126-274694-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:32:28,619 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2592_210_2290_1_1525697025_4882925_157-70512-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:32:30,096 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_292-265928-0_sp0.9 from training. Duration: 0.5666875 2023-09-30 16:32:31,704 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1211-226066-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:32:33,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_394-265747-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:32:36,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _48782_210_12658_1_1534467701544_2300422_239-67139-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:32:36,603 WARNING [train.py:1197] (3/4) Exclude cut with ID _6271_210_2084_1_1528023675565_3989259_329-113173-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:32:39,798 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_81-75849-0_sp0.9 from training. Duration: 0.4333125 2023-09-30 16:32:41,879 WARNING [train.py:1197] (3/4) Exclude cut with ID _9235_210_5219_1_1528891154253_8046120_1003-101638-0 from training. Duration: 0.73 2023-09-30 16:32:43,836 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0189-640316 from training. Duration: 0.64 2023-09-30 16:32:46,771 WARNING [train.py:1197] (3/4) Exclude cut with ID _45329_210_7005_1_1534157951211_6774339_503-287883-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:32:54,274 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0 from training. Duration: 0.64 2023-09-30 16:32:55,743 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_335-106626-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:01,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_237-44332-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:01,657 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.ff3_skip_rate, batch_count=770893.3333333334, ans=0.0 2023-09-30 16:33:04,968 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_178-147341-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:33:05,045 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_946-154426-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:33:06,395 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_326-17061-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:33:09,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_38-169920-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:33:11,116 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_283-220372-0 from training. Duration: 0.56 2023-09-30 16:33:11,129 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_252-216674-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:33:12,697 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_202-91290-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:14,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_80-74184-0 from training. Duration: 0.83 2023-09-30 16:33:14,478 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.4.encoder.layers.2.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:33:14,495 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.ff2_skip_rate, batch_count=770893.3333333334, ans=0.0 2023-09-30 16:33:19,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_411-271358-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:33:26,119 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0 from training. Duration: 0.85 2023-09-30 16:33:27,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_410-109759-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:33:27,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _14224_210_4381_1_1530529062037_4264010_364-22817-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:33:29,384 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.ff2_skip_rate, batch_count=770960.0, ans=0.0 2023-09-30 16:33:30,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_89-242335-0 from training. Duration: 0.95 2023-09-30 16:33:30,648 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_151-325626-0 from training. Duration: 0.97 2023-09-30 16:33:30,650 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_786-264332-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:34,167 INFO [train.py:1039] (3/4) Epoch 22, batch 4100, loss[loss=0.1496, simple_loss=0.2232, pruned_loss=0.03799, over 24312.00 frames. ], tot_loss[loss=0.1739, simple_loss=0.2509, pruned_loss=0.04847, over 4730727.74 frames. ], batch size: 56, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:33:34,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_35-153886-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:33:34,610 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer1.prob, batch_count=771026.6666666666, ans=0.125 2023-09-30 16:33:36,359 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_116-100916-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:36,382 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:33:36,717 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.attention_skip_rate, batch_count=771026.6666666666, ans=0.0 2023-09-30 16:33:41,209 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.bypass.scale_min, batch_count=771026.6666666666, ans=0.2 2023-09-30 16:33:43,989 WARNING [train.py:1197] (3/4) Exclude cut with ID _8263_210_1664_1_1528438953494_294100_40-204731-0 from training. Duration: 0.5 2023-09-30 16:33:45,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12868_210_6753_1_1530702479_3610564_416-293746-0 from training. Duration: 0.9 2023-09-30 16:33:47,124 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_605-219732-0 from training. Duration: 0.94 2023-09-30 16:33:48,068 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.feed_forward2.out_whiten, num_groups=1, num_channels=256, metric=4.20 vs. limit=15.0 2023-09-30 16:33:48,767 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_454-123002-0 from training. Duration: 0.69 2023-09-30 16:33:48,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_25-304273-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:48,867 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527917457_3833327_451-39702-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_304-16505-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:33:50,234 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:33:51,826 WARNING [train.py:1197] (3/4) Exclude cut with ID ID0149W0001-753701 from training. Duration: 0.989 2023-09-30 16:33:55,577 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41251_210_15368_1_1533429767_4820695_394-55393-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:33:55,717 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:33:55,741 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2729_210_3528_1_1525863505_3882214_421-73047-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:33:55,942 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.bypass.skip_rate, batch_count=771093.3333333334, ans=0.07 2023-09-30 16:33:57,187 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_278-154492-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:34:00,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_412-317077-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:34:01,993 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_727-138317-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:34:02,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_345-335048-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:34:03,378 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0 from training. Duration: 0.97 2023-09-30 16:34:03,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _44718_210_5816_1_1534050051869_6469030_643-245187-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:03,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_42-186162-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:34:03,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_171-256004-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:03,576 WARNING [train.py:1197] (3/4) Exclude cut with ID _25914_210_6400_1_1532141317058_644740_43-248478-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:34:04,972 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_577-287150-0 from training. Duration: 0.82 2023-09-30 16:34:08,691 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46015_210_7234_1_1533863319_3087837_376-276068-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:08,889 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0 from training. Duration: 0.56 2023-09-30 16:34:10,364 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_452-96101-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:34:11,929 WARNING [train.py:1197] (3/4) Exclude cut with ID _13252_210_1925_1_1530671372047_3336909_263-168388-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:34:11,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_469-94673-0 from training. Duration: 0.79 2023-09-30 16:34:13,428 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_303-228738-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:34:14,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_262-250826-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:34:15,314 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.out_combiner.scale_min, batch_count=771160.0, ans=0.2 2023-09-30 16:34:16,226 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:34:17,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_265-323810-0 from training. Duration: 0.96 2023-09-30 16:34:19,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_120-76843-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:34:20,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587722_3576686_295-258479-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:34:22,265 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.433e+02 1.834e+02 2.084e+02 2.360e+02 3.426e+02, threshold=4.169e+02, percent-clipped=0.0 2023-09-30 16:34:22,573 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0 from training. Duration: 0.77 2023-09-30 16:34:24,533 WARNING [train.py:1197] (3/4) Exclude cut with ID _42326_210_7770_1_1533814282531_3707269_113-308323-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:34:24,616 WARNING [train.py:1197] (3/4) Exclude cut with ID _7852_210_5219_1_1528286471316_8139400_242-105336-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:27,643 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_114-277905-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:29,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.conv_module2.balancer1.prob, batch_count=771226.6666666666, ans=0.125 2023-09-30 16:34:35,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13118_210_7925_1_1530755612_7608297_42-66683-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:34:38,372 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_901-79438-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:38,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.conv_skip_rate, batch_count=771293.3333333334, ans=0.0 2023-09-30 16:34:39,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30280_210_1881_1_1532872779_3660139_211-182194-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:34:48,823 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_621-45732-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:34:48,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _35350_210_16040_1_1533040191183_4075299_224-133775-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:34:53,297 WARNING [train.py:1197] (3/4) Exclude cut with ID _69597_210_12210_1_1535760025639_3639180_147-293311-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:34:53,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:34:54,902 INFO [train.py:1039] (3/4) Epoch 22, batch 4150, loss[loss=0.1718, simple_loss=0.2325, pruned_loss=0.0556, over 23363.00 frames. ], tot_loss[loss=0.1746, simple_loss=0.2512, pruned_loss=0.049, over 4716025.63 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:34:56,707 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:34:58,608 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:34:58,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _49728_210_12651_1_1534041034294_6788150_66-43496-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:34:58,736 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_28-36615-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:02,344 WARNING [train.py:1197] (3/4) Exclude cut with ID _14349_210_8341_1_1530615710818_3442470_115-285975-0 from training. Duration: 0.86 2023-09-30 16:35:02,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _12734_210_8341_1_1530183614296_3891929_236-47464-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:03,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_177-236809-0 from training. Duration: 0.49 2023-09-30 16:35:03,926 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_371-3221-0 from training. Duration: 0.9 2023-09-30 16:35:03,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_48-338976-0 from training. Duration: 0.89 2023-09-30 16:35:06,829 WARNING [train.py:1197] (3/4) Exclude cut with ID _15668_210_9659_1_1531270921894_8361359_1167-309744-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:35:10,245 WARNING [train.py:1197] (3/4) Exclude cut with ID _30327_210_16049_1_1532484161295_3924169_401-70819-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:35:10,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_346-91022-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:12,605 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.ff2_skip_rate, batch_count=771426.6666666666, ans=0.0 2023-09-30 16:35:15,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_174-118801-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:17,497 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_269-278006-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:18,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_390-339137-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:35:19,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_253-308377-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:35:19,237 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_240-271367-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:35:20,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1015-343884-0_sp0.9 from training. Duration: 0.688875 2023-09-30 16:35:25,365 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_335-48210-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:35:28,690 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_67642_210_26772_1_1535612640_7786046_309-65177-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:30,068 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_231-137892-0 from training. Duration: 0.99 2023-09-30 16:35:32,334 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_57-270037-0 from training. Duration: 0.77 2023-09-30 16:35:32,343 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:35:35,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_106-300985-0 from training. Duration: 0.9 2023-09-30 16:35:35,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _14632_210_9988_1_1530691279392_3923200_161-129530-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:35:35,367 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_514-88370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:37,160 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=771493.3333333334, ans=0.125 2023-09-30 16:35:38,420 WARNING [train.py:1197] (3/4) Exclude cut with ID _56495_210_4234_1_1534748080334_4136219_305-334505-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:39,940 WARNING [train.py:1197] (3/4) Exclude cut with ID _42875_210_14856_1_1533704397487_4012433_61-264914-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:35:43,168 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_91-148831-0 from training. Duration: 0.99 2023-09-30 16:35:47,058 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8127_210_4852_1_1528523923_4110370_435-313305-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:35:48,729 WARNING [train.py:1197] (3/4) Exclude cut with ID _16744_210_5986_1_1531724405384_3907567_242-111316-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:35:48,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_257-20733-0 from training. Duration: 0.76 2023-09-30 16:35:50,740 WARNING [train.py:1197] (3/4) Exclude cut with ID _8064_210_5399_1_1528372299291_4282973_4-91037-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:35:52,275 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_3-276701-0 from training. Duration: 0.91 2023-09-30 16:35:52,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.attention_skip_rate, batch_count=771560.0, ans=0.0 2023-09-30 16:35:53,817 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_36-147855-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:35:55,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_1296-298671-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:35:56,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _68965_210_1883_1_1535698962272_3497490_309-334593-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:35:58,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_128-296581-0 from training. Duration: 0.74 2023-09-30 16:35:58,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_170-299079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:35:58,420 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6287_210_3273_1_1527509506_2653716_182-199887-0_sp1.1 from training. Duration: 0.463625 2023-09-30 16:36:00,076 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_80-135200-0_sp1.1 from training. Duration: 0.7181875 2023-09-30 16:36:03,027 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_34-183685-0 from training. Duration: 0.98 2023-09-30 16:36:04,335 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8278_210_2550_1_1528637099_4220413_418-210155-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:04,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8853_210_3273_1_1528633899_818968_51-187685-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:36:04,374 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_332-221057-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:36:04,508 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2221_210_1988_1_1525597051_4935541_415-296882-0 from training. Duration: 0.77 2023-09-30 16:36:04,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _33640_210_2655_1_1533117605479_4855469_770-278185-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:36:04,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp1.1 from training. Duration: 0.5454375 2023-09-30 16:36:06,578 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_142-276839-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:36:08,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _28394_210_8913_1_1532430049518_3650118_126-139371-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:36:08,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_76-203867-0 from training. Duration: 0.72 2023-09-30 16:36:08,547 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=771626.6666666666, ans=0.1 2023-09-30 16:36:09,597 WARNING [train.py:1197] (3/4) Exclude cut with ID _6194_210_1534_1_1527511826160_3941194_14-282876-0_sp0.9 from training. Duration: 0.788875 2023-09-30 16:36:15,670 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:36:15,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _33920_210_4220_1_1533257851500_3084399_198-334063-0 from training. Duration: 0.94 2023-09-30 16:36:17,298 INFO [train.py:1039] (3/4) Epoch 22, batch 4200, loss[loss=0.1732, simple_loss=0.2226, pruned_loss=0.06185, over 22594.00 frames. ], tot_loss[loss=0.1731, simple_loss=0.2492, pruned_loss=0.04847, over 4712835.09 frames. ], batch size: 322, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:36:19,472 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_4-266348-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:36:19,845 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=771693.3333333334, ans=0.125 2023-09-30 16:36:23,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23856_210_4238_1_1531994785_3087934_122-38075-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:23,286 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_276-96593-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:36:24,778 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_308-278734-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:24,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5190_210_4557_1_1527245078_2965646_353-250603-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:36:27,762 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_472-216663-0 from training. Duration: 0.92 2023-09-30 16:36:29,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=771693.3333333334, ans=0.1 2023-09-30 16:36:30,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0 from training. Duration: 0.84 2023-09-30 16:36:30,999 WARNING [train.py:1197] (3/4) Exclude cut with ID _29003_210_19596_1_1532507391720_3894098_559-236660-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:33,971 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_312-204798-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:36,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_17-309070-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:36:40,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp0.9 from training. Duration: 0.588875 2023-09-30 16:36:41,449 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41452_210_7291_1_1533437687_3941281_405-11559-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:36:41,488 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_48730_210_5399_1_1533885886_2212769_157-124052-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:42,861 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_415-78792-0 from training. Duration: 0.81 2023-09-30 16:36:42,868 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_345-340690-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:36:44,438 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_124-8138-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:46,006 WARNING [train.py:1197] (3/4) Exclude cut with ID _49258_210_7994_1_1534122017863_3738861_194-24864-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:36:46,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_369-222060-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:36:47,507 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:36:50,541 WARNING [train.py:1197] (3/4) Exclude cut with ID _13253_210_1925_1_1530757775819_3577873_191-144977-0 from training. Duration: 0.78 2023-09-30 16:36:51,931 WARNING [train.py:1197] (3/4) Exclude cut with ID _30304_210_6319_1_1532476827261_7582060_1128-140266-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:36:57,202 WARNING [train.py:1197] (3/4) Exclude cut with ID _8772_210_1850_1_1528635595972_3664011_247-336448-0_sp0.9 from training. Duration: 0.82225 2023-09-30 16:36:57,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_318-59097-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:37:00,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_540-131164-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:37:00,557 WARNING [train.py:1197] (3/4) Exclude cut with ID _68501_210_26984_1_1535680798313_4121768_39-110836-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:03,658 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_860-96198-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:37:03,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _15133_210_6228_1_1530794512074_3080379_29-264458-0 from training. Duration: 0.58 2023-09-30 16:37:03,724 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_797-348343-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:05,218 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_23-189234-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:37:06,544 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.471e+02 1.822e+02 2.029e+02 2.265e+02 3.199e+02, threshold=4.057e+02, percent-clipped=0.0 2023-09-30 16:37:08,540 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.bypass.skip_rate, batch_count=771893.3333333334, ans=0.035 2023-09-30 16:37:08,656 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.3.balancer2.prob, batch_count=771893.3333333334, ans=0.125 2023-09-30 16:37:11,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _5410_210_1388_1_1527397055795_1719020_39-112221-0_sp1.1 from training. Duration: 0.62725 2023-09-30 16:37:13,545 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_100-348441-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:20,229 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_143-86344-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:37:23,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _9034_210_2588_1_1528889403647_3661279_126-29104-0 from training. Duration: 0.85 2023-09-30 16:37:24,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _39846_210_12651_1_1533603651992_1318030_138-31771-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:37:30,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _2220_210_1819_1_1525509109898_3658680_50-99837-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:37:30,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_836-210550-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:30,603 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.conv_module1.balancer2.prob, batch_count=771960.0, ans=0.125 2023-09-30 16:37:30,650 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module2.balancer2.prob, batch_count=771960.0, ans=0.125 2023-09-30 16:37:34,511 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_306-74561-0 from training. Duration: 0.94 2023-09-30 16:37:37,768 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_177-58005-0_sp1.1 from training. Duration: 0.563625 2023-09-30 16:37:40,681 INFO [train.py:1039] (3/4) Epoch 22, batch 4250, loss[loss=0.1688, simple_loss=0.2418, pruned_loss=0.04789, over 23494.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2497, pruned_loss=0.0481, over 4734990.49 frames. ], batch size: 134, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:37:42,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_19-342632-0_sp1.1 from training. Duration: 0.82725 2023-09-30 16:37:44,357 WARNING [train.py:1197] (3/4) Exclude cut with ID _35355_210_16040_1_1533212988977_4016229_74-190076-0_sp0.9 from training. Duration: 0.888875 2023-09-30 16:37:46,151 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_module2.balancer2.prob, batch_count=772026.6666666666, ans=0.125 2023-09-30 16:37:47,280 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_285-97786-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:50,586 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:37:50,653 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_353-182855-0 from training. Duration: 0.96 2023-09-30 16:37:52,811 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_409-327730-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:37:55,831 WARNING [train.py:1197] (3/4) Exclude cut with ID _8030_210_7497_1_1528444886781_3531089_373-19403-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:37:58,911 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_231-340237-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:37:59,292 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.self_attn_weights.pos_emb_skip_rate, batch_count=772093.3333333334, ans=0.0 2023-09-30 16:38:02,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_536-57832-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:03,904 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_756-116002-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:06,283 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37152_210_9697_1_1533106573_3844188_29-239070-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:38:06,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_61-334539-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:07,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_159-28488-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:09,355 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_764-71272-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:10,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_258-178274-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:12,462 WARNING [train.py:1197] (3/4) Exclude cut with ID _7537_210_2084_1_1528502395431_3924359_14-312906-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:38:14,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_674-116639-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:16,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0 from training. Duration: 0.98 2023-09-30 16:38:20,500 WARNING [train.py:1197] (3/4) Exclude cut with ID _4067_210_2629_1_1526691682890_5258220_264-348896-0 from training. Duration: 0.85 2023-09-30 16:38:20,515 WARNING [train.py:1197] (3/4) Exclude cut with ID _51899_210_20034_1_1534129254466_3669220_233-161492-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:20,636 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_233-200023-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:38:20,676 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23850_210_4238_1_1532232597_1632433_100-229879-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:38:22,153 WARNING [train.py:1197] (3/4) Exclude cut with ID _6329_210_2084_1_1527922912620_3702240_375-245677-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:38:22,159 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40223_210_6228_1_1533298404_4812267_355-317415-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:22,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_235-81488-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:38:24,318 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.68 vs. limit=15.0 2023-09-30 16:38:27,521 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_172-215932-0_sp1.1 from training. Duration: 0.6 2023-09-30 16:38:28,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0_sp1.1 from training. Duration: 0.8 2023-09-30 16:38:32,283 WARNING [train.py:1197] (3/4) Exclude cut with ID _38020_210_5986_1_1533112252259_7006089_131-53533-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:33,692 WARNING [train.py:1197] (3/4) Exclude cut with ID _42334_210_7770_1_1534206610396_3605880_392-48154-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:35,134 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_337-181502-0 from training. Duration: 0.65 2023-09-30 16:38:35,146 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:38:37,352 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_60-146189-0 from training. Duration: 0.98 2023-09-30 16:38:38,877 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3133_210_2087_1_1526106446_2324690_243-143119-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:38:39,232 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=772226.6666666666, ans=0.125 2023-09-30 16:38:40,994 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:38:44,181 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_83-182645-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:44,225 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_403-292260-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:38:45,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_377-169391-0 from training. Duration: 0.96 2023-09-30 16:38:47,117 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.self_attn2.whiten, num_groups=1, num_channels=192, metric=10.46 vs. limit=22.5 2023-09-30 16:38:47,546 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_494-199538-0_sp1.1 from training. Duration: 0.6818125 2023-09-30 16:38:47,645 WARNING [train.py:1197] (3/4) Exclude cut with ID _3067_210_2667_1_1526212974640_4503168_119-175776-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:38:52,308 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_183-303839-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:38:55,813 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_165-38958-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:38:56,007 WARNING [train.py:1197] (3/4) Exclude cut with ID _41314_210_12651_1_1533459476543_3766460_87-272068-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:38:57,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _3541_210_2477_1_1526475408709_3263973_353-95-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:38:59,137 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_73-302813-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:01,381 WARNING [train.py:1197] (3/4) Exclude cut with ID _64220_210_9527_1_1535250151772_4875259_88-95011-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:39:01,478 WARNING [train.py:1197] (3/4) Exclude cut with ID _44902_210_16187_1_1533888006062_3792078_160-12107-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:01,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _4093_210_2151_1_1526813931242_6425420_547-5424-0 from training. Duration: 0.94 2023-09-30 16:39:02,010 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.conv_module2.whiten, num_groups=1, num_channels=256, metric=3.98 vs. limit=15.0 2023-09-30 16:39:03,053 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_268-85656-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:04,424 INFO [train.py:1039] (3/4) Epoch 22, batch 4300, loss[loss=0.1736, simple_loss=0.2569, pruned_loss=0.04516, over 23870.00 frames. ], tot_loss[loss=0.173, simple_loss=0.2497, pruned_loss=0.04816, over 4727940.81 frames. ], batch size: 86, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:39:07,892 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.balancer2.prob, batch_count=772360.0, ans=0.125 2023-09-30 16:39:09,100 WARNING [train.py:1197] (3/4) Exclude cut with ID _66210_210_12651_1_1535446812922_3365850_42-284985-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:39:09,255 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass_mid.scale_min, batch_count=772360.0, ans=0.2 2023-09-30 16:39:10,455 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_465-214726-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:14,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _19523_210_4353_1_1531476231587_7072609_50-95770-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:39:23,416 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_269-77567-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:39:23,419 WARNING [train.py:1197] (3/4) Exclude cut with ID _5261_210_2084_1_1527332833467_7186549_7-111954-0 from training. Duration: 0.56 2023-09-30 16:39:24,988 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43802_210_2290_1_1533516203_8829504_85-195240-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:39:26,579 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2822_210_2715_1_1526112586_1999587_165-36656-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:39:26,608 WARNING [train.py:1197] (3/4) Exclude cut with ID _37826_210_2046_1_1533261539501_3713100_181-78632-0_sp1.1 from training. Duration: 0.7090625 2023-09-30 16:39:26,630 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0671W0044-331887 from training. Duration: 0.896 2023-09-30 16:39:28,328 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.ff2_skip_rate, batch_count=772426.6666666666, ans=0.0 2023-09-30 16:39:31,664 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_266-137104-0_sp1.1 from training. Duration: 0.5909375 2023-09-30 16:39:33,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _27477_210_6228_1_1532224923266_3843379_137-116442-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:39:38,443 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23801_210_16223_1_1532066392_3294657_570-5439-0 from training. Duration: 0.97 2023-09-30 16:39:38,464 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_29-280622-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:39:38,500 WARNING [train.py:1197] (3/4) Exclude cut with ID 3557-8342-0013-54691-0 from training. Duration: 0.92 2023-09-30 16:39:41,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _6222_210_1385_1_1527503539679_7052170_711-15688-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:39:41,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15126_210_6753_1_1530778677_2591185_120-277707-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:39:43,578 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_8-169924-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:39:43,581 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4692_210_3528_1_1526899755_4323254_107-44401-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:39:45,029 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_265-276625-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:39:48,571 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_148-79022-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:48,744 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_292-36336-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:39:50,069 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_329-82235-0 from training. Duration: 0.82 2023-09-30 16:39:50,211 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_257-312870-0 from training. Duration: 0.88 2023-09-30 16:39:51,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_436-4493-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:39:53,166 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.888e+02 2.133e+02 2.440e+02 3.863e+02, threshold=4.266e+02, percent-clipped=0.0 2023-09-30 16:39:54,957 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_321-54129-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:54,969 WARNING [train.py:1197] (3/4) Exclude cut with ID _5287_210_2084_1_1527418883040_3878670_333-48508-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:39:54,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_244-278158-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:39:55,055 WARNING [train.py:1197] (3/4) Exclude cut with ID _46952_210_5986_1_1534230010357_3107379_232-344503-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:39:55,072 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0 from training. Duration: 0.75 2023-09-30 16:39:55,075 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_141-166600-0 from training. Duration: 0.95 2023-09-30 16:39:56,692 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_296-287678-0 from training. Duration: 0.95 2023-09-30 16:39:58,108 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_265-60343-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:39:58,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_332-256168-0 from training. Duration: 0.75 2023-09-30 16:39:58,210 WARNING [train.py:1197] (3/4) Exclude cut with ID _13679_210_7497_1_1530534293662_3355098_411-186088-0 from training. Duration: 0.94 2023-09-30 16:40:02,799 WARNING [train.py:1197] (3/4) Exclude cut with ID _48433_210_22356_1_1533888018396_3671259_458-126779-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:05,166 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0803W0380-397572 from training. Duration: 0.032 2023-09-30 16:40:05,252 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_278-272612-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:40:07,990 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.42 vs. limit=15.0 2023-09-30 16:40:08,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_491-263101-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:08,810 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_334-336778-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:40:11,841 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_50-3289-0 from training. Duration: 0.91 2023-09-30 16:40:11,950 WARNING [train.py:1197] (3/4) Exclude cut with ID _8699_210_2069_1_1528630346831_3960409_155-39766-0_sp0.9 from training. Duration: 0.8666875 2023-09-30 16:40:11,957 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_502-82145-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:13,335 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_550-43132-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:13,399 WARNING [train.py:1197] (3/4) Exclude cut with ID _14998_210_5219_1_1530705582080_7474970_446-112117-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:13,485 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3691_210_3528_1_1526468169_4062053_355-334432-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:40:16,522 WARNING [train.py:1197] (3/4) Exclude cut with ID _58605_210_12210_1_1534723195939_3904259_239-94720-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:40:18,208 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_391-344072-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:18,379 WARNING [train.py:1197] (3/4) Exclude cut with ID _52030_210_17301_1_1534382970223_5077079_558-322014-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:40:20,143 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_256-32662-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:40:25,293 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_572-185150-0 from training. Duration: 0.97 2023-09-30 16:40:26,610 INFO [train.py:1039] (3/4) Epoch 22, batch 4350, loss[loss=0.1833, simple_loss=0.2575, pruned_loss=0.05456, over 23310.00 frames. ], tot_loss[loss=0.1735, simple_loss=0.2505, pruned_loss=0.04825, over 4732214.31 frames. ], batch size: 105, lr: 4.64e-03, grad_scale: 8.0 2023-09-30 16:40:26,717 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2044_210_2087_1_1525518481_3835682_89-35748-0_sp0.9 from training. Duration: 0.87775 2023-09-30 16:40:31,302 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_88-66747-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:40:34,338 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_357-237029-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:34,928 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.conv_module2.balancer1.prob, batch_count=772693.3333333334, ans=0.125 2023-09-30 16:40:36,031 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_302-2550-0_sp1.1 from training. Duration: 0.736375 2023-09-30 16:40:36,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _9461_210_4557_1_1529060514726_3677451_59-194017-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:40:41,842 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_665-196155-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:40:44,894 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_326-315007-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:40:46,569 WARNING [train.py:1197] (3/4) Exclude cut with ID _15021_210_8750_1_1530766814621_7272480_105-328174-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:40:46,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _56494_210_4220_1_1535284837625_3033169_106-263722-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:40:49,047 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten1, num_groups=1, num_channels=192, metric=5.47 vs. limit=10.0 2023-09-30 16:40:49,706 WARNING [train.py:1197] (3/4) Exclude cut with ID _19583_210_4852_1_1531476402622_6486849_770-200430-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:40:53,247 WARNING [train.py:1197] (3/4) Exclude cut with ID _65640_210_6310_1_1535368201445_3173250_209-13008-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:40:55,421 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_212-149947-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:41:00,289 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_188-171673-0 from training. Duration: 0.58 2023-09-30 16:41:01,793 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_286-281724-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:01,914 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_16-254909-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:05,488 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.feed_forward3.out_whiten, num_groups=1, num_channels=384, metric=12.75 vs. limit=15.0 2023-09-30 16:41:07,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _46057_210_10640_1_1533693636065_3557860_52-95655-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:11,437 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_554-45617-0 from training. Duration: 0.99 2023-09-30 16:41:13,707 WARNING [train.py:1197] (3/4) Exclude cut with ID _14683_210_5219_1_1530698352608_3974859_473-344611-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:13,921 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.feed_forward2.hidden_balancer.prob, batch_count=772826.6666666666, ans=0.125 2023-09-30 16:41:15,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_17-25045-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:41:19,821 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9072W0003-530637 from training. Duration: 0.768 2023-09-30 16:41:20,319 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_skip_rate, batch_count=772893.3333333334, ans=0.0 2023-09-30 16:41:21,361 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_76-74025-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:21,453 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:41:22,921 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9084W0025-536862 from training. Duration: 0.896 2023-09-30 16:41:24,419 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0021-538392 from training. Duration: 0.768 2023-09-30 16:41:24,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528543323_645515_56-163234-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:24,489 WARNING [train.py:1197] (3/4) Exclude cut with ID _45560_210_21982_1_1533812603951_3584999_44-332643-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:41:24,598 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1285-256622-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:41:25,958 WARNING [train.py:1197] (3/4) Exclude cut with ID _52781_210_16187_1_1534297729099_1118149_141-325632-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:41:27,903 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1051-137631-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:41:27,973 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_13091_210_7925_1_1530609447_8987354_233-18333-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:41:28,270 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.feed_forward1.hidden_balancer.prob, batch_count=772893.3333333334, ans=0.125 2023-09-30 16:41:31,102 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_34008_210_10431_1_1533345424_8183247_662-317331-0 from training. Duration: 0.94 2023-09-30 16:41:31,126 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_291-248920-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:31,131 WARNING [train.py:1197] (3/4) Exclude cut with ID _65263_210_22356_1_1535763598691_3921037_274-288481-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:32,535 WARNING [train.py:1197] (3/4) Exclude cut with ID _5189_210_4557_1_1527248319428_3769229_233-168080-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:32,657 WARNING [train.py:1197] (3/4) Exclude cut with ID _54871_210_22356_1_1534665798253_3856359_135-184281-0 from training. Duration: 0.99 2023-09-30 16:41:35,525 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0012-555972 from training. Duration: 0.896 2023-09-30 16:41:35,532 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9123W0118-556077 from training. Duration: 0.896 2023-09-30 16:41:35,547 WARNING [train.py:1197] (3/4) Exclude cut with ID _8461_210_7497_1_1528506203708_3635220_406-275649-0 from training. Duration: 0.42 2023-09-30 16:41:35,785 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.1.conv_module1.balancer1.prob, batch_count=772960.0, ans=0.125 2023-09-30 16:41:38,169 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.0.layers.1.whiten, num_groups=1, num_channels=192, metric=3.59 vs. limit=12.0 2023-09-30 16:41:38,756 WARNING [train.py:1197] (3/4) Exclude cut with ID _6342_210_2084_1_1527898010029_3685279_309-13684-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:41:38,789 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_129-337802-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:41:38,821 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_649-266825-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:41:40,265 WARNING [train.py:1197] (3/4) Exclude cut with ID _40519_210_13794_1_1533715228655_7337390_895-224742-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:41:41,927 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_234-14174-0 from training. Duration: 0.82 2023-09-30 16:41:44,874 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9156W0009-572502 from training. Duration: 0.768 2023-09-30 16:41:44,888 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8074_210_4211_1_1528456606_4551715_424-245529-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:48,060 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=773026.6666666666, ans=0.1 2023-09-30 16:41:49,151 INFO [train.py:1039] (3/4) Epoch 22, batch 4400, loss[loss=0.1846, simple_loss=0.2616, pruned_loss=0.05377, over 23678.00 frames. ], tot_loss[loss=0.1744, simple_loss=0.251, pruned_loss=0.04884, over 4716427.15 frames. ], batch size: 85, lr: 4.64e-03, grad_scale: 16.0 2023-09-30 16:41:49,337 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_232-10149-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:41:49,351 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_152-30410-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:41:50,970 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.attention_skip_rate, batch_count=773026.6666666666, ans=0.0 2023-09-30 16:41:53,684 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_35441_210_16554_1_1533347572_4092680_291-82075-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:41:55,281 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_64-298917-0 from training. Duration: 0.88 2023-09-30 16:41:55,325 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp0.9 from training. Duration: 0.7655625 2023-09-30 16:41:56,828 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0 from training. Duration: 0.8 2023-09-30 16:41:56,884 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9193W0023-590322 from training. Duration: 0.896 2023-09-30 16:41:58,363 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_206-10097-0_sp1.1 from training. Duration: 0.4454375 2023-09-30 16:41:58,383 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_17-51731-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:42:01,319 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14953_210_2151_1_1530701710_5616165_710-165400-0 from training. Duration: 0.87 2023-09-30 16:42:05,634 WARNING [train.py:1197] (3/4) Exclude cut with ID _53203_210_2087_1_1534246218309_475590_61-85777-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:07,122 WARNING [train.py:1197] (3/4) Exclude cut with ID _55844_210_2346_1_1534471212569_7625707_1050-10453-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:07,142 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9218W0013-603297 from training. Duration: 0.896 2023-09-30 16:42:08,954 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_267-102818-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:08,956 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_191-41023-0 from training. Duration: 0.71 2023-09-30 16:42:09,052 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9231W0202-610227 from training. Duration: 0.896 2023-09-30 16:42:12,310 WARNING [train.py:1197] (3/4) Exclude cut with ID _6537_210_1883_1_1527987559710_3597129_382-133062-0 from training. Duration: 0.68 2023-09-30 16:42:13,734 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_43-84928-0 from training. Duration: 0.99 2023-09-30 16:42:13,777 WARNING [train.py:1197] (3/4) Exclude cut with ID _59256_210_5986_1_1535076085997_3589120_121-237386-0 from training. Duration: 0.96 2023-09-30 16:42:13,838 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_137-256986-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:15,365 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9417_210_1794_1_1529235261_4614131_479-346963-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:15,457 WARNING [train.py:1197] (3/4) Exclude cut with ID _26474_210_9570_1_1532520022699_3623380_267-7362-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:42:18,227 WARNING [train.py:1197] (3/4) Exclude cut with ID _55328_210_27767_1_1534580973260_4088170_287-61272-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:18,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_589-9070-0 from training. Duration: 0.71 2023-09-30 16:42:18,456 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp1.1 from training. Duration: 0.37275 2023-09-30 16:42:19,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_164-184548-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:22,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_6-23328-0_sp1.1 from training. Duration: 0.7818125 2023-09-30 16:42:22,160 WARNING [train.py:1197] (3/4) Exclude cut with ID _29303_210_2575_1_1532392237303_7410649_612-165069-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:42:22,518 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.conv_module2.balancer1.max_abs, batch_count=773160.0, ans=10.0 2023-09-30 16:42:23,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _2938_210_2577_1_1526108398778_6347830_383-321224-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:25,209 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_283-160525-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:42:25,890 WARNING [train.py:1197] (3/4) Exclude cut with ID _8093_210_9259_1_1528362031121_4186822_505-97987-0 from training. Duration: 0.86 2023-09-30 16:42:26,022 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0190-640317 from training. Duration: 0.768 2023-09-30 16:42:26,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.balancer_ff2.min_abs, batch_count=773160.0, ans=0.1 2023-09-30 16:42:28,945 WARNING [train.py:1197] (3/4) Exclude cut with ID _50093_210_4220_1_1534417141207_3432299_280-297390-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:42:36,389 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17292_210_6753_1_1531125388_2943734_320-71385-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:42:38,242 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.442e+02 1.760e+02 1.970e+02 2.320e+02 3.797e+02, threshold=3.941e+02, percent-clipped=0.0 2023-09-30 16:42:39,084 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=8, num_channels=256, metric=4.83 vs. limit=6.0 2023-09-30 16:42:39,895 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_213-103282-0 from training. Duration: 0.78 2023-09-30 16:42:44,455 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7615_210_4211_1_1528110965_4580160_72-235211-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:42:46,087 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_173-320977-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:42:47,772 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_783-278109-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:42:49,262 WARNING [train.py:1197] (3/4) Exclude cut with ID _3737_210_2071_1_1526691787377_3754790_90-30673-0 from training. Duration: 0.84 2023-09-30 16:42:49,307 WARNING [train.py:1197] (3/4) Exclude cut with ID _22826_210_9994_1_1531908201522_3279169_415-161-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:42:49,324 WARNING [train.py:1197] (3/4) Exclude cut with ID _6825_210_2151_1_1528023632076_2265630_86-66192-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:42:49,328 WARNING [train.py:1197] (3/4) Exclude cut with ID _4626_210_3532_1_1526817652717_3916859_446-206656-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:42:50,886 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_166-128941-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:42:55,493 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_345-167986-0 from training. Duration: 0.84 2023-09-30 16:42:55,858 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer2.min_positive, batch_count=773293.3333333334, ans=0.05 2023-09-30 16:42:58,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _6275_210_2084_1_1528110062213_7669049_973-198085-0 from training. Duration: 0.75 2023-09-30 16:43:01,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _60057_210_10640_1_1534903065793_3333150_171-181133-0 from training. Duration: 0.88 2023-09-30 16:43:01,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_336-196513-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:01,066 WARNING [train.py:1197] (3/4) Exclude cut with ID _6493_210_1747_1_1528459210424_4012470_513-140967-0 from training. Duration: 0.79 2023-09-30 16:43:01,757 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=2.09 vs. limit=6.0 2023-09-30 16:43:02,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6673_210_3273_1_1527767790_3173240_327-306185-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:43:08,868 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1032-236863-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:43:11,815 WARNING [train.py:1197] (3/4) Exclude cut with ID _14867_210_1968_1_1530684179890_3846969_413-29292-0 from training. Duration: 0.9 2023-09-30 16:43:13,084 INFO [train.py:1039] (3/4) Epoch 22, batch 4450, loss[loss=0.1917, simple_loss=0.2675, pruned_loss=0.05795, over 23231.00 frames. ], tot_loss[loss=0.174, simple_loss=0.251, pruned_loss=0.04856, over 4715933.51 frames. ], batch size: 93, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:43:16,924 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_186-343322-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:43:18,679 WARNING [train.py:1197] (3/4) Exclude cut with ID _56235_210_10640_1_1534557335235_4161099_624-46091-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:20,207 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_791-197891-0_sp0.9 from training. Duration: 0.9333125 2023-09-30 16:43:25,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_50050_210_16223_1_1535435796_7290138_786-288457-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:43:25,067 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_213-227283-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:43:28,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _42659_210_2070_1_1533607442765_3120380_58-341121-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:31,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_506-23200-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:43:33,165 WARNING [train.py:1197] (3/4) Exclude cut with ID _14170_210_5219_1_1530525949868_4469650_336-213530-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:43:33,207 WARNING [train.py:1197] (3/4) Exclude cut with ID _64135_210_2253_1_1535677267876_7608972_180-5312-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:36,718 WARNING [train.py:1197] (3/4) Exclude cut with ID _12302_210_5219_1_1529924446466_8107280_835-279754-0 from training. Duration: 0.9 2023-09-30 16:43:36,721 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_20120_210_6753_1_1531643035_5482859_510-96996-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:36,848 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_15517_210_6753_1_1530954008_3487336_216-187834-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:43:36,902 WARNING [train.py:1197] (3/4) Exclude cut with ID _19504_210_9994_1_1531648863242_3325220_414-113427-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:43:36,904 WARNING [train.py:1197] (3/4) Exclude cut with ID _14871_210_7497_1_1530665975901_3442180_264-108685-0_sp0.9 from training. Duration: 0.611125 2023-09-30 16:43:39,942 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_104-288274-0_sp0.9 from training. Duration: 0.6444375 2023-09-30 16:43:43,655 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.2.conv_module2.whiten, num_groups=1, num_channels=384, metric=2.85 vs. limit=15.0 2023-09-30 16:43:44,787 INFO [scaling.py:1118] (3/4) WithLoss: name=encoder.encoders.1.encoder.layers.1.self_attn_weights, loss-sum=0.000e+00 2023-09-30 16:43:47,387 WARNING [train.py:1197] (3/4) Exclude cut with ID _61743_210_8903_1_1534985913804_8673290_520-39496-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:47,473 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37818_210_4238_1_1533620910_4720083_315-308565-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:43:48,981 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2009_210_2071_1_1525481134_4588937_385-302100-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:43:51,082 WARNING [train.py:1197] (3/4) Exclude cut with ID _58895_210_8750_1_1535184081320_3649089_277-53219-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:43:51,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _26664_210_7770_1_1532258943280_3418270_121-111258-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:43:56,498 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_99-167392-0_sp1.1 from training. Duration: 0.5545625 2023-09-30 16:43:58,048 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7006_210_1410_1_1527900189_8900975_1113-7369-0 from training. Duration: 0.85 2023-09-30 16:43:58,073 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_352-59958-0 from training. Duration: 0.86 2023-09-30 16:43:58,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14886_210_4220_1_1530702020355_3516190_159-73748-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:44:00,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _16616_210_8254_1_1531025951922_7223339_131-119569-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:02,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_216-343007-0 from training. Duration: 0.66 2023-09-30 16:44:06,995 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_113-92608-0_sp0.9 from training. Duration: 0.6 2023-09-30 16:44:10,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_132-123271-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:10,671 WARNING [train.py:1197] (3/4) Exclude cut with ID _8793_210_6310_1_1528542985139_2310009_121-179782-0 from training. Duration: 0.8 2023-09-30 16:44:10,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _43997_210_4525_1_1533538632555_5189329_283-218639-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:10,715 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_297-78397-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:10,934 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.conv_skip_rate, batch_count=773560.0, ans=0.0 2023-09-30 16:44:12,142 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3475_210_2028_1_1526193578_3622569_377-166133-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:44:12,153 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_520-213149-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:44:13,732 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_567-144483-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:44:16,809 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp0.9 from training. Duration: 0.62225 2023-09-30 16:44:16,871 WARNING [train.py:1197] (3/4) Exclude cut with ID _49643_210_7989_1_1534640244224_3939031_611-185515-0 from training. Duration: 0.94 2023-09-30 16:44:18,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0_sp0.9 from training. Duration: 0.7333125 2023-09-30 16:44:19,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_51225_210_3601_1_1534400634_2327383_49-235441-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:44:21,562 WARNING [train.py:1197] (3/4) Exclude cut with ID _36552_210_4220_1_1533358828768_3042759_240-294400-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:44:23,070 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_640-304825-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:23,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_187-221566-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:44:23,309 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.self_attn_weights.pos_emb_skip_rate, batch_count=773626.6666666666, ans=0.0 2023-09-30 16:44:25,297 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.nonlin_attention.balancer.prob, batch_count=773626.6666666666, ans=0.125 2023-09-30 16:44:26,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _7606_210_4915_1_1528894253277_5080690_127-177761-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:44:30,220 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_430-116139-0 from training. Duration: 0.85 2023-09-30 16:44:31,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _3675_210_4557_1_1526639934408_4182089_456-18305-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:44:34,917 INFO [train.py:1039] (3/4) Epoch 22, batch 4500, loss[loss=0.1794, simple_loss=0.2649, pruned_loss=0.04693, over 24341.00 frames. ], tot_loss[loss=0.175, simple_loss=0.2513, pruned_loss=0.04937, over 4705110.51 frames. ], batch size: 74, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:44:35,477 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.feed_forward1.out_proj.dropout_p, batch_count=773693.3333333334, ans=0.1 2023-09-30 16:44:36,665 WARNING [train.py:1197] (3/4) Exclude cut with ID _18513_210_9206_1_1531618215223_3862620_402-5957-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:36,847 WARNING [train.py:1197] (3/4) Exclude cut with ID _42721_210_18835_1_1533558663552_7253119_465-156500-0 from training. Duration: 0.72 2023-09-30 16:44:36,849 WARNING [train.py:1197] (3/4) Exclude cut with ID _8343_210_1381_1_1528527665339_8011639_714-310538-0 from training. Duration: 0.86 2023-09-30 16:44:40,620 WARNING [train.py:1197] (3/4) Exclude cut with ID _39411_210_5986_1_1533538825030_3343649_335-301187-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:44:45,243 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_41750_210_1960_1_1533520678_3948042_241-158616-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:44:46,668 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_46013_210_7234_1_1533776362_3744032_50-141535-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:44:46,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_43-206757-0_sp0.9 from training. Duration: 0.8333125 2023-09-30 16:44:46,969 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer1.prob, batch_count=773693.3333333334, ans=0.125 2023-09-30 16:44:48,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _22961_210_8254_1_1531976366672_4278180_172-276296-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:44:49,668 WARNING [train.py:1197] (3/4) Exclude cut with ID _17956_210_4353_1_1531209568125_6979970_1305-301903-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:49,753 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_604-44507-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:44:59,252 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer2.prob, batch_count=773760.0, ans=0.125 2023-09-30 16:45:04,468 WARNING [train.py:1197] (3/4) Exclude cut with ID _23514_210_1389_1_1531832139785_3031439_88-337544-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:45:06,013 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_932-3672-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:45:09,196 WARNING [train.py:1197] (3/4) Exclude cut with ID _46502_210_13794_1_1533724452198_3657159_207-109991-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:09,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_216-73841-0_sp1.1 from training. Duration: 0.87275 2023-09-30 16:45:10,785 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8453_210_4211_1_1528502940_8562730_347-330673-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:45:15,570 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4183_210_2087_1_1526729100_1869838_143-280962-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:45:22,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_325-113798-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:45:24,246 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.544e+02 1.959e+02 2.152e+02 2.408e+02 4.470e+02, threshold=4.304e+02, percent-clipped=1.0 2023-09-30 16:45:26,019 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_91-212486-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:45:26,398 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.nonlin_attention.balancer.prob, batch_count=773893.3333333334, ans=0.125 2023-09-30 16:45:29,052 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8776_210_2040_1_1528638837_4088036_244-278763-0_sp1.1 from training. Duration: 0.8181875 2023-09-30 16:45:29,102 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_106-286130-0 from training. Duration: 0.98 2023-09-30 16:45:30,591 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2746_210_1988_1_1525872816_3795627_372-205861-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:30,673 WARNING [train.py:1197] (3/4) Exclude cut with ID _30800_210_22382_1_1532499347904_4515959_240-54646-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,312 WARNING [train.py:1197] (3/4) Exclude cut with ID _59500_210_8254_1_1534809610680_3736009_162-180695-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:45:32,347 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528591327_1471426_146-93357-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:45:34,797 WARNING [train.py:1197] (3/4) Exclude cut with ID _42657_210_2070_1_1533520654016_3327008_405-87432-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:45:34,834 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4528_210_3273_1_1527076654_3276777_209-219804-0 from training. Duration: 0.69 2023-09-30 16:45:34,834 WARNING [train.py:1197] (3/4) Exclude cut with ID _13687_210_7497_1_1530705699576_3168359_78-182912-0_sp0.9 from training. Duration: 0.6555625 2023-09-30 16:45:34,845 WARNING [train.py:1197] (3/4) Exclude cut with ID _31255_210_13427_1_1532865619495_3815143_322-114998-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:39,980 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_848-280957-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:45:40,025 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7696_210_2040_1_1528207167_3716099_117-125434-0_sp0.9 from training. Duration: 0.8444375 2023-09-30 16:45:43,285 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53234_210_1504_1_1534247647_8382892_1002-109192-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:45:47,538 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_138-64210-0_sp0.9 from training. Duration: 0.811125 2023-09-30 16:45:47,565 WARNING [train.py:1197] (3/4) Exclude cut with ID _25808_210_13292_1_1532343514562_7366109_633-47524-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:45:49,136 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0 from training. Duration: 0.95 2023-09-30 16:45:51,510 WARNING [train.py:1197] (3/4) Exclude cut with ID _8394_210_6702_1_1528590504316_3540189_335-274780-0 from training. Duration: 0.78 2023-09-30 16:45:51,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _8256_210_1741_1_1528505983621_3880739_301-330455-0 from training. Duration: 0.8 2023-09-30 16:45:55,169 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_383-205833-0 from training. Duration: 0.95 2023-09-30 16:45:55,367 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.conv_module1.balancer2.prob, batch_count=773960.0, ans=0.125 2023-09-30 16:45:58,082 INFO [train.py:1039] (3/4) Epoch 22, batch 4550, loss[loss=0.1614, simple_loss=0.2376, pruned_loss=0.04266, over 24586.00 frames. ], tot_loss[loss=0.1741, simple_loss=0.2503, pruned_loss=0.04901, over 4704415.70 frames. ], batch size: 60, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:45:58,246 WARNING [train.py:1197] (3/4) Exclude cut with ID _14877_210_9206_1_1530678686666_4497469_48-182155-0 from training. Duration: 0.87 2023-09-30 16:45:59,703 WARNING [train.py:1197] (3/4) Exclude cut with ID _14832_210_8750_1_1530691351539_3171348_1-54315-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:02,768 WARNING [train.py:1197] (3/4) Exclude cut with ID _44982_210_1511_1_1534312820108_3726029_413-100814-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:04,155 WARNING [train.py:1197] (3/4) Exclude cut with ID _3231_210_2106_1_1526205864934_6784220_104-261392-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:46:06,113 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.conv_module1.balancer1.prob, batch_count=774026.6666666666, ans=0.125 2023-09-30 16:46:07,830 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_1-13588-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:14,471 WARNING [train.py:1197] (3/4) Exclude cut with ID _44304_210_15374_1_1533639352765_4613129_141-8356-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:46:16,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _37892_210_19596_1_1533198677897_3964058_374-335034-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:46:18,410 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=3.13 vs. limit=15.0 2023-09-30 16:46:19,189 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:19,193 WARNING [train.py:1197] (3/4) Exclude cut with ID _14209_210_9659_1_1530838843920_9459520_1267-272693-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:46:19,194 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_690-326155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:20,781 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68847_210_14656_1_1536070214_2586843_38-20717-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:46:20,854 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_99-251314-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:46:24,759 WARNING [train.py:1197] (3/4) Exclude cut with ID _49376_210_5986_1_1533985278732_3370740_225-95630-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:46:28,191 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2655_210_2087_1_1525688959_3897400_202-147557-0 from training. Duration: 0.85 2023-09-30 16:46:28,292 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5618_210_1874_1_1527311008_5851_1-214501-0_sp1.1 from training. Duration: 0.626375 2023-09-30 16:46:28,402 WARNING [train.py:1197] (3/4) Exclude cut with ID _4865_210_2577_1_1527324271459_2832509_225-43381-0_sp1.1 from training. Duration: 0.8545625 2023-09-30 16:46:29,952 WARNING [train.py:1197] (3/4) Exclude cut with ID _7770_210_2588_1_1528207193258_3639828_242-3472-0 from training. Duration: 0.73 2023-09-30 16:46:32,036 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_whiten, num_groups=1, num_channels=256, metric=7.58 vs. limit=15.0 2023-09-30 16:46:34,339 WARNING [train.py:1197] (3/4) Exclude cut with ID _6964_210_6702_1_1527985666882_3774080_339-104791-0 from training. Duration: 0.49 2023-09-30 16:46:34,445 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_488-92261-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:34,707 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=774160.0, ans=0.0 2023-09-30 16:46:37,556 WARNING [train.py:1197] (3/4) Exclude cut with ID _14228_210_4381_1_1530788291029_3969100_175-163894-0 from training. Duration: 0.74 2023-09-30 16:46:39,203 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:46:41,566 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_116-275927-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,615 WARNING [train.py:1197] (3/4) Exclude cut with ID _4697_210_2069_1_1526815801953_3823098_61-223324-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:41,638 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6961_210_2087_1_1527937168_6815011_562-165518-0_sp1.1 from training. Duration: 0.7 2023-09-30 16:46:44,786 WARNING [train.py:1197] (3/4) Exclude cut with ID _21989_210_5854_1_1531899214931_3271360_133-44759-0 from training. Duration: 0.77 2023-09-30 16:46:48,301 WARNING [train.py:1197] (3/4) Exclude cut with ID _23557_210_8750_1_1531890106370_6846255_77-284923-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:46:51,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _13678_210_7497_1_1530530069226_3463170_464-296127-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:46:51,266 WARNING [train.py:1197] (3/4) Exclude cut with ID _22613_210_5219_1_1532253599686_4090170_393-36663-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:46:52,773 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_235-262124-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:46:53,014 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module1.balancer2.prob, batch_count=774226.6666666666, ans=0.125 2023-09-30 16:46:54,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0 from training. Duration: 0.74 2023-09-30 16:46:54,806 WARNING [train.py:1197] (3/4) Exclude cut with ID _6456_210_1534_1_1527681634212_3181049_215-309359-0 from training. Duration: 0.88 2023-09-30 16:46:54,844 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_3019_210_2028_1_1526044245_3264865_191-72235-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:46:56,475 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_380-162648-0 from training. Duration: 0.94 2023-09-30 16:46:57,520 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_8342_210_2290_1_1528628247_7646528_92-46009-0 from training. Duration: 0.54 2023-09-30 16:46:58,881 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_53-300544-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:47:00,391 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_68235_210_1925_1_1535863247_4484656_101-250943-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:00,418 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530775547361_1966965_204-22182-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:01,867 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_310-130092-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:01,899 WARNING [train.py:1197] (3/4) Exclude cut with ID _60113_210_16271_1_1535164212481_4386079_559-167191-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:47:03,449 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_93-58002-0_sp1.1 from training. Duration: 0.5181875 2023-09-30 16:47:04,787 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_110-224688-0 from training. Duration: 0.97 2023-09-30 16:47:04,987 WARNING [train.py:1197] (3/4) Exclude cut with ID _40402_210_18219_1_1533522324115_2953150_178-178004-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:47:05,001 WARNING [train.py:1197] (3/4) Exclude cut with ID _7622_210_2588_1_1528284803271_3653140_321-335475-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:47:05,471 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.balancer1.prob, batch_count=774293.3333333334, ans=0.125 2023-09-30 16:47:06,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _66863_210_18053_1_1535698758334_8590319_1092-271784-0 from training. Duration: 0.96 2023-09-30 16:47:06,594 WARNING [train.py:1197] (3/4) Exclude cut with ID _28317_210_12742_1_1532480302381_8510325_607-213951-0_sp1.1 from training. Duration: 0.763625 2023-09-30 16:47:06,625 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_468-283113-0 from training. Duration: 0.96 2023-09-30 16:47:11,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_13-165512-0_sp1.1 from training. Duration: 0.7454375 2023-09-30 16:47:11,190 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_69630_210_7234_1_1535788670_910989_56-169762-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:47:13,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _67335_210_5219_1_1535525975652_4275229_121-80357-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:47:14,857 WARNING [train.py:1197] (3/4) Exclude cut with ID _48852_210_3609_1_1533947396997_6636149_126-12079-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:47:14,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_342-270278-0_sp1.1 from training. Duration: 0.5 2023-09-30 16:47:16,446 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_30322_210_1925_1_1532843478_826817_35-328264-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:47:18,109 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_755-112625-0_sp1.1 from training. Duration: 0.67275 2023-09-30 16:47:21,076 INFO [train.py:1039] (3/4) Epoch 22, batch 4600, loss[loss=0.1624, simple_loss=0.2387, pruned_loss=0.04302, over 24476.00 frames. ], tot_loss[loss=0.1729, simple_loss=0.2487, pruned_loss=0.04861, over 4713334.98 frames. ], batch size: 58, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:47:21,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _38670_210_15379_1_1533448810659_4391059_132-146412-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:21,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _22020_210_2461_1_1531724164513_6326900_563-39691-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:47:24,965 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_9460_210_4211_1_1528954587_10049667_572-37035-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:47:24,985 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_129-166012-0_sp0.9 from training. Duration: 0.9444375 2023-09-30 16:47:25,098 WARNING [train.py:1197] (3/4) Exclude cut with ID _14925_210_13076_1_1530838593293_4602004_487-91921-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:27,213 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_195-257512-0 from training. Duration: 0.67 2023-09-30 16:47:30,752 WARNING [train.py:1197] (3/4) Exclude cut with ID _14561_210_7497_1_1530601118408_3369279_188-260011-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:47:35,341 WARNING [train.py:1197] (3/4) Exclude cut with ID _8480_210_1874_1_1528589391205_6728330_309-139896-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:47:36,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_866-321248-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:47:41,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_510-216513-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:43,379 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.conv_skip_rate, batch_count=774426.6666666666, ans=0.0 2023-09-30 16:47:46,334 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.ff2_skip_rate, batch_count=774426.6666666666, ans=0.0 2023-09-30 16:47:48,430 WARNING [train.py:1197] (3/4) Exclude cut with ID _6653_210_2629_1_1527919172441_6732679_758-143677-0 from training. Duration: 0.98 2023-09-30 16:47:49,933 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_50-214427-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:54,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_482-301330-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:47:57,394 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_490-326847-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:47:57,407 WARNING [train.py:1197] (3/4) Exclude cut with ID _45959_210_18219_1_1533801407760_7627169_984-90716-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:02,391 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_166-246557-0 from training. Duration: 0.64 2023-09-30 16:48:02,392 WARNING [train.py:1197] (3/4) Exclude cut with ID _15573_210_2253_1_1530839043359_3466350_51-200220-0_sp1.1 from training. Duration: 0.5090625 2023-09-30 16:48:04,236 WARNING [train.py:1197] (3/4) Exclude cut with ID _51382_210_23862_1_1534492801838_3650104_275-92796-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:11,416 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.522e+02 1.811e+02 1.994e+02 2.205e+02 2.930e+02, threshold=3.988e+02, percent-clipped=0.0 2023-09-30 16:48:11,544 WARNING [train.py:1197] (3/4) Exclude cut with ID _29853_210_7497_1_1532505600851_6642110_461-142829-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:11,638 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_788-298907-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:48:13,114 WARNING [train.py:1197] (3/4) Exclude cut with ID _21123_210_2159_1_1532170977769_6809270_739-2098-0_sp1.1 from training. Duration: 0.836375 2023-09-30 16:48:16,440 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528541907_1399322_165-323619-0 from training. Duration: 0.69 2023-09-30 16:48:19,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_401-255491-0_sp1.1 from training. Duration: 0.636375 2023-09-30 16:48:24,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _66873_210_5517_1_1535454136506_4293370_24-143283-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:26,113 WARNING [train.py:1197] (3/4) Exclude cut with ID _14459_210_2228_1_1531443641946_7516700_1221-280439-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:48:29,180 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_469-213482-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:29,192 WARNING [train.py:1197] (3/4) Exclude cut with ID _8766_210_1385_1_1528713012358_7359191_148-158509-0_sp0.9 from training. Duration: 0.4555625 2023-09-30 16:48:29,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _24314_210_4915_1_1532135078116_7170179_863-259095-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:29,354 WARNING [train.py:1197] (3/4) Exclude cut with ID _16530_210_1664_1_1531310502089_4347140_321-22821-0 from training. Duration: 0.89 2023-09-30 16:48:30,771 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43831_210_2158_1_1533725502_5872791_55-163137-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:30,858 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_26-25207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:31,049 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_17613_210_2151_1_1531137569_2366160_223-316461-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:48:32,493 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53679_210_4852_1_1534556726_4186141_541-213370-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:48:34,018 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530597547560_5041102_369-295086-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:34,104 WARNING [train.py:1197] (3/4) Exclude cut with ID _31663_210_15379_1_1532649651493_3857339_91-267696-0 from training. Duration: 0.87 2023-09-30 16:48:34,173 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0 from training. Duration: 0.81 2023-09-30 16:48:35,601 WARNING [train.py:1197] (3/4) Exclude cut with ID _14095_210_2253_1_1530856745572_6851360_32-335103-0 from training. Duration: 0.82 2023-09-30 16:48:35,611 WARNING [train.py:1197] (3/4) Exclude cut with ID _16430_210_9206_1_1531314014099_5003209_133-312727-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:37,490 WARNING [train.py:1197] (3/4) Exclude cut with ID _59288_210_5854_1_1535077757774_3412246_391-165934-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:37,590 WARNING [train.py:1197] (3/4) Exclude cut with ID _28323_210_16187_1_1532480395667_4100300_488-1281-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:39,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _29495_210_16304_1_1532935783634_7614860_124-193207-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:48:44,929 INFO [train.py:1039] (3/4) Epoch 22, batch 4650, loss[loss=0.1838, simple_loss=0.2271, pruned_loss=0.0703, over 19146.00 frames. ], tot_loss[loss=0.1724, simple_loss=0.2484, pruned_loss=0.04824, over 4718084.95 frames. ], batch size: 388, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:48:45,890 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.78 vs. limit=15.0 2023-09-30 16:48:49,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _14087_210_2253_1_1530529224512_3014029_226-242140-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:48:51,440 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_392-3694-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:48:52,938 WARNING [train.py:1197] (3/4) Exclude cut with ID _42756_210_9068_1_1533728815968_4541139_563-329842-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:53,023 WARNING [train.py:1197] (3/4) Exclude cut with ID _58583_210_5236_1_1534726835266_3574249_391-87755-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:48:53,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _28427_210_4220_1_1532581240967_3061069_281-237827-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:48:53,132 WARNING [train.py:1197] (3/4) Exclude cut with ID _61807_210_5081_1_1535593745440_8163120_44-147898-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:48:56,395 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24007_210_1794_1_1531918925_1663875_125-320772-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:48:59,474 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_143-117421-0 from training. Duration: 0.58 2023-09-30 16:49:04,038 WARNING [train.py:1197] (3/4) Exclude cut with ID _69584_210_5219_1_1535785133775_4044450_325-7656-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:49:06,915 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_37351_210_7925_1_1533012931_3812113_265-85780-0 from training. Duration: 0.88 2023-09-30 16:49:06,951 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_331-237896-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:49:08,461 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530604298182_956899_6-232695-0 from training. Duration: 0.94 2023-09-30 16:49:08,496 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7544_210_6753_1_1528587000_691468_87-178599-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:49:08,583 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5020_210_1794_1_1526990143_3961092_326-303945-0 from training. Duration: 0.99 2023-09-30 16:49:08,619 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_503-30719-0 from training. Duration: 0.98 2023-09-30 16:49:08,632 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_11305_210_1794_1_1529750428_7178148_299-344640-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:10,027 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534490405_2954066_265-170472-0_sp1.1 from training. Duration: 0.7545625 2023-09-30 16:49:13,540 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_12866_210_6753_1_1530613692_6684967_174-277364-0_sp1.1 from training. Duration: 0.6454375 2023-09-30 16:49:15,040 WARNING [train.py:1197] (3/4) Exclude cut with ID _58414_210_8254_1_1535421609162_7151182_193-151884-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:15,080 WARNING [train.py:1197] (3/4) Exclude cut with ID IC0651W0104-322063 from training. Duration: 0.768 2023-09-30 16:49:15,301 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.bypass.scale_min, batch_count=774826.6666666666, ans=0.2 2023-09-30 16:49:19,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _21881_210_3525_1_1531825242757_3798380_139-305982-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:22,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _61191_210_1726_1_1534904947875_7276080_79-200392-0 from training. Duration: 0.97 2023-09-30 16:49:23,783 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_451-280894-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:23,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_304-273433-0_sp1.1 from training. Duration: 0.9 2023-09-30 16:49:25,341 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5959_210_1794_1_1527400400_3445726_412-44052-0 from training. Duration: 0.77 2023-09-30 16:49:26,893 WARNING [train.py:1197] (3/4) Exclude cut with ID _4681_210_3525_1_1527251040321_1235713_148-26159-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:49:28,740 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.0.bypass.skip_rate, batch_count=774826.6666666666, ans=0.04949747468305833 2023-09-30 16:49:29,883 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_887-249555-0_sp0.9 from training. Duration: 0.8555625 2023-09-30 16:49:32,408 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.2.bypass.scale_min, batch_count=774893.3333333334, ans=0.2 2023-09-30 16:49:33,507 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_25236_210_2158_1_1532051968_280567_37-6860-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:49:35,376 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.1.conv_module2.balancer2.prob, batch_count=774893.3333333334, ans=0.125 2023-09-30 16:49:38,222 WARNING [train.py:1197] (3/4) Exclude cut with ID _29277_210_3279_1_1532581109293_3173069_417-115527-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:41,393 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_552-134748-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:49:42,749 WARNING [train.py:1197] (3/4) Exclude cut with ID _43176_210_8254_1_1533524417469_4174339_188-174143-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:49:42,822 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_127-119109-0_sp1.1 from training. Duration: 0.6545625 2023-09-30 16:49:45,934 WARNING [train.py:1197] (3/4) Exclude cut with ID _14922_210_6228_1_1530707435273_3693310_38-187851-0 from training. Duration: 0.62 2023-09-30 16:49:45,996 WARNING [train.py:1197] (3/4) Exclude cut with ID _6550_210_3525_1_1527853168673_6181929_588-125165-0 from training. Duration: 0.83 2023-09-30 16:49:46,107 WARNING [train.py:1197] (3/4) Exclude cut with ID _13682_210_7497_1_1530615502844_3345939_344-152526-0_sp1.1 from training. Duration: 0.4818125 2023-09-30 16:49:46,109 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527935634_4034444_487-236501-0 from training. Duration: 0.66 2023-09-30 16:49:48,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _23825_210_13449_1_1531915313526_3497150_379-16981-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:49:52,550 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.2.encoder.layers.1.nonlin_attention.whiten2, num_groups=1, num_channels=384, metric=5.87 vs. limit=15.0 2023-09-30 16:49:55,666 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward1.out_proj.dropout_p, batch_count=774960.0, ans=0.1 2023-09-30 16:49:56,975 WARNING [train.py:1197] (3/4) Exclude cut with ID _15164_210_4915_1_1530788389043_7379310_297-5233-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:49:56,983 WARNING [train.py:1197] (3/4) Exclude cut with ID _41298_210_9019_1_1533430371450_4822143_178-283538-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:49:58,390 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7046_210_6753_1_1527895337_6920917_642-106799-0 from training. Duration: 0.92 2023-09-30 16:49:58,425 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_532-291952-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:00,026 WARNING [train.py:1197] (3/4) Exclude cut with ID _47278_210_12041_1_1533902418349_3576110_260-270468-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:00,059 WARNING [train.py:1197] (3/4) Exclude cut with ID _14368_210_4220_1_1530532714870_3595529_144-3287-0_sp0.9 from training. Duration: 0.7444375 2023-09-30 16:50:01,680 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_162-262717-0_sp0.9 from training. Duration: 0.92225 2023-09-30 16:50:04,809 WARNING [train.py:1197] (3/4) Exclude cut with ID _3465_210_3525_1_1526175667060_3574049_62-335552-0_sp0.9 from training. Duration: 0.9666875 2023-09-30 16:50:04,832 WARNING [train.py:1197] (3/4) Exclude cut with ID _22179_210_9259_1_1531789321199_5628896_726-272852-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:50:05,262 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_module1.balancer1.max_abs, batch_count=775026.6666666666, ans=10.0 2023-09-30 16:50:06,784 INFO [train.py:1039] (3/4) Epoch 22, batch 4700, loss[loss=0.1739, simple_loss=0.2462, pruned_loss=0.05078, over 23610.00 frames. ], tot_loss[loss=0.172, simple_loss=0.2481, pruned_loss=0.04788, over 4719159.77 frames. ], batch size: 135, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:50:06,901 WARNING [train.py:1197] (3/4) Exclude cut with ID _14898_210_9206_1_1530766812377_4177190_26-135492-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:50:10,228 WARNING [train.py:1197] (3/4) Exclude cut with ID _14206_210_9659_1_1530579663457_8326180_200-95304-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:10,300 WARNING [train.py:1197] (3/4) Exclude cut with ID _45012_210_12651_1_1533608435136_5371380_240-37101-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:50:10,315 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0_sp0.9 from training. Duration: 0.7555625 2023-09-30 16:50:11,936 WARNING [train.py:1197] (3/4) Exclude cut with ID _8067_210_2629_1_1528506061405_7163340_907-151398-0_sp0.9 from training. Duration: 0.411125 2023-09-30 16:50:12,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14208_210_9659_1_1530752449812_7365219_45-265684-0_sp0.9 from training. Duration: 0.8 2023-09-30 16:50:13,528 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6291_210_3273_1_1527683649_1078498_74-292608-0 from training. Duration: 0.72 2023-09-30 16:50:22,003 WARNING [train.py:1197] (3/4) Exclude cut with ID _59202_210_26984_1_1534816810612_3549174_221-84819-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:23,457 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_33696_210_3852_1_1533082780_2663375_181-325595-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:50:23,527 WARNING [train.py:1197] (3/4) Exclude cut with ID _47251_210_18628_1_1533814063128_4137250_351-154105-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:50:25,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _35501_210_9288_1_1532998507986_3192189_351-334740-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:26,583 WARNING [train.py:1197] (3/4) Exclude cut with ID _15009_210_8895_1_1530853219171_3619092_342-100157-0_sp1.1 from training. Duration: 0.4909375 2023-09-30 16:50:32,032 WARNING [train.py:1197] (3/4) Exclude cut with ID _6789_210_1534_1_1527857937218_3118950_262-48853-0 from training. Duration: 0.56 2023-09-30 16:50:32,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_430-201020-0 from training. Duration: 0.97 2023-09-30 16:50:35,155 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_71-136518-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:36,714 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_26971_210_7878_1_1532228784_3372792_28-291982-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:50:36,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _54218_210_12210_1_1534413646388_4409259_437-275326-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:50:41,174 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.1.conv_module1.whiten, num_groups=1, num_channels=512, metric=2.97 vs. limit=15.0 2023-09-30 16:50:42,014 WARNING [train.py:1197] (3/4) Exclude cut with ID _55134_210_21982_1_1534590028377_3882170_40-309139-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:50:48,254 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_266-329755-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:50:49,846 WARNING [train.py:1197] (3/4) Exclude cut with ID _14229_210_4381_1_1530835127418_4026359_21-174514-0_sp1.1 from training. Duration: 0.3909375 2023-09-30 16:50:51,384 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_409-207378-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:50:51,641 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_skip_rate, batch_count=775160.0, ans=0.0 2023-09-30 16:50:55,706 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.526e+02 1.872e+02 2.138e+02 2.585e+02 4.153e+02, threshold=4.275e+02, percent-clipped=1.0 2023-09-30 16:50:58,081 WARNING [train.py:1197] (3/4) Exclude cut with ID _14827_210_6310_1_1530702015915_8105460_132-269633-0 from training. Duration: 0.68 2023-09-30 16:50:58,246 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_2245_210_2715_1_1525511548_3540254_310-52483-0_sp0.9 from training. Duration: 0.97775 2023-09-30 16:51:01,259 WARNING [train.py:1197] (3/4) Exclude cut with ID _27430_210_7770_1_1532604689375_1378590_47-77403-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:03,587 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=775226.6666666666, ans=0.125 2023-09-30 16:51:05,612 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.0.layers.0.conv_module2.balancer2.prob, batch_count=775226.6666666666, ans=0.125 2023-09-30 16:51:06,843 WARNING [train.py:1197] (3/4) Exclude cut with ID _7562_210_2084_1_1528887767916_3946229_186-326422-0 from training. Duration: 0.84 2023-09-30 16:51:08,413 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14056_210_4163_1_1530604483_7572827_230-35993-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:51:08,688 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.2.ff2_skip_rate, batch_count=775226.6666666666, ans=0.0 2023-09-30 16:51:11,633 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_23854_210_1794_1_1531969696_1329157_57-123491-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:51:13,118 WARNING [train.py:1197] (3/4) Exclude cut with ID _14747_210_8341_1_1530685797463_3654089_147-131229-0 from training. Duration: 0.76 2023-09-30 16:51:14,737 WARNING [train.py:1197] (3/4) Exclude cut with ID _30191_210_1664_1_1533189915047_7196670_430-19539-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:14,761 WARNING [train.py:1197] (3/4) Exclude cut with ID _67725_210_27767_1_1535608756671_5105359_44-50762-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:19,928 WARNING [train.py:1197] (3/4) Exclude cut with ID _27485_210_4916_1_1532739191677_3668269_217-161755-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:51:19,998 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5931_210_2040_1_1527428481_4547999_321-193614-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:51:20,036 WARNING [train.py:1197] (3/4) Exclude cut with ID _6267_210_2084_1_1527764658526_3894730_375-219887-0 from training. Duration: 0.67 2023-09-30 16:51:21,599 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9087W0022-538393 from training. Duration: 0.768 2023-09-30 16:51:23,239 WARNING [train.py:1197] (3/4) Exclude cut with ID _68777_210_4353_1_1535810390731_3117904_83-196459-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:25,085 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.4.encoder.layers.0.bypass.scale_min, batch_count=775293.3333333334, ans=0.2 2023-09-30 16:51:26,249 WARNING [train.py:1197] (3/4) Exclude cut with ID _69884_210_9771_1_1535846392818_4166169_455-343812-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,250 WARNING [train.py:1197] (3/4) Exclude cut with ID _13916_210_9994_1_1530788341126_3763149_414-320174-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:26,257 WARNING [train.py:1197] (3/4) Exclude cut with ID _51329_210_14209_1_1534465804354_7323080_398-239819-0 from training. Duration: 0.99 2023-09-30 16:51:26,405 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_199-232314-0_sp1.1 from training. Duration: 0.92725 2023-09-30 16:51:29,379 INFO [train.py:1039] (3/4) Epoch 22, batch 4750, loss[loss=0.1885, simple_loss=0.252, pruned_loss=0.06255, over 23717.00 frames. ], tot_loss[loss=0.1723, simple_loss=0.2486, pruned_loss=0.04798, over 4726284.53 frames. ], batch size: 232, lr: 4.63e-03, grad_scale: 16.0 2023-09-30 16:51:31,112 WARNING [train.py:1197] (3/4) Exclude cut with ID _2334_210_2667_1_1525608238880_4705129_459-104997-0 from training. Duration: 0.95 2023-09-30 16:51:34,891 WARNING [train.py:1197] (3/4) Exclude cut with ID _5171_210_2084_1_1527159601556_3953996_441-209395-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:51:36,508 WARNING [train.py:1197] (3/4) Exclude cut with ID _27052_210_5986_1_1532329197187_3585009_320-80393-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,519 WARNING [train.py:1197] (3/4) Exclude cut with ID _21783_210_11357_1_1531742458730_3762679_89-217253-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:51:40,567 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_453-282588-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:51:40,835 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.2.encoder.layers.0.balancer2.prob, batch_count=775360.0, ans=0.125 2023-09-30 16:51:43,008 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_88-341718-0 from training. Duration: 0.66 2023-09-30 16:51:43,065 WARNING [train.py:1197] (3/4) Exclude cut with ID _43552_210_8913_1_1533726031059_3523429_230-3521-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:51:47,501 WARNING [train.py:1197] (3/4) Exclude cut with ID _9679_210_7497_1_1529129212906_4363929_512-313889-0 from training. Duration: 0.81 2023-09-30 16:51:47,713 WARNING [train.py:1197] (3/4) Exclude cut with ID _14346_210_9570_1_1530619360608_4177460_194-252800-0_sp1.1 from training. Duration: 0.8818125 2023-09-30 16:51:47,747 WARNING [train.py:1197] (3/4) Exclude cut with ID _55209_210_1531_1_1534675828996_4597160_239-293541-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:51:49,164 WARNING [train.py:1197] (3/4) Exclude cut with ID _35799_210_5854_1_1533263487887_3529071_15-325131-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:51:49,745 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.0.feed_forward2.hidden_balancer.prob, batch_count=775426.6666666666, ans=0.125 2023-09-30 16:51:56,084 WARNING [train.py:1197] (3/4) Exclude cut with ID _14100_210_8341_1_1530529320613_3678069_279-55760-0 from training. Duration: 0.93 2023-09-30 16:51:56,946 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.1.encoder.layers.1.conv_module2.whiten.whitening_limit, batch_count=775426.6666666666, ans=15.0 2023-09-30 16:51:59,417 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_4103_210_4852_1_1526708860_4495520_489-230703-0_sp1.1 from training. Duration: 0.77275 2023-09-30 16:52:02,406 WARNING [train.py:1197] (3/4) Exclude cut with ID _6694_210_4599_1_1527852637490_3965529_144-185051-0 from training. Duration: 0.96 2023-09-30 16:52:02,520 WARNING [train.py:1197] (3/4) Exclude cut with ID _28461_210_2253_1_1532771775189_3041299_1-228155-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:03,193 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.3.encoder.layers.3.feed_forward1.out_whiten, num_groups=1, num_channels=512, metric=8.94 vs. limit=15.0 2023-09-30 16:52:05,606 WARNING [train.py:1197] (3/4) Exclude cut with ID _32704_210_8954_1_1533017488840_6747370_686-252323-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:05,610 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_1075-294633-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:52:05,640 WARNING [train.py:1197] (3/4) Exclude cut with ID _22083_210_8254_1_1531702862439_3678970_30-92879-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:09,030 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9245W0121-616888 from training. Duration: 0.896 2023-09-30 16:52:09,036 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6786_210_1388_1_1527839699_4194495_378-46070-0 from training. Duration: 0.72 2023-09-30 16:52:12,833 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0 from training. Duration: 0.82 2023-09-30 16:52:15,287 WARNING [train.py:1197] (3/4) Exclude cut with ID _61984_210_22356_1_1535183971792_3507009_238-89390-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:18,221 WARNING [train.py:1197] (3/4) Exclude cut with ID _8630_210_5986_1_1528538408906_7469124_524-310811-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:19,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _15346_210_3525_1_1530787646969_6982281_307-158462-0_sp0.9 from training. Duration: 0.7666875 2023-09-30 16:52:19,853 WARNING [train.py:1197] (3/4) Exclude cut with ID IC9309W0191-640318 from training. Duration: 0.512 2023-09-30 16:52:19,860 WARNING [train.py:1197] (3/4) Exclude cut with ID _55153_210_2253_1_1534384786677_3162096_100-204617-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:21,585 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.0.feed_forward3.hidden_balancer.prob, batch_count=775560.0, ans=0.125 2023-09-30 16:52:23,025 WARNING [train.py:1197] (3/4) Exclude cut with ID _8337_210_1819_1_1528533605581_3614330_112-149213-0_sp1.1 from training. Duration: 0.863625 2023-09-30 16:52:27,832 INFO [scaling.py:1022] (3/4) Whitening: name=encoder_embed.out_whiten, num_groups=1, num_channels=192, metric=7.09 vs. limit=8.0 2023-09-30 16:52:28,217 WARNING [train.py:1197] (3/4) Exclude cut with ID _67831_210_12392_1_1535612464202_3092612_346-2148-0_sp1.1 from training. Duration: 0.8454375 2023-09-30 16:52:31,241 WARNING [train.py:1197] (3/4) Exclude cut with ID _14629_210_15709_1_1530602638464_1104300_51-117576-0_sp0.9 from training. Duration: 0.4444375 2023-09-30 16:52:31,299 WARNING [train.py:1197] (3/4) Exclude cut with ID _3814_210_1819_1_1526716893421_3643700_300-27235-0 from training. Duration: 0.87 2023-09-30 16:52:32,800 WARNING [train.py:1197] (3/4) Exclude cut with ID _40399_210_4353_1_1533305658194_3279406_195-116825-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:52:32,849 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_163-87552-0_sp0.9 from training. Duration: 0.9555625 2023-09-30 16:52:34,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _19514_210_9259_1_1531530071330_5084230_560-237597-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:52:34,549 WARNING [train.py:1197] (3/4) Exclude cut with ID _5259_210_2084_1_1527246186694_3827529_41-234353-0_sp1.1 from training. Duration: 0.6181875 2023-09-30 16:52:34,579 WARNING [train.py:1197] (3/4) Exclude cut with ID _6274_210_2084_1_1527937583837_7494020_769-168302-0 from training. Duration: 0.71 2023-09-30 16:52:37,614 WARNING [train.py:1197] (3/4) Exclude cut with ID _8095_210_9259_1_1528372239382_4834140_574-45541-0 from training. Duration: 0.65 2023-09-30 16:52:39,637 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.3.encoder.layers.1.feed_forward2.hidden_balancer.prob, batch_count=775626.6666666666, ans=0.125 2023-09-30 16:52:40,772 WARNING [train.py:1197] (3/4) Exclude cut with ID _50310_210_15488_1_1534068043345_3641080_262-67120-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:52:42,539 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53071_210_16554_1_1534493434_1276796_35-11927-0_sp1.1 from training. Duration: 0.963625 2023-09-30 16:52:42,542 WARNING [train.py:1197] (3/4) Exclude cut with ID _3992_210_1747_1_1526644816031_3731060_107-251434-0 from training. Duration: 0.99 2023-09-30 16:52:42,610 WARNING [train.py:1197] (3/4) Exclude cut with ID _19704_210_9206_1_1532050232766_4031330_412-54488-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:52:44,588 WARNING [train.py:1197] (3/4) Exclude cut with ID _18516_210_9206_1_1531704653206_3692406_77-258490-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:52:46,165 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40047_210_2290_1_1533290519_2374311_195-165693-0_sp0.9 from training. Duration: 0.988875 2023-09-30 16:52:46,270 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_296-36971-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:52:47,739 WARNING [train.py:1197] (3/4) Exclude cut with ID _14970_210_15709_1_1530773543493_1715730_187-20259-0_sp1.1 from training. Duration: 0.8090625 2023-09-30 16:52:51,505 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53373_210_5399_1_1534229471_4715695_135-305927-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:52:52,852 WARNING [train.py:1197] (3/4) Exclude cut with ID _6322_210_2588_1_1528007403445_3718200_60-18123-0 from training. Duration: 0.88 2023-09-30 16:52:53,000 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_166-166990-0 from training. Duration: 0.96 2023-09-30 16:52:54,280 INFO [train.py:1039] (3/4) Epoch 22, batch 4800, loss[loss=0.1808, simple_loss=0.2531, pruned_loss=0.05423, over 23680.00 frames. ], tot_loss[loss=0.1737, simple_loss=0.2499, pruned_loss=0.04871, over 4706141.30 frames. ], batch size: 149, lr: 4.63e-03, grad_scale: 32.0 2023-09-30 16:52:54,529 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_5438_210_1794_1_1527336687_2600009_233-58072-0 from training. Duration: 0.77 2023-09-30 16:52:57,712 WARNING [train.py:1197] (3/4) Exclude cut with ID _8833_210_5219_1_1528592665210_7553059_611-80313-0_sp0.9 from training. Duration: 0.711125 2023-09-30 16:52:59,125 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53555_210_5632_1_1534595220_7232297_118-43239-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:00,697 WARNING [train.py:1197] (3/4) Exclude cut with ID _12369_210_4381_1_1530356078581_4388520_480-13146-0 from training. Duration: 0.85 2023-09-30 16:53:05,999 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_43823_210_2158_1_1533639205_7824684_892-28814-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:06,066 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_24899_210_16554_1_1532046422_3827462_160-21255-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:06,760 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.4.encoder.layers.0.self_attn_weights.whiten_keys, num_groups=4, num_channels=128, metric=1.71 vs. limit=6.0 2023-09-30 16:53:10,875 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_154-316450-0_sp1.1 from training. Duration: 0.6909375 2023-09-30 16:53:12,444 WARNING [train.py:1197] (3/4) Exclude cut with ID _30196_210_1664_1_1533708496522_7074140_1225-301232-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:12,491 WARNING [train.py:1197] (3/4) Exclude cut with ID _28783_210_7943_1_1532671164808_7155479_414-208100-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:12,581 WARNING [train.py:1197] (3/4) Exclude cut with ID _19023_210_4353_1_1531378892552_5820780_83-254794-0 from training. Duration: 0.86 2023-09-30 16:53:14,062 WARNING [train.py:1197] (3/4) Exclude cut with ID _43086_210_14209_1_1534033612800_7677110_591-83752-0_sp1.1 from training. Duration: 0.9818125 2023-09-30 16:53:14,133 WARNING [train.py:1197] (3/4) Exclude cut with ID _29877_210_6228_1_1532397791106_5276149_266-62291-0_sp1.1 from training. Duration: 0.7909375 2023-09-30 16:53:16,518 WARNING [train.py:1197] (3/4) Exclude cut with ID _13251_210_1925_1_1530584968760_3527999_280-161633-0_sp1.1 from training. Duration: 0.663625 2023-09-30 16:53:22,818 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7543_210_6753_1_1528539468_333764_39-156634-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:24,508 INFO [scaling.py:1022] (3/4) Whitening: name=encoder.encoders.1.encoder.layers.0.nonlin_attention.whiten2, num_groups=1, num_channels=256, metric=5.66 vs. limit=15.0 2023-09-30 16:53:25,090 WARNING [train.py:1197] (3/4) Exclude cut with ID _67499_210_17282_1_1535509805220_3692430_505-312248-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:25,152 WARNING [train.py:1197] (3/4) Exclude cut with ID _5294_210_1747_1_1527249643928_3696179_180-100713-0_sp0.9 from training. Duration: 0.9 2023-09-30 16:53:26,745 WARNING [train.py:1197] (3/4) Exclude cut with ID _26528_210_9070_1_1532570410842_3851310_508-148132-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:26,774 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_14072_210_5632_1_1530687092_5046021_211-339622-0_sp1.1 from training. Duration: 0.4090625 2023-09-30 16:53:26,797 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_40896_210_15710_1_1533433956_7100802_339-138356-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:28,427 WARNING [train.py:1197] (3/4) Exclude cut with ID _27060_210_5986_1_1532674838922_3522549_211-19933-0_sp1.1 from training. Duration: 0.9454375 2023-09-30 16:53:30,150 WARNING [train.py:1197] (3/4) Exclude cut with ID _60229_210_27767_1_1534838406158_3655350_457-117923-0_sp1.1 from training. Duration: 0.9090625 2023-09-30 16:53:30,512 INFO [scaling.py:213] (3/4) ScheduledFloat: name=encoder.encoders.5.encoder.layers.1.conv_skip_rate, batch_count=775826.6666666666, ans=0.0 2023-09-30 16:53:33,103 WARNING [train.py:1197] (3/4) Exclude cut with ID _55429_210_10626_1_1534467588527_7174143_460-141439-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,876 WARNING [train.py:1197] (3/4) Exclude cut with ID _59170_210_6721_1_1534983633960_4203335_263-10780-0_sp1.1 from training. Duration: 0.9909375 2023-09-30 16:53:36,908 WARNING [train.py:1197] (3/4) Exclude cut with ID _14456_210_4915_1_1530616019460_7469240_872-264525-0_sp0.9 from training. Duration: 0.72225 2023-09-30 16:53:38,450 WARNING [train.py:1197] (3/4) Exclude cut with ID _7624_210_1531_1_1528109802198_4933101_418-349834-0_sp0.9 from training. Duration: 0.6666875 2023-09-30 16:53:41,345 WARNING [train.py:1197] (3/4) Exclude cut with ID _39006_210_16187_1_1533542425748_3705328_27-53998-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:41,548 WARNING [train.py:1197] (3/4) Exclude cut with ID _50783_210_12071_1_1534384797849_3638458_202-183922-0 from training. Duration: 0.85 2023-09-30 16:53:43,017 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_7002_210_1794_1_1527939683_5157286_1-281264-0 from training. Duration: 0.44 2023-09-30 16:53:43,134 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_53753_210_7234_1_1534320003_3594148_493-9314-0_sp1.1 from training. Duration: 0.9181875 2023-09-30 16:53:43,171 WARNING [train.py:1197] (3/4) Exclude cut with ID _14207_210_9659_1_1530665987863_7790990_217-95056-0_sp1.1 from training. Duration: 0.8909375 2023-09-30 16:53:44,486 INFO [optim.py:468] (3/4) Clipping_scale=2.0, grad-norm quartiles 1.478e+02 1.908e+02 2.098e+02 2.406e+02 3.815e+02, threshold=4.197e+02, percent-clipped=0.0 2023-09-30 16:53:44,685 WARNING [train.py:1197] (3/4) Exclude cut with ID _8321_210_5399_1_1528445967079_4260270_406-45086-0_sp1.1 from training. Duration: 0.72725 2023-09-30 16:53:44,696 WARNING [train.py:1197] (3/4) Exclude cut with ID _31649_210_6228_1_1532584608063_4091078_505-213821-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:44,710 WARNING [train.py:1197] (3/4) Exclude cut with ID _4770_210_1385_1_1526903077865_2926409_317-175062-0_sp0.9 from training. Duration: 0.911125 2023-09-30 16:53:46,322 WARNING [train.py:1197] (3/4) Exclude cut with ID _5444_210_2151_1_1527418808464_5312210_726-27165-0_sp1.1 from training. Duration: 0.6090625 2023-09-30 16:53:46,429 WARNING [train.py:1197] (3/4) Exclude cut with ID _62330_210_5956_1_1535677171667_5165612_494-170437-0_sp1.1 from training. Duration: 0.9545625 2023-09-30 16:53:51,122 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_54460_210_9771_1_1534744645_4690315_474-64054-0_sp1.1 from training. Duration: 0.936375 2023-09-30 16:53:54,925 WARNING [train.py:1197] (3/4) Exclude cut with ID _44757_210_13403_1_1533711383910_4196220_521-7556-0_sp1.1 from training. Duration: 0.97275 2023-09-30 16:53:55,158 WARNING [train.py:1197] (3/4) Exclude cut with ID 210_6860_210_4852_1_1527921296_3136226_269-348505-0_sp1.1 from training. Duration: 0.92725